-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wip: Bitmap operations SIMD logic refactor #942
base: main
Are you sure you want to change the base?
Conversation
byte* dstMaxEnd = dstPtr + dstLen; | ||
int offset = 0; | ||
while (dstCurr < dstMaxEnd) | ||
else if (Vector128.IsHardwareAccelerated && Vector128<byte>.IsSupported) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if all three paths were tested. In practice, all of Github's x86 based runners will have Vector256.IsHardwareAcceleated as true, not testing the other paths by default.
(If Garnet had an arm64 runner - which it should anyway - then I strongly suspect it would have Vector128.IsHardwareAccelerated as true.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it would be nice. I would prefer to have "outerloop" test runs that either manually and/or scheduled with DOTNET_EnableAVX2=0
or DOTNET_EnableHWIntrinsic=0
environment variables for all of the tests.
ISimdVector
in future .NET versions is gonna push this testing effort back to .NET runtime team hopefully, as the 128 & 256 & other paths are just Vector### templates and if there was non-typo bug in them, it would be bug in the .NET BCL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do the same TestProcess solution #1041 ended up with for these 2-3 specific tests (it can be generalized a bit further, but we just need to test the paths once or twice, so adding the same using var p = TestProcess and env.Add() isn't much).
It's also possible to do what the Vector documentation suggests to do with BDN and allow an option to run the entire suit with acceleration off (I suspect these hardware accelerations affect other parts of the program, via library code or JIT).
It would also make sense to have a regular ARM runner, as ARM has other differences like memory model which may be important (the BITCOUNT portable code issue wasn't detected since no runner - despite Garnet offering binaries for these platforms...).
And fast-path memcpy for single key
wip, todo: pr description
Now we can do BITOPs on when running Garnet inside Android VMs!
screen-20250316-174515.mp4
TODO:
DOTNET_EnableHWIntrinsic=0
+DOTNET_MaxVectorTBitWidth=128/256/512
Follow-up work:
Refactor BITCOUNT logic similarly to be cross-platform(Platform independent bitcount operation #1034) and to follow "modern" .NET SIMD practices