Skip to content

Conversation

@scdub
Copy link

@scdub scdub commented Aug 5, 2023

If in an envioronment where the desired output code will be run on a broad set of x86 CPUs, BMI2 isn't guaranteed to be available. When compiling against MSVC, bzhi will still be inserted into the instruction stream because the SNAPPY_HAVE_BMI2 check will succeeed (MSVC compiles without issue), but cannot be run by Sandy Bridge and earlier CPU generations.

If in an envioronment where the desired output code will be run on a
broad set of x86 CPUs, BMI2 isn't guaranteed to be available. When
compiling against MSVC, `bzhi` will still be inserted into the
instruction stream because the `SNAPPY_HAVE_BMI2` check will succeeed
(it compiles without issue), but cannot be run by Sandy Bridge and early
CPUs.
@rhubner
Copy link

rhubner commented Jul 19, 2024

Hello,

I can confirm that this real issue. snappy have special flags for AVX(2), and you expect that if it's off, you will not get any modern AVX(2) or BMI2 instruction in output binary. Unfortunately this is not true. in RocksDB we have this bug. So far we are going to fix it with -DSNAPPY_HAVE_BMI2=0 But I don't thing it's right way. The right way is to have proper build flag like this PR.

From my side, please merge this PR, it will help and it fix real issue.

Radek

@danilak-G
Copy link
Collaborator

Thanks, do we need to disable BMI2 for MSVC? Otherwise we cannot guarantee that it is present.

Maybe set SNAPPY_REQUIRE_BMI2 to OFF if the compiler matches MSVC?

@rhubner
Copy link

rhubner commented Aug 19, 2024

Hello @danilak-G

Please correct me if I'm wrong, but I think by default, MSVC doesn't use AVX2 or BMI2. Only when you specify /arch:AVX/AVX2/AVX512 parameter.

I think we should little refactor build system. The way how we detect BMI2/AVX2 is little non standard. We should create two CMake properties, one for AVX2, which at the moment will be only compiler flag, and second for BMI2(Imply AVX2) which will setup compiler flags and also enable BMI2 intrinsic implementation. This should give us good flexibility and make the CPU flags cleaner for other developers.

Radek

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants