FTZ/DAZ handling with -ffast-math targets is inconsistent for x86-64 #81204
@llvm/issue-subscribers-clang-driver Author: Andy Kaylor (andykaylor)
The behavior of initializing the FTZ/DAZ state when using -ffast-math with x86-64 is inconsistent. It depends on the options used on the link command (assuming we're using clang for linking) rather than the options used during compilation.
This can be demonstrated by comparing the results of compiling and linking in one step versus compiling to an object file and linking separately.
This is happening because we are setting the FTZ/DAZ state of the MXCSR by linking with crtfastmath.o, as described here: https://clang.llvm.org/docs/UsersManual.html#a-note-about-crtfastmath-o
However, if the -ffast-math option isn't specified on the command line used to link the executable, we will not link with crtfastmath.o. This is problematic because not only do we not get this advertised benefit of -ffast-math, but we are nevertheless setting the "denormal-fp-math"="preserve-sign,preserve-sign" attribute for all compiled functions, indicating that the compiler can assume that denormals will be flushed to zero. As far as I can tell, the primary effect of this for x86-64 targets is that we flush denormals to zero when constant folding.
It may be worth noting that this behavior is consistent with gcc.
GCC does all the crtfastmath gunk, but it doesn't change its handling of denormals in constant evaluation.
That's a fair point. We don't claim numeric consistency with fast-math, so if fast-math is enabled it may not matter how we handle denormals in constant evaluation, but I'm not sure we gain anything by flushing them to zero, so maybe it's better to use IEEE evaluation. What I meant was consistent with gcc is that you get different behavior depending on whether or not fast/unsafe math is enabled on the link command line.
Pardon the noise, just adding an xref for discoverability between related issues: #57589