FTZ/DAZ handling with -ffast-math targets is inconsistent for x86-64 #81204

andykaylor · 2024-02-08T23:54:18Z

The behavior of initializing the FTZ/DAZ state when using -ffast-math with x86-64 is inconsistent. It depends on the options used on the link command (assuming we're using clang for linking) rather than the options used during compilation.

This can be demonstrated by comparing the results of compiling and linking in one step versus compiling to an object file and linking separately.

$ cat ftz.c

#include <float.h>
#include <stdio.h>

float foo(float);

int main(int argc, char **argv) {
  float two = argc + argc;
  printf("x = %e\n", FLT_MIN / two);
  return 0;
}

$ ../build/bin/clang -O2 -ffast-math -o ftz.exe ftz.c
$ ./ftz.exe

x = 0.000000e+00

$ ../build/bin/clang -c -O2 -ffast-math -o ftz.o ftz.c
$ ../build/bin/clang -o ftz.exe ftz.o
$ ./ftz.exe

x = 5.877472e-39

This is happening because we are setting the FTZ/DAZ state of the MXCSR by linking with crtfastmath.o, as described here: https://clang.llvm.org/docs/UsersManual.html#a-note-about-crtfastmath-o
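For anyone who wants to confirm which state a given executable actually ends up in, a small sketch along these lines reads MXCSR directly. The helper names are hypothetical (they are not part of clang or crtfastmath.o); the bit masks are the FTZ (bit 15) and DAZ (bit 6) positions in MXCSR, and `_mm_getcsr` comes from `<xmmintrin.h>`.

```c
#include <stdbool.h>
#include <xmmintrin.h> /* _mm_getcsr */

/* MXCSR control bits (x86 SSE). */
#define MXCSR_DAZ_BIT 0x0040u /* denormals-are-zero: denormal inputs read as zero */
#define MXCSR_FTZ_BIT 0x8000u /* flush-to-zero: denormal results become zero */

/* Hypothetical helper: true if flush-to-zero is enabled in the calling thread. */
static bool mxcsr_ftz_enabled(void) {
  return (_mm_getcsr() & MXCSR_FTZ_BIT) != 0;
}

/* Hypothetical helper: true if denormals-are-zero is enabled in the calling thread. */
static bool mxcsr_daz_enabled(void) {
  return (_mm_getcsr() & MXCSR_DAZ_BIT) != 0;
}
```

Calling these at the top of `main` in ftz.c and printing the results should make the difference between the two link commands above visible without relying on the value of `FLT_MIN / two`.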

However, if the -ffast-math option isn't specified on the command line used to link the executable, we will not link with crtfastmath.o. This is problematic because not only do we not get this advertised benefit of -ffast-math, but we are nevertheless setting the "denormal-fp-math"="preserve-sign,preserve-sign" attribute on all compiled functions, indicating that the compiler can assume that denormals will be flushed to zero. As far as I can tell, the primary effect of this for x86-64 targets is that we flush denormals to zero when constant folding.
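To illustrate the runtime side of this mismatch, here is a minimal sketch, assuming an x86-64 target where float division is done in SSE registers. It sets the FTZ and DAZ bits by hand, which is roughly the effect linking crtfastmath.o is described as having; it is not the code crtfastmath.o contains. `enable_ftz_daz` and `half_flt_min` are hypothetical names introduced only for this example.

```c
#include <float.h>
#include <xmmintrin.h> /* _mm_getcsr, _mm_setcsr */

#define MXCSR_DAZ_BIT 0x0040u /* denormals-are-zero */
#define MXCSR_FTZ_BIT 0x8000u /* flush-to-zero */

/* Hypothetical helper: enable FTZ and DAZ for the calling thread only.
   MXCSR is per-thread state, so this does not affect threads that are
   already running. */
static void enable_ftz_daz(void) {
  _mm_setcsr(_mm_getcsr() | MXCSR_FTZ_BIT | MXCSR_DAZ_BIT);
}

/* Divide FLT_MIN by two at run time. The volatile locals keep the
   compiler from constant folding the division, so the result depends
   on the MXCSR state when the function is called. */
static float half_flt_min(void) {
  volatile float min = FLT_MIN;
  volatile float two = 2.0f;
  return min / two;
}
```

With FTZ/DAZ clear, `half_flt_min()` returns a denormal; after `enable_ftz_daz()` it returns zero. Code compiled with the "denormal-fp-math" attribute but linked without crtfastmath.o runs in the first state while the compiler was told to assume the second.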

It may be worth noting that this behavior is consistent with gcc.

llvmbot · 2024-02-08T23:54:33Z

@llvm/issue-subscribers-clang-driver

jyknight · 2024-02-09T21:32:11Z

> As far as I can tell, the primary effect of this for x86-64 targets is that we flush denormals to zero when constant folding.
>
> It may be worth noting that this behavior is consistent with gcc.

GCC does all the crtfastmath gunk, but it doesn't change its handling of denormals in constant evaluation.

andykaylor · 2024-02-09T21:51:50Z

> GCC does all the crtfastmath gunk, but it doesn't change its handling of denormals in constant evaluation.

That's a fair point. We don't claim numeric consistency with fast-math, so if fast-math is enabled it may not matter how we handle denormals in constant evaluation, but I'm not sure we gain anything by flushing them to zero so maybe it's better to use IEEE evaluation.

What I meant was consistent with gcc is the fact that you get different behavior depending on whether fast/unsafe math is enabled on the link command line.

h-vetinari · 2024-04-20T07:55:59Z

Pardon the noise, just adding an xref for discoverability between related issues: #57589

andykaylor added the clang, clang:driver, and floating-point labels Feb 8, 2024

andykaylor mentioned this issue Feb 8, 2024

Disable FTZ/DAZ when compiling shared libraries by default. #80475

Merged

EugeneZelenko removed the clang label Feb 8, 2024

Zentrik mentioned this issue Feb 19, 2025

Feature Request: An option to disable flushing to zero on CPU openxla/xla#22858

Open
