Comparing doubles with floats: precision issues

David Gobbi
Hi All,

This is one of those picky math questions that deals with numerical precision.  Let's say that one has a data set with scalar type "float", and wants to select values within a range (minval, maxval) where minval, maxval are of type "double":

    if (fval >= minval && fval <= maxval) { ... }

Now let's say you don't want "fval" to be converted to double, because floats are faster than doubles on your hardware:

    float fminval = static_cast<float>(minval);
    float fmaxval = static_cast<float>(maxval);
    ...
    if (fval >= fminval && fval <= fmaxval) { ... }

Unfortunately, there are some cases where fval <= fmaxval even though fval > maxval: the cast rounds to the nearest float, so fmaxval can end up greater than maxval.  Reducing the precision of the range has invalidated the check.  In order to fix things, you must choose fminval to be the float closest to but not less than minval, and choose fmaxval to be the float closest to but not more than maxval.

    float fminval = NearestFloatNotLessThan(minval);
    float fmaxval = NearestFloatNotGreaterThan(maxval);

With these, (fval >= fminval && fval <= fmaxval) gives the same result as (fval >= minval && fval <= maxval) and all is right with the world.
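To make the edge case concrete, here is a small sketch that compares the double-precision test with the naively cast float test. The function names are hypothetical, and the example bound (one double step below 1.0) is just one convenient value that rounds up when converted to float.

```cpp
#include <cassert>
#include <cmath>

// True when the float-only test and the double test disagree for fval.
bool NaiveCastDisagrees(float fval, double minval, double maxval)
{
  assert(minval <= maxval);
  const float fminval = static_cast<float>(minval); // rounds to nearest
  const float fmaxval = static_cast<float>(maxval); // may land above maxval
  const bool inDouble = (fval >= minval && fval <= maxval);
  const bool inFloat = (fval >= fminval && fval <= fmaxval);
  return inDouble != inFloat;
}

// Example edge case: an upper bound one double step below 1.0.
// It converts to exactly 1.0f, so fval == 1.0f passes the float test
// although 1.0 > maxval.
bool EdgeCaseDisagrees()
{
  const double maxval = std::nextafter(1.0, 0.0);
  return NaiveCastDisagrees(1.0f, 0.0, maxval);
}
```

Values well inside or well outside the range are unaffected; only floats that sit between a bound and its rounded copy are misclassified.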

So my question is, have any other devs created a solution for this issue, either for VTK or for related code?  I'm considering a solution based on the C++11 function std::nextafter(), as described on this stackoverflow page: https://stackoverflow.com/questions/15294046/round-a-double-to-the-closest-and-greater-float
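A minimal sketch of the std::nextafter() approach, assuming IEEE 754 float and double and non-NaN inputs. The two helper names follow the ones used in this message; DoubleRangeToFloatRange is a hypothetical name added for illustration. Note the direction: for the float test to accept exactly the same floats as the double test, the lower bound is rounded up (nearest float not less than minval) and the upper bound is rounded down (nearest float not greater than maxval).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Largest float that is <= x.
float NearestFloatNotGreaterThan(double x)
{
  assert(!std::isnan(x));
  float f = static_cast<float>(x); // usually rounds to nearest
  if (static_cast<double>(f) > x)
  {
    // The cast rounded up, so step one float down.
    f = std::nextafter(f, -std::numeric_limits<float>::infinity());
  }
  return f;
}

// Smallest float that is >= x.
float NearestFloatNotLessThan(double x)
{
  assert(!std::isnan(x));
  float f = static_cast<float>(x);
  if (static_cast<double>(f) < x)
  {
    // The cast rounded down, so step one float up.
    f = std::nextafter(f, std::numeric_limits<float>::infinity());
  }
  return f;
}

// Float bounds that accept exactly the floats lying in [minval, maxval].
void DoubleRangeToFloatRange(double minval, double maxval,
                             float& fminval, float& fmaxval)
{
  fminval = NearestFloatNotLessThan(minval);
  fmaxval = NearestFloatNotGreaterThan(maxval);
}
```

Because each helper checks the result of the cast before stepping, it does not depend on which way the cast happened to round.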

 - David





_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Search the list archives at: http://markmail.org/search/?q=vtk-developers

Follow this link to subscribe/unsubscribe:
http://public.kitware.com/mailman/listinfo/vtk-developers


Re: [EXTERNAL] Comparing doubles with floats: precision issues

Scott, W Alan

Haven’t seen a great reply, and this was a while ago, but here goes.  This is also all theoretical, from days 20 years ago when I worked on OpenGL drivers. 

 

How about the following, where bits(x) stands for the 64-bit integer pattern of the double x:

    test = bits(static_cast<double>(fval) - minval) | bits(maxval - static_cast<double>(fval))

Now, read as a signed 64-bit integer, test will be non-negative if your case should pass, negative if not.  I’m sure I missed some casts above, but you can see what I am doing.  By OR’ing the sign bits of the two differences, you see whether either of them is negative. So, either

    if (test >= 0)
      succeed;
    else
      not so much;

Or, masking the sign bit directly (and dirty),

    if ((test & 0x8000000000000000) == 0)

I suspect this whole test should run in under 10 clocks...

 

Be sure to test with different compilers, and especially optimized.

 

Alan
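A sign-bit range test in this spirit could be sketched as follows. This is an illustration only, with hypothetical function names, assuming IEEE 754 binary64 doubles and finite, non-NaN inputs. It ORs the bit patterns of (fval - minval) and (maxval - fval), so the sign bit of the result is set when either bound is violated. Signed-zero bounds (for example fval == -0.0 with minval == +0.0) are an edge case it does not handle.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bit pattern of a double as a 64-bit unsigned integer.
static std::uint64_t DoubleBits(double x)
{
  static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof(bits)); // well-defined way to reinterpret the bytes
  return bits;
}

// True when minval <= fval <= maxval, decided from sign bits only.
bool InRangeBySignBit(float fval, double minval, double maxval)
{
  assert(minval <= maxval);
  const double v = static_cast<double>(fval);
  // A difference has its sign bit set when the corresponding bound is violated.
  const std::uint64_t test = DoubleBits(v - minval) | DoubleBits(maxval - v);
  // Sign bit clear means neither difference was negative.
  return (test & 0x8000000000000000ull) == 0;
}
```

The comparison is done on the integer pattern rather than on a double, because combining the bits of two doubles can produce a pattern that is not a meaningful floating-point value.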

 


Re: [EXTERNAL] Comparing doubles with floats: precision issues

David Gobbi
Hi Alan,

Thanks for the idea; these tricks are always useful to know.  They don't solve my issue, though, because my goal isn't just optimization.

The thing is, I already have closed classes that do "if (fval >= fminval && fval <= fmaxval)" where all variables are of type "float".  My problem is that I have a range (minval, maxval) in double precision, and I have to compute (fminval, fmaxval) in single precision to provide to the existing code.  As described above, a naive typecast gives the wrong answer in edge cases, which is why fminval = NearestFloatNotLessThan(minval) and fmaxval = NearestFloatNotGreaterThan(maxval) are necessary.

Cheers,
 - David



Re: [EXTERNAL] Comparing doubles with floats: precision issues

Robert Maynard-4
We did something fairly similar to the proposed solution when doing
range computations in the past for ParaView, with the increased
challenge of wanting to move N values in a given direction. I think
that using nextafter will be the best way to get correct results for
your problem though. I would also state that the float -> integer bit
trick is trickier than it looks on the surface and I recommend reading
more on this subject before you dive in (
https://randomascii.wordpress.com/2012/01/23/stupid-float-tricks-2/ ).
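As an illustration of moving N representable values in a given direction, one simple option is to call std::nextafter() repeatedly. This is only a sketch under that assumption, not the ParaView code; StepFloat is a hypothetical name, and the loop is linear in N, so it suits small step counts.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Moves value by n representable floats: upward if n > 0, downward if n < 0.
float StepFloat(float value, int n)
{
  assert(!std::isnan(value));
  const float inf = std::numeric_limits<float>::infinity();
  const float target = (n >= 0) ? inf : -inf;
  long long count = n; // widen first so that negating cannot overflow
  if (count < 0)
  {
    count = -count;
  }
  for (; count > 0; --count)
  {
    value = std::nextafter(value, target); // one representable step
  }
  return value;
}
```

For example, StepFloat(x, 1) is the next float above x, and stepping up and then down by the same count returns to x as long as no step reaches infinity.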

