tl;dr
You should compare two floating-point numbers for equality in C# as follows:
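A minimal sketch of such a comparison is given here, assuming the combined absolute-or-relative check that the rest of the article builds up to. The class name `FloatingPoint` and the method name `NearlyEqual` are hypothetical placeholders, not necessarily the names used by the author.

```csharp
using System;

public static class FloatingPoint
{
    // Hypothetical helper: true when x and y differ by at most `tolerance`
    // in absolute terms OR relative to the larger of the two magnitudes.
    public static bool NearlyEqual(double x, double y, double tolerance)
    {
        double diff = Math.Abs(x - y);
        return diff <= tolerance
            || diff <= Math.Max(Math.Abs(x), Math.Abs(y)) * tolerance;
    }

    public static void Main()
    {
        Console.WriteLine(NearlyEqual(0.1 + 0.2, 0.3, 1e-10)); // True
        Console.WriteLine(NearlyEqual(1.0, 1.1, 1e-10));       // False
    }
}
```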
`tolerance` must be much larger than `double.Epsilon`.
The problem in the equality operator
You should avoid using the equality operator `==` to compare two floating-point numbers for equality, because it can return false for two numbers that are intuitively equal. The following listing and its output illustrate the problem.
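A minimal sketch of a program that exercises the three comparisons discussed in this section. The class and method names are hypothetical, and the `F18` format is only an assumption about how the digits were printed; the exact digits shown can vary between .NET runtimes.

```csharp
using System;

public static class EqualityOperatorDemo
{
    // Sum of ten 0.1 values, accumulated one addition at a time.
    public static double SumOfTenTenths()
    {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) sum += 0.1;
        return sum;
    }

    // Prints both operands, the result of ==, and the absolute difference.
    public static void Report(double x, double y)
    {
        Console.WriteLine($"x: {x:F18}, y: {y:F18},");
        Console.WriteLine($"x == y: {x == y}, Abs(x-y): {Math.Abs(x - y):F18}");
    }

    public static void Main()
    {
        Report(1.0 - 0.9, 0.1);
        Report(0.15 + 0.15, 0.1 + 0.2);
        Report(SumOfTenTenths(), 1.0);
    }
}
```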
The output:
x: 0.099999999999999978, y: 0.100000000000000006,
x == y: False, Abs(x-y): 0.000000000000000028
x: 0.299999999999999989, y: 0.300000000000000044,
x == y: False, Abs(x-y): 0.000000000000000056
x: 0.999999999999999889, y: 1.000000000000000000,
x == y: False, Abs(x-y): 0.000000000000000111
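The equality comparisons between 1.0 - 0.9 and 0.1, between 0.15 + 0.15 and 0.1 + 0.2, and between the sum of ten 0.1s and 1.0 all unexpectedly result in false. Rounding errors cause these results: values such as 0.1 have no exact binary representation, so the operands and every arithmetic operation on them carry a small error.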
Tolerating absolute errors
The well-known way to resolve this problem is to compare the absolute error, that is, the absolute difference between the two numbers, against a small tolerance. The following results are as expected because the absolute errors are less than 1e-10.
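A minimal sketch of an absolute-error comparison. The method name `TolerateAbsError` is borrowed from the label in the output; the signature, the default tolerance of 1e-10, and the class name are assumptions.

```csharp
using System;

public static class AbsoluteTolerance
{
    // True when |x - y| does not exceed the given absolute tolerance.
    public static bool TolerateAbsError(double x, double y, double tolerance = 1e-10)
    {
        return Math.Abs(x - y) <= tolerance;
    }

    public static void Main()
    {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) sum += 0.1;

        Console.WriteLine(TolerateAbsError(1.0 - 0.9, 0.1));         // True
        Console.WriteLine(TolerateAbsError(0.15 + 0.15, 0.1 + 0.2)); // True
        Console.WriteLine(TolerateAbsError(sum, 1.0));               // True
    }
}
```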
The output:
x: 0.099999999999999978, y: 0.100000000000000006,
TolerateAbsError: True, Abs(x-y): 0.000000000000000028
x: 0.299999999999999989, y: 0.300000000000000044,
TolerateAbsError: True, Abs(x-y): 0.000000000000000056
x: 0.999999999999999889, y: 1.000000000000000000,
TolerateAbsError: True, Abs(x-y): 0.000000000000000111
This approach has a problem: the absolute error between two large numbers can easily exceed a fixed small tolerance, because the rounding error of a floating-point number grows with its magnitude. For example, comparing the result of adding 0.1 to 1e6 ten times with 1e6 plus 1.0 returns false because the absolute error exceeds 1e-10.
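A minimal sketch of this failing case, under the same assumptions as before (hypothetical class name, `TolerateAbsError` taken from the output label, default tolerance 1e-10).

```csharp
using System;

public static class AbsoluteToleranceLimit
{
    public static bool TolerateAbsError(double x, double y, double tolerance = 1e-10)
    {
        return Math.Abs(x - y) <= tolerance;
    }

    // Adds 0.1 to `start` ten times; every addition is rounded to a double.
    public static double AddTenTenths(double start)
    {
        double x = start;
        for (int i = 0; i < 10; i++) x += 0.1;
        return x;
    }

    public static void Main()
    {
        double x = AddTenTenths(1e6);
        double y = 1e6 + 1.0;
        Console.WriteLine(TolerateAbsError(x, y)); // False
    }
}
```

Near 1e6 adjacent doubles are 2^-33 (about 1.16e-10) apart, so a single rounding step is already comparable to the 1e-10 tolerance, and ten additions are enough to push the difference past it.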
The output:
x: 1000000.999999999767169356, y: 1000001.000000000000000000,
TolerateAbsError: False, Abs(x-y): 0.000000000232830644
Tolerating relative errors
You can resolve this problem by tolerating the relative error instead of the absolute error. The relative error of two numbers is the absolute error divided by the maximum of their absolute values.
The following comparison, which tolerates the relative error between the two sums near 1e6, returns true because the relative error is less than 1e-10. Note that the check `a / b <= c` is rewritten as `a <= b * c` to avoid division by zero.
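A minimal sketch of a relative-error comparison in the multiplied form described here. The method name `TolerantRelativeError` is taken from the output label; the class name, signature, and default tolerance are assumptions.

```csharp
using System;

public static class RelativeTolerance
{
    public static bool TolerantRelativeError(double x, double y, double tolerance = 1e-10)
    {
        double diff = Math.Abs(x - y);
        double scale = Math.Max(Math.Abs(x), Math.Abs(y));
        // Equivalent to diff / scale <= tolerance, but written as a product
        // so that x == y == 0 does not divide by zero.
        return diff <= scale * tolerance;
    }

    public static void Main()
    {
        double x = 1e6;
        for (int i = 0; i < 10; i++) x += 0.1;
        double y = 1e6 + 1.0;

        Console.WriteLine(TolerantRelativeError(x, y));     // True
        Console.WriteLine(TolerantRelativeError(0.0, 0.0)); // True
    }
}
```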
x: 1000000.999999999767169356, y: 1000001.000000000000000000,
TolerantRelativeError: True, RelativeError: 0.000000000000000233
This technique causes another problem. The relative error between two minuscule numbers almost equal to zero can become very large. For example, the comparison tolerant of the relative error between 1e-11 and 1e-12 returns false.
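To see where the large ratio comes from, the relative error can be computed explicitly. This is a minimal sketch with hypothetical names; unlike the multiplied form, it divides, so it assumes the two inputs are not both zero.

```csharp
using System;

public static class RelativeErrorNearZero
{
    // Absolute error divided by the larger magnitude.
    // Assumes x and y are not both zero.
    public static double RelativeError(double x, double y)
    {
        return Math.Abs(x - y) / Math.Max(Math.Abs(x), Math.Abs(y));
    }

    public static void Main()
    {
        double x = 1e-11, y = 1e-12;
        double relative = RelativeError(x, y);

        Console.WriteLine(Math.Abs(x - y));    // about 9e-12: tiny in absolute terms
        Console.WriteLine(relative);           // about 0.9: huge in relative terms
        Console.WriteLine(relative <= 1e-10);  // False
    }
}
```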
The output:
x: 0.000000000010000000, y: 0.000000000001000000,
TolerantRelativeError: False, RelativeError: 0.900000000000000022
Tolerating both errors
The equality comparison must tolerate both the absolute and relative errors of two numbers to solve those problems.
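A minimal sketch of a comparison that accepts either kind of error, applied to the two corner cases from the previous sections. The method name `TolerateRelativeAndAbsError` comes from the output label; the class name and signature are assumptions, and the `||` combination follows the description given in the reader discussion at the end of the article.

```csharp
using System;

public static class CombinedTolerance
{
    public static bool TolerateRelativeAndAbsError(double x, double y, double tolerance = 1e-10)
    {
        double diff = Math.Abs(x - y);
        return diff <= tolerance                                        // absolute check
            || diff <= Math.Max(Math.Abs(x), Math.Abs(y)) * tolerance;  // relative check
    }

    public static void Main()
    {
        double big = 1e6;
        for (int i = 0; i < 10; i++) big += 0.1;

        // Large numbers: the relative check passes.
        Console.WriteLine(TolerateRelativeAndAbsError(big, 1e6 + 1.0)); // True
        // Numbers near zero: the absolute check passes.
        Console.WriteLine(TolerateRelativeAndAbsError(1e-11, 1e-12));   // True
    }
}
```

Note that the single `tolerance` value is used both as an absolute bound and as a ratio, so it needs to be small enough to be sensible in both roles.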
The above equality comparison returns true for both of the corner cases.
The output:
x: 1000000.999999999767169356, y: 1000001.000000000000000000,
TolerateRelativeAndAbsError: True
x: 0.000000000010000000, y: 0.000000000001000000,
TolerateRelativeAndAbsError: True
The myth of machine epsilon
I have arbitrarily chosen 1e-10 as the `tolerance` so far, but there is a mathematically rigorous number that is often mistaken for an appropriate value in this case: machine epsilon.
Many programming languages define machine epsilon as the difference between 1 and the next larger floating-point number, even though the formal definition is different from it. For `double`, that difference is 2^-52 (about 2.22e-16). In C#, `Double.Epsilon` is often taken for machine epsilon, but it is actually the smallest positive `double` greater than zero (about 4.94e-324), which is far smaller still.
You can't use `Double.Epsilon` as the `tolerance`. When adopting it, the comparison between the sum of ten 0.1s and 1.0 in the first example returns false. Machine epsilon is also too small, because the cumulative errors of arithmetic operations easily exceed it.
x: 0.999999999999999889, y: 1.000000000000000000,
UseMachineEpsilonAsTolerance: False
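A minimal sketch contrasting the two constants. In .NET, `double.Epsilon` is the smallest positive `double` (about 4.94e-324), while the gap between 1.0 and the next `double` is 2^-52; the constant `MachineEpsilon` below is defined by hand for illustration and is not a .NET API. The class and method names are assumptions as well.

```csharp
using System;

public static class EpsilonDemo
{
    // 2^-52: the gap between 1.0 and the next larger double (hand-written constant).
    public const double MachineEpsilon = 2.220446049250313e-16;

    public static bool TolerateRelativeAndAbsError(double x, double y, double tolerance)
    {
        double diff = Math.Abs(x - y);
        return diff <= tolerance
            || diff <= Math.Max(Math.Abs(x), Math.Abs(y)) * tolerance;
    }

    // Adds 0.1 to `start` ten times, rounding after every addition.
    public static double AddTenTenths(double start)
    {
        double x = start;
        for (int i = 0; i < 10; i++) x += 0.1;
        return x;
    }

    public static void Main()
    {
        Console.WriteLine(1.0 + MachineEpsilon > 1.0);  // True
        Console.WriteLine(1.0 + double.Epsilon == 1.0); // True: too small to register

        // double.Epsilon as the tolerance rejects even the sum of ten 0.1s.
        Console.WriteLine(TolerateRelativeAndAbsError(AddTenTenths(0.0), 1.0, double.Epsilon)); // False

        // 2^-52 is already exceeded after ten additions near 1e6.
        double x = AddTenTenths(1e6), y = 1e6 + 1.0;
        Console.WriteLine(TolerateRelativeAndAbsError(x, y, MachineEpsilon)); // False
        Console.WriteLine(TolerateRelativeAndAbsError(x, y, 1e-10));          // True
    }
}
```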
After all, the `tolerance` should be based on the precision your application requires. It should also be much larger than machine epsilon.
Conclusion
The equality comparison of floating-point numbers should take into account both the absolute and relative errors. Furthermore, the error tolerance in the comparison should be chosen based on the accuracy your application requires. If you would like more detail on this topic, see Comparing Floating Point Numbers, 2012 Edition.
This GitHub repository has the above examples and implementations of `EqualityComparer<double>` and `IComparer<double>` based on this article.
Hey!
Great Article!
I've found a problem with your final solution that may be worth noting:
When you choose a `tolerance` that is too big (e.g. 1.0) and x & y are specific values (like 0 & 10), the absolute error checker will correctly detect that it's not equal but the relative error checker will not because of the `Max(...) * tolerance` part. The result will be the wrong one because of the `||` operation.
Other (x, y, tolerance) number pairs will also trigger this error, like (100, 80, 0.3). I'm sure you can think of more.
"Simply choose a smaller tolerance value." won't be a solution in those cases, where a tolerance value that high is required by the context.
We should probably add a precondition to the relative error checker.
I had the idea to check if the difference is much greater than the tolerance. A way to do that that came to my mind was: `diff >= tolerance * diff`. I realized that this wouldn't work when diff & tolerance would be both less than 0, so in those cases, we could invert diff with `1 / diff`.
What do you think?
best regards,
Jack
The relative error is a ratio; for two numbers of the same sign it lies between 0% and 100%. When you specify 1.0 as the tolerance, you explicitly specify that any two such numbers are equal. A tolerance of 0.3 means 30%. You should carefully select a sufficiently small number as the tolerance.
Ah OK, thank you for explaining that. That wasn't clear to me.
It still bugs me that in the first part the tolerance is interpreted as an absolute value and in the second part as a percentage. In my opinion that is bad design, because traditionally someone like me only has the meaning of an absolute value in mind when it comes to equality checks and is then surprised when the chosen tolerance doesn't fit.
I'm sorry that the article lacks the explanation.
I agree with your point. It's a bad design to use the same value in double meanings in one method.
But the method has been working well in my applications for many years.
I will add more explanation of the tolerance.