diff options
author | Fred Drake <fdrake@acm.org> | 2001-06-08 16:24:58 (GMT) |
---|---|---|
committer | Fred Drake <fdrake@acm.org> | 2001-06-08 16:24:58 (GMT) |
commit | 417d667dd925c470c175d36e6ca8bff28dc1d36a (patch) | |
tree | 33a3daabd52d87bb6f842a66634e9e04fec95730 | |
parent | 1dc98c44fe31ac7f7256d314e1eaf753b592aca4 (diff) | |
download | cpython-417d667dd925c470c175d36e6ca8bff28dc1d36a.zip cpython-417d667dd925c470c175d36e6ca8bff28dc1d36a.tar.gz cpython-417d667dd925c470c175d36e6ca8bff28dc1d36a.tar.bz2 |
Text from Tim & Guido discussing floating point arithmetic and what users
need to understand about the binary & decimal fp, so that representation
weirdness is documented somewhere. This makes it easier to repond to "bug"
reports caused by user confusion & ignorance of the issues.
This closes SF patch #426208.
-rw-r--r-- | Doc/tut/tut.tex | 265 |
1 files changed, 265 insertions, 0 deletions
diff --git a/Doc/tut/tut.tex b/Doc/tut/tut.tex index 8fe90c4..2749a51 100644 --- a/Doc/tut/tut.tex +++ b/Doc/tut/tut.tex @@ -4085,4 +4085,269 @@ check (or even suggest) matching parentheses, quotes, etc., would also be useful. +\chapter{Floating Point Arithmetic: Issues and Limitations + \label{fp-issues}} + +Floating-point numbers are represented in computer hardware as +base 2 (binary) fractions. For example, the decimal fraction + +\begin{verbatim} +0.125 +\end{verbatim} + +has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction + +\begin{verbatim} +0.001 +\end{verbatim} + +has value 0/2 + 0/4 + 1/8. These two fractions have identical values, +the only real difference being that the first is written in base 10 +fractional notation, and the second in base 2. + +Unfortunately, most decimal fractions cannot be represented exactly as +binary fractions. A consequence is that, in general, the decimal +floating-point numbers you enter are only approximated by the binary +floating-point numbers actually stored in the machine. + +The problem is easier to understand at first in base 10. Consider the +fraction 1/3. You can approximate that as a base 10 fraction: + +\begin{verbatim} +0.3 +\end{verbatim} + +or, better, + +\begin{verbatim} +0.33 +\end{verbatim} + +or, better, + +\begin{verbatim} +0.333 +\end{verbatim} + +and so on. No matter how many digits you're willing to write down, the +result will never be exactly 1/3, but will be an increasingly better +approximation to 1/3. + +In the same way, no matter how many base 2 digits you're willing to +use, the decimal value 0.1 cannot be represented exactly as a base 2 +fraction. In base 2, 1/10 is the infinitely repeating fraction + +\begin{verbatim} +0.0001100110011001100110011001100110011001100110011... +\end{verbatim} + +Stop at any finite number of bits, and you get an approximation. This +is why you see things like: + +\begin{verbatim} +>>> 0.1 +0.10000000000000001 +\end{verbatim} + +On most machines today, that is what you'll see if you enter 0.1 at +a Python prompt. You may not, though, because the number of bits +used by the hardware to store floating-point values can vary across +machines, and Python only prints a decimal approximation to the true +decimal value of the binary approximation stored by the machine. On +most machines, if Python were to print the true decimal value of +the binary approximation stored for 0.1, it would have to display + +\begin{verbatim} +>>> 0.1 +0.1000000000000000055511151231257827021181583404541015625 +\end{verbatim} + +instead! The Python prompt (implicitly) uses the builtin +\function{repr()} function to obtain a string version of everything it +displays. For floats, \code{repr(\var{float})} rounds the true +decimal value to 17 significant digits, giving + +\begin{verbatim} +0.10000000000000001 +\end{verbatim} + +\code{repr(\var{float})} produces 17 significant digits because it +turns out that's enough (on most machines) so that +\code{eval(repr(\var{x})) == \var{x}} exactly for all finite floats +\var{x}, but rounding to 16 digits is not enough to make that true. + +Note that this is in the very nature of binary floating-point: this is +not a bug in Python, it is not a bug in your code either, and you'll +see the same kind of thing in all languages that support your +hardware's floating-point arithmetic. + +Python's builtin \function{str()} function produces only 12 +significant digits, and you may wish to use that instead. It's +unusual for \code{eval(str(\var{x}))} to reproduce \var{x}, but the +output may be more pleasant to look at: + +\begin{verbatim} +>>> print str(0.1) +0.1 +\end{verbatim} + +It's important to realize that this is, in a real sense, an illusion: +the value in the machine is not exactly 1/10, you're simply rounding +the \emph{display} of the true machine value. + +Other surprises follow from this one. For example, after seeing + +\begin{verbatim} +>>> 0.1 +0.10000000000000001 +\end{verbatim} + +you may be tempted to use the \function{round()} function to chop it +back to the single digit you expect. But that makes no difference: + +\begin{verbatim} +>>> round(0.1, 1) +0.10000000000000001 +\end{verbatim} + +The problem is that the binary floating-point value stored for "0.1" +was already the best possible binary approximation to 1/10, so trying +to round it again can't make it better: it was already as good as it +gets. + +Another consequence is that since 0.1 is not exactly 1/10, adding 0.1 +to itself 10 times may not yield exactly 1.0, either: + +\begin{verbatim} +>>> sum = 0.0 +>>> for i in range(10): +... sum += 0.1 +... +>>> sum +0.99999999999999989 +\end{verbatim} + +Binary floating-point arithmetic holds many surprises like this. The +problem with "0.1" is explained in precise detail below, in the +"Representation Error" section. See +\citetitle[http://www.lahey.com/float.htm]{The Perils of Floating +Point} for a more complete account of other common surprises. + +As that says near the end, ``there are no easy answers.'' Still, +don't be unduly wary of floating-point! The errors in Python float +operations are inherited from the floating-point hardware, and on most +machines are on the order of no more than 1 part in 2**53 per +operation. That's more than adequate for most tasks, but you do need +to keep in mind that it's not decimal arithmetic, and that every float +operation can suffer a new rounding error. + +While pathological cases do exist, for most casual use of +floating-point arithmetic you'll see the result you expect in the end +if you simply round the display of your final results to the number of +decimal digits you expect. \function{str()} usually suffices, and for +finer control see the discussion of Pythons's \code{\%} format +operator: the \code{\%g}, \code{\%f} and \code{\%e} format codes +supply flexible and easy ways to round float results for display. + + +\section{Representation Error + \label{fp-error}} + +This section explains the ``0.1'' example in detail, and shows how +you can perform an exact analysis of cases like this yourself. Basic +familiarity with binary floating-point representation is assumed. + +\dfn{Representation error} refers to that some (most, actually) +decimal fractions cannot be represented exactly as binary (base 2) +fractions. This is the chief reason why Python (or Perl, C, \Cpp, +Java, Fortran, and many others) often won't display the exact decimal +number you expect: + +\begin{verbatim} +>>> 0.1 +0.10000000000000001 +\end{verbatim} + +Why is that? 1/10 is not exactly representable as a binary fraction. +Almost all machines today (November 2000) use IEEE-754 floating point +arithmetic, and almost all platforms map Python floats to IEEE-754 +"double precision". 754 doubles contain 53 bits of precision, so on +input the computer strives to convert 0.1 to the closest fraction it can +of the form \var{J}/2**\var{N} where \var{J} is an integer containing +exactly 53 bits. Rewriting + +\begin{verbatim} + 1 / 10 ~= J / (2**N) +\end{verbatim} + +as + +\begin{verbatim} +J ~= 2**N / 10 +\end{verbatim} + +and recalling that \var{J} has exactly 53 bits (is \code{>= 2**52} but +\code{< 2**53}), the best value for \var{N} is 56: + +\begin{verbatim} +>>> 2L**52 +4503599627370496L +>>> 2L**53 +9007199254740992L +>>> 2L**56/10 +7205759403792793L +\end{verbatim} + +That is, 56 is the only value for \var{N} that leaves \var{J} with +exactly 53 bits. The best possible value for \var{J} is then that +quotient rounded: + +\begin{verbatim} +>>> q, r = divmod(2L**56, 10) +>>> r +6L +\end{verbatim} + +Since the remainder is more than half of 10, the best approximation is +obtained by rounding up: + +\begin{verbatim} +>>> q+1 +7205759403792794L +\end{verbatim} + +Therefore the best possible approximation to 1/10 in 754 double +precision is that over 2**56, or + +\begin{verbatim} +7205759403792794 / 72057594037927936 +\end{verbatim} + +Note that since we rounded up, this is actually a little bit larger than +1/10; if we had not rounded up, the quotient would have been a little +bit smaller than 1/10. But in no case can it be *exactly* 1/10! + +So the computer never ``sees'' 1/10: what it sees is the exact +fraction given above, the best 754 double approximation it can get: + +\begin{verbatim} +>>> .1 * 2L**56 +7205759403792794.0 +\end{verbatim} + +If we multiply that fraction by 10**30, we can see the (truncated) +value of its 30 most significant decimal digits: + +\begin{verbatim} +>>> 7205759403792794L * 10L**30 / 2L**56 +100000000000000005551115123125L +\end{verbatim} + +meaning that the exact number stored in the computer is approximately +equal to the decimal value 0.100000000000000005551115123125. Rounding +that to 17 significant digits gives the 0.10000000000000001 that Python +displays (well, will display on any 754-conforming platform that does +best-possible input and output conversions in its C library --- yours may +not!). + \end{document} |