author Andrew M. Kuchling <amk@amk.ca> 2004-07-04 01:26:42 (GMT)
committer Andrew M. Kuchling <amk@amk.ca> 2004-07-04 01:26:42 (GMT)
commit c8f8a814e2254e2999344335e5f46dc104599ec1
tree 6815dd88c003760953a1826cecb109e05299bbb4 /Doc
parent 49a5fe107fceb0239d87119842b20a7421043b5a
Rewrite two sections
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/whatsnew/whatsnew24.tex 342
1 files changed, 247 insertions, 95 deletions
diff --git a/Doc/whatsnew/whatsnew24.tex b/Doc/whatsnew/whatsnew24.tex
index fd60621..6482cb0 100644
--- a/Doc/whatsnew/whatsnew24.tex
+++ b/Doc/whatsnew/whatsnew24.tex
@@ -2,6 +2,10 @@
\usepackage{distutils}
% $Id$
+% Don't write extensive text for new sections; I'll do that.
+% Feel free to add commented-out reminders of things that need
+% to be covered. --amk
+
\title{What's New in Python 2.4}
\release{0.0}
\author{A.M.\ Kuchling}
@@ -89,73 +93,61 @@ Greg Wilson and ultimately implemented by Raymond Hettinger.}
XXX write this.
%======================================================================
-\section{PEP 229: Generator Expressions}
-
-Now, simple generators can be coded succinctly as expressions using a syntax
-like list comprehensions but with parentheses instead of brackets. These
-expressions are designed for situations where the generator is used right
-away by an enclosing function. Generator expressions are more compact but
-less versatile than full generator definitions and they tend to be more memory
-friendly than equivalent list comprehensions.
-
-\begin{verbatim}
- g = (tgtexp for var1 in exp1 for var2 in exp2 if exp3)
-\end{verbatim}
-
-is equivalent to:
+\section{PEP 289: Generator Expressions}
+
+The iterator feature introduced in Python 2.2 makes it easier to write
+programs that loop through large data sets without having the entire
+data set in memory at one time. Programmers can use iterators and the
+\module{itertools} module to write code in a fairly functional style.
+
+The fly in the ointment has been list comprehensions, because they
+produce a Python list object containing all of the items, unavoidably
+pulling them all into memory. When trying to write a program using
+the functional approach, it would be natural to write something like:
\begin{verbatim}
- def __gen(exp):
- for var1 in exp:
- for var2 in exp2:
- if exp3:
- yield tgtexp
- g = __gen(iter(exp1))
- del __gen
+links = [link for link in get_all_links() if not link.followed]
+for link in links:
+ ...
\end{verbatim}
-The advantage over full generator definitions is in economy of
-expression. Their advantage over list comprehensions is in saving
-memory by creating data only when it is needed rather than forming
-a whole list is memory all at once. Applications using memory
-friendly generator expressions may scale-up to high volumes of data
-more readily than with list comprehensions.
-
-Generator expressions are best used in functions that consume their
-data all at once and would not benefit from having a full list instead
-of a generator as an input:
+instead of
\begin{verbatim}
->>> sum(i*i for i in range(10))
-285
-
->>> sorted(set(i*i for i in xrange(-20, 20) if i%2==1)) # odd squares
-[1, 9, 25, 49, 81, 121, 169, 225, 289, 361]
+for link in get_all_links():
+ if link.followed:
+ continue
+ ...
+\end{verbatim}
->>> from itertools import izip
->>> xvec = [10, 20, 30]
->>> yvec = [7, 5, 3]
->>> sum(x*y for x,y in izip(xvec, yvec)) # dot product
-260
+The first form is more concise and perhaps more readable, but if
+you're dealing with a large number of link objects the second form
+would have to be used.
->>> from math import pi, sin
->>> sine_table = dict((x, sin(x*pi/180)) for x in xrange(0, 91))
+Generator expressions work similarly to list comprehensions but don't
+materialize the entire list; instead they create a generator that will
+return elements one by one. The above example could be written as:
->>> unique_words = set(word for line in page for word in line.split())
+\begin{verbatim}
+links = (link for link in get_all_links() if not link.followed)
+for link in links:
+ ...
+\end{verbatim}
->>> valedictorian = max((student.gpa, student.name) for student in graduates)
+Generator expressions always have to be written inside parentheses, as
+in the above example. The parentheses signalling a function call also
+count, so if you want to create an iterator that will be immediately
+passed to a function you could write:
-\end{verbatim}
+\begin{verbatim}
+print sum(obj.count for obj in list_all_objects())
+\end{verbatim}
-For more complex uses of generators, it is strongly recommended that
-the traditional full generator definitions be used instead. In a
-generator expression, the first for-loop expression is evaluated
-as soon as the expression is defined while the other expressions do
-not get evaluated until the generator is run. This nuance is never
-an issue when the generator is used immediately; however, if it is not
-used right away, a full generator definition would be much more clear
-about when the sub-expressions are evaluated and would be more obvious
-about the visibility and lifetime of the variables.
+There are some small differences from list comprehensions. Most
+notably, the loop variable (\var{obj} in the above example) is not
+accessible outside of the generator expression. List comprehensions
+leave the variable assigned to its last value; future versions of
+Python will change this, making list comprehensions match generator
+expressions in this respect.
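The difference between the two forms can be sketched as follows. This is only an illustration written for current Python syntax; \code{get_numbers()} is a hypothetical stand-in for a data source such as \code{get_all_links()} and is not part of any library.

```python
def get_numbers(limit):
    """Hypothetical data source standing in for get_all_links()."""
    for n in range(limit):
        yield n

# A list comprehension builds every element before the loop can start.
as_list = [n * n for n in get_numbers(1000) if n % 2]

# A generator expression produces elements only when they are requested.
as_gen = (n * n for n in get_numbers(1000) if n % 2)
first_three = [next(as_gen) for _ in range(3)]

# Passed directly to a function, the call's parentheses are enough.
total = sum(n * n for n in get_numbers(1000) if n % 2)
```

Only three squares are computed for \code{first\_three}; the rest of \code{as\_gen} is never evaluated unless something asks for it.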
\begin{seealso}
\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and
@@ -203,62 +195,222 @@ root:*:0:0:System Administrator:/var/root:/bin/tcsh
%======================================================================
\section{PEP 327: Decimal Data Type}
-A new module, \module{decimal}, offers a \class{Decimal} data type for
-decimal floating point arithmetic. Compared to the built-in \class{float}
-type implemented with binary floating point, the new class is especially
-useful for financial applications and other uses which require exact
-decimal representation, control over precision, control over rounding
-to meet legal or regulatory requirements, tracking of significant
-decimal places, or for applications where the user expects the results
-to match hand calculations done the way they were taught in school.
+Python has always supported floating-point (FP) numbers as a data
+type, based on the underlying C \ctype{double} type. However, while
+most programming languages provide a floating-point type, most people
+(even programmers) are unaware that computing with floating-point
+numbers entails certain unavoidable inaccuracies. The new decimal
+type provides a way to avoid these inaccuracies.
-For example, calculating a 5% tax on a 70 cent phone charge gives
-different results in decimal floating point and binary floating point
-with the difference being significant when rounding to the nearest
-cent:
+\subsection{Why is Decimal needed?}
+The limitations arise from the representation used for floating-point numbers.
+FP numbers are made up of three components:
+
+\begin{itemize}
+\item The sign, which is -1 or +1.
+\item The mantissa, which is a single-digit binary number
+followed by a fractional part. For example, \code{1.01} in base-2 notation
+is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation.
+\item The exponent, which tells where the binary point is located in
+the number represented.
+\end{itemize}
+
+For example, the number 1.25 has sign +1, mantissa 1.01 (in binary),
+and exponent of 0 (the binary point doesn't need to be shifted). The
+number 5 has the same sign and mantissa, but the exponent is 2
+because the mantissa is multiplied by 4 (2 to the power of the exponent 2).
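The three components can be inspected from Python. The following is a rough sketch, not part of the patch: \code{decompose()} is a hypothetical helper built on the standard \function{math.frexp()} function, rescaled so that the mantissa has the single leading digit described above.

```python
import math

def decompose(x):
    """Split a nonzero float into (sign, mantissa, exponent), 1 <= mantissa < 2."""
    sign = -1 if math.copysign(1.0, x) < 0 else 1
    # math.frexp() returns m in [0.5, 1) with abs(x) == m * 2**e,
    # so double m and decrement e to get the 1.xxx form.
    m, e = math.frexp(abs(x))
    return sign, m * 2, e - 1

print(decompose(1.25))  # (1, 1.25, 0)
print(decompose(5.0))   # (1, 1.25, 2)
```

The mantissa is shown in decimal here; 1.25 is the binary value \code{1.01} from the list above.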
+
+Modern systems usually provide floating-point support that conforms to
+a relevant standard called IEEE 754. C's \ctype{double} type is
+usually implemented as a 64-bit IEEE 754 number, which uses 52 bits of
+space for the mantissa. This means that numbers can only be specified
+to 52 bits of precision. If you're trying to represent numbers whose
+expansion repeats endlessly, the expansion is cut off after 52 bits.
+Unfortunately, most software needs to produce output in base 10, and
+many common base-10 fractions have binary expansions that repeat
+endlessly. For example, 1.1 decimal is binary \code{1.0001100110011 ...};
+the fractional part .1 is 1/16 + 1/32 + 1/256 plus an infinite number
+of additional terms. IEEE 754 has to chop off that infinitely
+repeating expansion after 52 binary digits, so the representation is
+slightly inaccurate.
+
+Sometimes you can see this inaccuracy when the number is printed:
\begin{verbatim}
->>> from decimal import *
->>> Decimal('0.70') * Decimal('1.05')
-Decimal("0.7350")
->>> .70 * 1.05
-0.73499999999999999
+>>> 1.1
+1.1000000000000001
\end{verbatim}
-Note that the \class{Decimal} result keeps a trailing zero, automatically
-inferring four place significance from two digit mulitiplicands. A key
-goal is to reproduce the mathematics we do by hand and avoid the tricky
-issues that arise when decimal numbers cannot be represented exactly in
-binary floating point.
+The inaccuracy isn't always visible when you print the number because
+the FP-to-decimal-string conversion is provided by the C library, and
+most C libraries try to produce sensible output, but the inaccuracy is
+still there and subsequent operations can magnify the error.
+
+For many applications this doesn't matter. If I'm plotting points and
+displaying them on my monitor, the difference between 1.1 and
+1.1000000000000001 is too small to be visible. Reports often limit
+output to a certain number of decimal places, and if you round the
+number to two or three or even eight decimal places, the error is
+never apparent. However, for applications where it does matter,
+it's a lot of work to implement your own custom arithmetic routines.
+
+\subsection{The \class{Decimal} type}
-Exact representation enables the \class{Decimal} class to perform
-modulo calculations and equality tests that would fail in binary
-floating point:
+A new module, \module{decimal}, was added to Python's standard library.
+It contains two classes, \class{Decimal} and \class{Context}.
+\class{Decimal} instances represent numbers, and
+\class{Context} instances are used to wrap up various settings such as
+the precision and default rounding mode.
+
+\class{Decimal} instances, like regular Python integers and FP numbers,
+are immutable; once one has been created, you can't change the value
+it represents.
+\class{Decimal} instances can be created from integers or strings:
\begin{verbatim}
->>> Decimal('1.00') % Decimal('.10')
-Decimal("0.00")
->>> 1.00 % 0.10
-0.09999999999999995
-
->>> sum([Decimal('0.1')]*10) == Decimal('1.0')
-True
->>> sum([0.1]*10) == 1.0
-False
+>>> import decimal
+>>> decimal.Decimal(1972)
+Decimal("1972")
+>>> decimal.Decimal("1.1")
+Decimal("1.1")
+\end{verbatim}
+
+You can also provide tuples containing the sign, mantissa represented
+as a tuple of decimal digits, and exponent:
+
+\begin{verbatim}
+>>> decimal.Decimal((1, (1, 4, 7, 5), -2))
+Decimal("-14.75")
+\end{verbatim}
+
+Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is negative.
+
+Floating-point numbers posed a bit of a problem: should the FP number
+representing 1.1 turn into the decimal number for exactly 1.1, or for
+1.1 plus whatever inaccuracies are introduced? The decision was to
+leave such a conversion out of the API. Instead, you should convert
+the floating-point number into a string using the desired precision and
+pass the string to the \class{Decimal} constructor:
+
+\begin{verbatim}
+>>> f = 1.1
+>>> decimal.Decimal(str(f))
+Decimal("1.1")
+>>> decimal.Decimal(repr(f))
+Decimal("1.1000000000000001")
\end{verbatim}
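An explicit format string makes the chosen precision visible instead of leaving it to \function{str()} or \function{repr()}. This is a minimal sketch; the variable names are illustrative, and the digit counts are arbitrary choices rather than recommendations.

```python
from decimal import Decimal

f = 1.1

# Format with a chosen number of significant digits, then pass the
# resulting string to the Decimal constructor.
short = Decimal('%.2g' % f)    # keep two significant digits
full = Decimal('%.17g' % f)    # 17 digits expose the binary rounding error

print(short)  # 1.1
print(full)   # 1.1000000000000001
```

Later Python releases changed \function{repr()} for floats to return the shortest string that round-trips, so the \code{repr(f)} call above may display \code{1.1} there; the format-string version does not depend on that behaviour.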
-The \module{decimal} module also allows arbitrarily large precisions to be
-set for calculation:
+Once you have \class{Decimal} instances, you can perform the usual
+mathematical operations on them. One limitation: exponentiation
+requires an integer exponent:
\begin{verbatim}
->>> getcontext().prec = 24
->>> Decimal(1) / Decimal(7)
-Decimal("0.142857142857142857142857")
+>>> a = decimal.Decimal('35.72')
+>>> b = decimal.Decimal('1.73')
+>>> a+b
+Decimal("37.45")
+>>> a-b
+Decimal("33.99")
+>>> a*b
+Decimal("61.7956")
+>>> a/b
+Decimal("20.6473988")
+>>> a ** 2
+Decimal("1275.9184")
+>>> a ** b
+Decimal("NaN")
\end{verbatim}
+
+You can combine \class{Decimal} instances with integers, but not with
+floating-point numbers:
+
+\begin{verbatim}
+>>> a + 4
+Decimal("39.72")
+>>> a + 4.5
+Traceback (most recent call last):
+ ...
+TypeError: You can interact Decimal only with int, long or Decimal data types.
+>>>
+\end{verbatim}
+
+\class{Decimal} numbers can be used with the \module{math} and
+\module{cmath} modules, though you'll get back a regular
+floating-point number and not a \class{Decimal}. Instances also have
+a \method{sqrt()} method:
+
+\begin{verbatim}
+>>> import math, cmath
+>>> d = decimal.Decimal('123456789012.345')
+>>> math.sqrt(d)
+351364.18288201344
+>>> cmath.sqrt(-d)
+351364.18288201344j
+>>> d.sqrt()
+Decimal("351364.1828820134592177245001")
+\end{verbatim}
+
+
+\subsection{The \class{Context} type}
+
+Instances of the \class{Context} class encapsulate several settings for
+decimal operations:
+
+\begin{itemize}
+ \item \member{prec} is the precision, the number of significant digits
+ kept in results.
+ \item \member{rounding} specifies the rounding mode. The \module{decimal}
+ module has constants for the various possibilities:
+ \constant{ROUND_DOWN}, \constant{ROUND_CEILING}, \constant{ROUND_HALF_EVEN}, and various others.
+ \item \member{trap_enablers} is a dictionary specifying what happens on
+encountering certain error conditions: either an exception is raised or
+a value is returned. Some examples of error conditions are
+division by zero, loss of precision, and overflow.
+\end{itemize}
+
+There's a thread-local default context available by calling
+\function{getcontext()}; you can change the properties of this context
+to alter the default precision, rounding, or trap handling.
+
+\begin{verbatim}
+>>> decimal.getcontext().prec
+28
+>>> decimal.Decimal(1) / decimal.Decimal(7)
+Decimal("0.1428571428571428571428571429")
+>>> decimal.getcontext().prec = 9
+>>> decimal.Decimal(1) / decimal.Decimal(7)
+Decimal("0.142857143")
+\end{verbatim}
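If changing the thread-wide default is undesirable, a separate \class{Context} instance can carry the setting instead. The sketch below assumes the \method{divide()} method that \class{Context} provides in released versions of the module; the variable names are illustrative.

```python
import decimal

one = decimal.Decimal(1)
seven = decimal.Decimal(7)

# A private context; the context returned by getcontext() is untouched.
ctx = decimal.Context(prec=9)
nine_digits = ctx.divide(one, seven)

print(nine_digits)  # 0.142857143
```

Arithmetic written with ordinary operators such as \code{one / seven} still uses the context returned by \function{getcontext()}.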
+
+The default action for error conditions is to return a special value
+such as infinity or not-a-number, but you can request that exceptions
+be raised:
+
+\begin{verbatim}
+>>> decimal.Decimal(1) / decimal.Decimal(0)
+Decimal("Infinity")
+>>> decimal.getcontext().trap_enablers[decimal.DivisionByZero] = True
+>>> decimal.Decimal(1) / decimal.Decimal(0)
+Traceback (most recent call last):
+ ...
+decimal.DivisionByZero: x / 0
+>>>
+\end{verbatim}
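The same switch can be sketched on a private context. Two caveats, both assumptions relative to the text above: released versions of the module name the mapping \member{traps} rather than \member{trap_enablers}, and their default context already traps division by zero, so the sketch starts from a context with every trap disabled.

```python
import decimal

one = decimal.Decimal(1)
zero = decimal.Decimal(0)

# Start with no traps enabled, so error conditions return special values.
ctx = decimal.Context(traps=[])
quiet = ctx.divide(one, zero)

# Request an exception for this one condition.
ctx.traps[decimal.DivisionByZero] = True
try:
    ctx.divide(one, zero)
    raised = False
except decimal.DivisionByZero:
    raised = True

print(quiet)   # Infinity
print(raised)  # True
```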
+
+The \class{Context} instance also has various methods for formatting
+numbers such as \method{to_eng_string()} and \method{to_sci_string()}.
+
\begin{seealso}
\seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented
- by Eric Price, Facundo Bastista, Raymond Hettinger, Aahz, and Tim Peters.}
+ by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.}
+
+\seeurl{http://research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html}
+{A more detailed overview of the IEEE-754 representation.}
+
+\seeurl{http://www.lahey.com/float.htm}
+{The article uses Fortran code to illustrate many of the problems
+that floating-point inaccuracy can cause.}
+
+\seeurl{http://www2.hursley.ibm.com/decimal/}
+{A description of a decimal-based representation. This representation
+is being proposed as a standard, and underlies the new Python decimal
+type. Much of this material was written by Mike Cowlishaw, designer of the
+REXX language.}
+
\end{seealso}