From 43737641c227a8e29150adfa4bab310fc8764845 Mon Sep 17 00:00:00 2001 From: "Andrew M. Kuchling" Date: Wed, 30 Aug 2000 00:51:02 +0000 Subject: Removed forgotten text in list comprehensions section (taken from the Haskell description of listcomps and used as inspiration) Rearranged sections (which accounts for much of the size of the diffs) Added section on augmented assignment Mentioned 'print >>file' Broke up the "Core Changes" section into subsections --- Doc/whatsnew/whatsnew20.tex | 476 ++++++++++++++++++++++++-------------------- 1 file changed, 263 insertions(+), 213 deletions(-) diff --git a/Doc/whatsnew/whatsnew20.tex b/Doc/whatsnew/whatsnew20.tex index 3081d16..d3ffcdb 100644 --- a/Doc/whatsnew/whatsnew20.tex +++ b/Doc/whatsnew/whatsnew20.tex @@ -261,7 +261,6 @@ while the second one is correct: [ (x,y) for x in seq1 for y in seq2] \end{verbatim} - The idea of list comprehensions originally comes from the functional programming language Haskell (\url{http://www.haskell.org}). Greg Ewing argued most effectively for adding them to Python and wrote the @@ -269,95 +268,45 @@ initial list comprehension patch, which was then discussed for a seemingly endless time on the python-dev mailing list and kept up-to-date by Skip Montanaro. - - - - A list comprehension has the form [ e | q[1], ..., q[n] ], n>=1, where - the q[i] qualifiers are either - * generators of the form p <- e, where p is a pattern (see Section - 3.17) of type t and e is an expression of type [t] - * guards, which are arbitrary expressions of type Bool - * local bindings that provide new definitions for use in the - generated expression e or subsequent guards and generators. - - % ====================================================================== -\section{Distutils: Making Modules Easy to Install} - -Before Python 2.0, installing modules was a tedious affair -- there -was no way to figure out automatically where Python is installed, or -what compiler options to use for extension modules. 
Software authors -had to go through an ardous ritual of editing Makefiles and -configuration files, which only really work on Unix and leave Windows -and MacOS unsupported. Software users faced wildly differing -installation instructions - -The SIG for distribution utilities, shepherded by Greg Ward, has -created the Distutils, a system to make package installation much -easier. They form the \module{distutils} package, a new part of -Python's standard library. In the best case, installing a Python -module from source will require the same steps: first you simply mean -unpack the tarball or zip archive, and the run ``\code{python setup.py -install}''. The platform will be automatically detected, the compiler -will be recognized, C extension modules will be compiled, and the -distribution installed into the proper directory. Optional -command-line arguments provide more control over the installation -process, the distutils package offers many places to override defaults --- separating the build from the install, building or installing in -non-default directories, and more. - -In order to use the Distutils, you need to write a \file{setup.py} -script. For the simple case, when the software contains only .py -files, a minimal \file{setup.py} can be just a few lines long: - -\begin{verbatim} -from distutils.core import setup -setup (name = "foo", version = "1.0", - py_modules = ["module1", "module2"]) -\end{verbatim} - -The \file{setup.py} file isn't much more complicated if the software -consists of a few packages: +\section{Augmented Assignment} + +Augmented assignment operators, another long-requested feature, have +been added to Python 2.0. Augmented assignment operators include +\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the +statement \code{a += 2} increments the value of the variable \code{a} +by 2, equivalent to the slightly lengthier +\code{a = a + 2}. 
+ +The full list of supported assignment operators is \code{+=}, +\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=}, +\code{|=}, \code{^=}, \code{>>=}, and \code{<<=}. Python classes can +override the augmented assignment operators by defining methods named +\method{__iadd__}, \method{__isub__}, etc. For example, the following +\class{Number} class stores a number and supports using += to create a +new instance with an incremented value. \begin{verbatim} -from distutils.core import setup -setup (name = "foo", version = "1.0", - packages = ["package", "package.subpackage"]) +class Number: + def __init__(self, value): + self.value = value + def __iadd__(self, increment): + return Number( self.value + increment) + +n = Number(5) +n += 3 +print n.value \end{verbatim} -A C extension can be the most complicated case; here's an example taken from -the PyXML package: +The \method{__iadd__} special method is called with the value of the +increment, and should return a new instance with an appropriately +modified value; this return value is bound as the new value of the +variable on the left-hand side. - -\begin{verbatim} -from distutils.core import setup, Extension - -expat_extension = Extension('xml.parsers.pyexpat', - define_macros = [('XML_NS', None)], - include_dirs = [ 'extensions/expat/xmltok', - 'extensions/expat/xmlparse' ], - sources = [ 'extensions/pyexpat.c', - 'extensions/expat/xmltok/xmltok.c', - 'extensions/expat/xmltok/xmlrole.c', - ] - ) -setup (name = "PyXML", version = "0.5.4", - ext_modules =[ expat_extension ] ) - -\end{verbatim} - -The Distutils can also take care of creating source and binary -distributions. The ``sdist'' command, run by ``\code{python setup.py -sdist}', builds a source distribution such as \file{foo-1.0.tar.gz}. -Adding new commands isn't difficult, ``bdist_rpm'' and -``bdist_wininst'' commands have already been contributed to create an -RPM distribution and a Windows installer for the software, -respectively. 
Commands to create other distribution formats such as -Debian packages and Solaris \file{.pkg} files are in various stages of -development. - -All this is documented in a new manual, \textit{Distributing Python -Modules}, that joins the basic set of Python documentation. +Augmented assignment operators were first introduced in the C +programming language, and most C-derived languages, such as +\program{awk}, C++, Java, Perl, and PHP also support them. The augmented +assignment patch was implemented by Thomas Wouters. % ====================================================================== \section{String Methods} @@ -384,9 +333,10 @@ string manipulation functionality available through methods on both 2 \end{verbatim} -One thing that hasn't changed, April Fools' jokes notwithstanding, is -that Python strings are immutable. Thus, the string methods return new -strings, and do not modify the string on which they operate. +One thing that hasn't changed, a noteworthy April Fools' joke +notwithstanding, is that Python strings are immutable. Thus, the +string methods return new strings, and do not modify the string on +which they operate. The old \module{string} module is still around for backwards compatibility, but it mostly acts as a front-end to the new string @@ -467,115 +417,23 @@ March 2000 archives of the python-dev mailing list contain most of the relevant discussion, especially in the threads titled ``Reference cycle collection for Python'' and ``Finalization again''. - -% ====================================================================== -%\section{New XML Code} - -%XXX write this section... - % ====================================================================== -\section{Porting to 2.0} - -New Python releases try hard to be compatible with previous releases, -and the record has been pretty good. 
However, some changes are -considered useful enough, often fixing initial design decisions that -turned to be actively mistaken, that breaking backward compatibility -can't always be avoided. This section lists the changes in Python 2.0 -that may cause old Python code to break. - -The change which will probably break the most code is tightening up -the arguments accepted by some methods. Some methods would take -multiple arguments and treat them as a tuple, particularly various -list methods such as \method{.append()} and \method{.insert()}. -In earlier versions of Python, if \code{L} is a list, \code{L.append( -1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this -causes a \exception{TypeError} exception to be raised, with the -message: 'append requires exactly 1 argument; 2 given'. The fix is to -simply add an extra set of parentheses to pass both values as a tuple: -\code{L.append( (1,2) )}. - -The earlier versions of these methods were more forgiving because they -used an old function in Python's C interface to parse their arguments; -2.0 modernizes them to use \function{PyArg_ParseTuple}, the current -argument parsing function, which provides more helpful error messages -and treats multi-argument calls as errors. If you absolutely must use -2.0 but can't fix your code, you can edit \file{Objects/listobject.c} -and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to -preserve the old behaviour; this isn't recommended. - -Some of the functions in the \module{socket} module are still -forgiving in this way. For example, \function{socket.connect( -('hostname', 25) )} is the correct form, passing a tuple representing -an IP address, but \function{socket.connect( 'hostname', 25 )} also -works. \function{socket.connect_ex()} and \function{socket.bind()} are -similarly easy-going. 
2.0alpha1 tightened these functions up, but -because the documentation actually used the erroneous multiple -argument form, many people wrote code which would break with the -stricter checking. GvR backed out the changes in the face of public -reaction, so for the\module{socket} module, the documentation was -fixed and the multiple argument form is simply marked as deprecated; -it \emph{will} be tightened up again in a future Python version. - -Some work has been done to make integers and long integers a bit more -interchangeable. In 1.5.2, large-file support was added for Solaris, -to allow reading files larger than 2Gb; this made the \method{tell()} -method of file objects return a long integer instead of a regular -integer. Some code would subtract two file offsets and attempt to use -the result to multiply a sequence or slice a string, but this raised a -\exception{TypeError}. In 2.0, long integers can be used to multiply -or slice a sequence, and it'll behave as you'd intuitively expect it -to; \code{3L * 'abc'} produces 'abcabcabc', and \code{ -(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in -various new places where previously only integers were accepted, such -as in the \method{seek()} method of file objects. - -The subtlest long integer change of all is that the \function{str()} -of a long integer no longer has a trailing 'L' character, though -\function{repr()} still includes it. The 'L' annoyed many people who -wanted to print long integers that looked just like regular integers, -since they had to go out of their way to chop off the character. This -is no longer a problem in 2.0, but code which assumes the 'L' is -there, and does \code{str(longval)[:-1]} will now lose the final -digit. - -Taking the \function{repr()} of a float now uses a different -formatting precision than \function{str()}. \function{repr()} uses -\code{\%.17g} format string for C's \function{sprintf()}, while -\function{str()} uses \code{\%.12g} as before. 
The effect is that -\function{repr()} may occasionally show more decimal places than -\function{str()}, for numbers -For example, the number 8.1 can't be represented exactly in binary, so -\code{repr(8.1)} is \code{'8.0999999999999996'}, while str(8.1) is -\code{'8.1'}. - -The \code{-X} command-line option, which turned all standard -exceptions into strings instead of classes, has been removed; the -standard exceptions will now always be classes. The -\module{exceptions} module containing the standard exceptions was -translated from Python to a built-in C module, written by Barry Warsaw -and Fredrik Lundh. - -% Commented out for now -- I don't think anyone will care. -%The pattern and match objects provided by SRE are C types, not Python -%class instances as in 1.5. This means you can no longer inherit from -%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much -%of a problem since no one should have been doing that in the first -%place. - -% ====================================================================== -\section{Core Changes} +\section{Other Core Changes} Various minor changes have been made to Python's syntax and built-in functions. None of the changes are very far-reaching, but they're handy conveniences. -A change to syntax makes it more convenient to call a given function +\subsection{Minor Language Changes} + +A new syntax makes it more convenient to call a given function with a tuple of arguments and/or a dictionary of keyword arguments. -In Python 1.5 and earlier, you do this with the \function{apply()} +In Python 1.5 and earlier, you'd use the \function{apply()} built-in function: \code{apply(f, \var{args}, \var{kw})} calls the function \function{f()} with the argument tuple \var{args} and the -keyword arguments in the dictionary \var{kw}. Thanks to a patch from -Greg Ewing, 2.0 adds \code{f(*\var{args}, **\var{kw})} as a shorter +keyword arguments in the dictionary \var{kw}. 
\function{apply()} +is the same in 2.0, but thanks to a patch from +Greg Ewing, \code{f(*\var{args}, **\var{kw})} is available as a shorter and clearer way to achieve the same effect. This syntax is symmetrical with the syntax for defining functions: @@ -586,31 +444,31 @@ def f(*args, **kw): ... \end{verbatim} -A new format style is available when using the \code{\%} operator. +The \keyword{print} statement can now have its output directed to a +file-like object by following the \keyword{print} with \code{>> +\var{fileobj}}, similar to the redirection operator in Unix shells. +Previously you'd either have to use the \method{write()} method of the +file-like object, which lacks the convenience and simplicity of +\keyword{print}, or you could assign a new value to \code{sys.stdout} +and then restore the old value. For sending output to standard error, +it's much easier to write this: + +\begin{verbatim} +print >> sys.stderr, "Warning: action field not supplied" +\end{verbatim} + +Modules can now be renamed on importing them, using the syntax +\code{import \var{module} as \var{name}} or \code{from \var{module} +import \var{name} as \var{othername}}. The patch was submitted by +Thomas Wouters. + +A new format style is available when using the \code{\%} operator; '\%r' will insert the \function{repr()} of its argument. This was also added from symmetry considerations, this time for symmetry with the existing '\%s' format style, which inserts the \function{str()} of its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a string containing \verb|'abc' abc|. -A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been -added. \function{zip()} returns a list of tuples where each tuple -contains the i-th element from each of the argument sequences. 
The -difference between \function{zip()} and \code{map(None, \var{seq1}, -\var{seq2})} is that \function{map()} raises an error if the sequences -aren't all of the same length, while \function{zip()} truncates the -returned list to the length of the shortest argument sequence. - -The \function{int()} and \function{long()} functions now accept an -optional ``base'' parameter when the first argument is a string. -\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns -291. \code{int(123, 16)} raises a \exception{TypeError} exception -with the message ``can't convert non-string with explicit base''. - -Modules can now be renamed on importing them, using the syntax -\code{import \var{module} as \var{name}} or \code{from \var{module} -import \var{name} as \var{othername}}. - Previously there was no way to implement a class that overrode Python's built-in \keyword{in} operator and implemented a custom version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is @@ -638,17 +496,20 @@ b.append(b) \end{verbatim} The comparison \code{a==b} returns true, because the two recursive -data structures are isomorphic. -\footnote{See the thread ``trashcan and PR\#7'' in the April 2000 archives of the python-dev mailing list for the discussion leading up to this implementation, and some useful relevant links. +data structures are isomorphic. \footnote{See the thread ``trashcan +and PR\#7'' in the April 2000 archives of the python-dev mailing list +for the discussion leading up to this implementation, and some useful +relevant links. %http://www.python.org/pipermail/python-dev/2000-April/004834.html } Work has been done on porting Python to 64-bit Windows on the Itanium -processor, mostly by Trent Mick of ActiveState. (Confusingly, \code{sys.platform} is still \code{'win32'} on -Win64 because it seems that for ease of porting, MS Visual C++ treats code -as 32 bit. 
-) PythonWin also supports Windows CE; see the Python CE page at -\url{http://starship.python.net/crew/mhammond/ce/} for more information. +processor, mostly by Trent Mick of ActiveState. (Confusingly, +\code{sys.platform} is still \code{'win32'} on Win64 because it seems +that for ease of porting, MS Visual C++ treats code as 32 bit on Itanium.) +PythonWin also supports Windows CE; see the Python CE page at +\url{http://starship.python.net/crew/mhammond/ce/} for more +information. An attempt has been made to alleviate one of Python's warts, the often-confusing \exception{NameError} exception when code refers to a @@ -668,6 +529,22 @@ def f(): f() \end{verbatim} +\subsection{Changes to Built-in Functions} + +A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been +added. \function{zip()} returns a list of tuples where each tuple +contains the i-th element from each of the argument sequences. The +difference between \function{zip()} and \code{map(None, \var{seq1}, +\var{seq2})} is that \function{map()} pads the shorter sequences with +\code{None} if they aren't all of the same length, while \function{zip()} truncates the +returned list to the length of the shortest argument sequence. + +The \function{int()} and \function{long()} functions now accept an +optional ``base'' parameter when the first argument is a string. +\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns +291. \code{int(123, 16)} raises a \exception{TypeError} exception +with the message ``can't convert non-string with explicit base''. + A new variable holding more detailed version information has been added to the \module{sys} module. \code{sys.version_info} is a tuple \code{(\var{major}, \var{minor}, \var{micro}, \var{level}, @@ -692,6 +569,96 @@ else: can be reduced to a single \code{return dict.setdefault(key, [])} statement. 
+ +% ====================================================================== +\section{Porting to 2.0} + +New Python releases try hard to be compatible with previous releases, +and the record has been pretty good. However, some changes are +considered useful enough, often fixing initial design decisions that +turned out to be actively mistaken, that breaking backward compatibility +can't always be avoided. This section lists the changes in Python 2.0 +that may cause old Python code to break. + +The change which will probably break the most code is tightening up +the arguments accepted by some methods. Some methods would take +multiple arguments and treat them as a tuple, particularly various +list methods such as \method{.append()} and \method{.insert()}. +In earlier versions of Python, if \code{L} is a list, \code{L.append( +1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this +causes a \exception{TypeError} exception to be raised, with the +message: 'append requires exactly 1 argument; 2 given'. The fix is to +simply add an extra set of parentheses to pass both values as a tuple: +\code{L.append( (1,2) )}. + +The earlier versions of these methods were more forgiving because they +used an old function in Python's C interface to parse their arguments; +2.0 modernizes them to use \function{PyArg_ParseTuple}, the current +argument parsing function, which provides more helpful error messages +and treats multi-argument calls as errors. If you absolutely must use +2.0 but can't fix your code, you can edit \file{Objects/listobject.c} +and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to +preserve the old behaviour; this isn't recommended. + +Some of the functions in the \module{socket} module are still +forgiving in this way. For example, \function{socket.connect( +('hostname', 25) )} is the correct form, passing a tuple representing +an IP address, but \function{socket.connect( 'hostname', 25 )} also +works. 
\function{socket.connect_ex()} and \function{socket.bind()} are +similarly easy-going. 2.0alpha1 tightened these functions up, but +because the documentation actually used the erroneous multiple +argument form, many people wrote code which would break with the +stricter checking. GvR backed out the changes in the face of public +reaction, so for the \module{socket} module, the documentation was +fixed and the multiple argument form is simply marked as deprecated; +it \emph{will} be tightened up again in a future Python version. + +Some work has been done to make integers and long integers a bit more +interchangeable. In 1.5.2, large-file support was added for Solaris, +to allow reading files larger than 2Gb; this made the \method{tell()} +method of file objects return a long integer instead of a regular +integer. Some code would subtract two file offsets and attempt to use +the result to multiply a sequence or slice a string, but this raised a +\exception{TypeError}. In 2.0, long integers can be used to multiply +or slice a sequence, and it'll behave as you'd intuitively expect it +to; \code{3L * 'abc'} produces 'abcabcabc', and \code{ +(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in +various new places where previously only integers were accepted, such +as in the \method{seek()} method of file objects. + +The subtlest long integer change of all is that the \function{str()} +of a long integer no longer has a trailing 'L' character, though +\function{repr()} still includes it. The 'L' annoyed many people who +wanted to print long integers that looked just like regular integers, +since they had to go out of their way to chop off the character. This +is no longer a problem in 2.0, but code which assumes the 'L' is +there, and does \code{str(longval)[:-1]} will now lose the final +digit. + +Taking the \function{repr()} of a float now uses a different +formatting precision than \function{str()}. 
\function{repr()} uses +the \code{\%.17g} format string for C's \function{sprintf()}, while +\function{str()} uses \code{\%.12g} as before. The effect is that +\function{repr()} may occasionally show more decimal places than +\function{str()}, for certain numbers. +For example, the number 8.1 can't be represented exactly in binary, so +\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is +\code{'8.1'}. + +The \code{-X} command-line option, which turned all standard +exceptions into strings instead of classes, has been removed; the +standard exceptions will now always be classes. The +\module{exceptions} module containing the standard exceptions was +translated from Python to a built-in C module, written by Barry Warsaw +and Fredrik Lundh. + +% Commented out for now -- I don't think anyone will care. +%The pattern and match objects provided by SRE are C types, not Python +%class instances as in 1.5. This means you can no longer inherit from +%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much +%of a problem since no one should have been doing that in the first +%place. + % ====================================================================== \section{Extending/Embedding Changes} @@ -755,6 +722,89 @@ requires an ANSI C compiler, and can no longer be done using a compiler that only supports K\&R C. % ====================================================================== +\section{Distutils: Making Modules Easy to Install} + +Before Python 2.0, installing modules was a tedious affair -- there +was no way to figure out automatically where Python is installed, or +what compiler options to use for extension modules. Software authors +had to go through an arduous ritual of editing Makefiles and +configuration files, which only really work on Unix and leave Windows +and MacOS unsupported. 
Software users faced wildly differing +installation instructions. + +The SIG for distribution utilities, shepherded by Greg Ward, has +created the Distutils, a system to make package installation much +easier. They form the \module{distutils} package, a new part of +Python's standard library. In the best case, installing a Python +module from source will require the same steps: first you simply +unpack the tarball or zip archive, and then run ``\code{python setup.py +install}''. The platform will be automatically detected, the compiler +will be recognized, C extension modules will be compiled, and the +distribution installed into the proper directory. Optional +command-line arguments provide more control over the installation +process; the distutils package offers many places to override defaults +-- separating the build from the install, building or installing in +non-default directories, and more. + +In order to use the Distutils, you need to write a \file{setup.py} +script. For the simple case, when the software contains only .py +files, a minimal \file{setup.py} can be just a few lines long: + +\begin{verbatim} +from distutils.core import setup +setup (name = "foo", version = "1.0", + py_modules = ["module1", "module2"]) +\end{verbatim} + +The \file{setup.py} file isn't much more complicated if the software +consists of a few packages: + +\begin{verbatim} +from distutils.core import setup +setup (name = "foo", version = "1.0", + packages = ["package", "package.subpackage"]) +\end{verbatim} + +A C extension can be the most complicated case; here's an example taken from +the PyXML package: + + +\begin{verbatim} +from distutils.core import setup, Extension + +expat_extension = Extension('xml.parsers.pyexpat', + define_macros = [('XML_NS', None)], + include_dirs = [ 'extensions/expat/xmltok', + 'extensions/expat/xmlparse' ], + sources = [ 'extensions/pyexpat.c', + 'extensions/expat/xmltok/xmltok.c', + 'extensions/expat/xmltok/xmlrole.c', + ] + ) +setup (name = 
"PyXML", version = "0.5.4", + ext_modules =[ expat_extension ] ) + +\end{verbatim} + +The Distutils can also take care of creating source and binary +distributions. The ``sdist'' command, run by ``\code{python setup.py +sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}. +Adding new commands isn't difficult; ``bdist_rpm'' and +``bdist_wininst'' commands have already been contributed to create an +RPM distribution and a Windows installer for the software, +respectively. Commands to create other distribution formats such as +Debian packages and Solaris \file{.pkg} files are in various stages of +development. + +All this is documented in a new manual, \textit{Distributing Python +Modules}, that joins the basic set of Python documentation. + +% ====================================================================== +%\section{New XML Code} + +%XXX write this section... + +% ====================================================================== \section{Module changes} Lots of improvements and bugfixes were made to Python's extensive -- cgit v0.12