author | Georg Brandl <georg@python.org> | 2007-08-15 14:27:07 (GMT) |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2007-08-15 14:27:07 (GMT) |
commit | 739c01d47b9118d04e5722333f0e6b4d0c8bdd9e (patch) | |
tree | f82b450d291927fc1758b96d981aa0610947b529 /Doc/whatsnew | |
parent | 2d1649094402ef393ea2b128ba2c08c3937e6b93 (diff) | |
Delete the LaTeX doc tree.
Diffstat (limited to 'Doc/whatsnew')
-rw-r--r-- | Doc/whatsnew/whatsnew20.tex | 1337 |
-rw-r--r-- | Doc/whatsnew/whatsnew21.tex | 868 |
-rw-r--r-- | Doc/whatsnew/whatsnew22.tex | 1466 |
-rw-r--r-- | Doc/whatsnew/whatsnew23.tex | 2380 |
-rw-r--r-- | Doc/whatsnew/whatsnew24.tex | 1757 |
-rw-r--r-- | Doc/whatsnew/whatsnew25.tex | 2539 |
-rw-r--r-- | Doc/whatsnew/whatsnew26.tex | 268 |
-rw-r--r-- | Doc/whatsnew/whatsnew30.tex | 178 |
8 files changed, 0 insertions, 10793 deletions
diff --git a/Doc/whatsnew/whatsnew20.tex b/Doc/whatsnew/whatsnew20.tex deleted file mode 100644 index 360d7dc..0000000 --- a/Doc/whatsnew/whatsnew20.tex +++ /dev/null @@ -1,1337 +0,0 @@ -\documentclass{howto} - -% $Id$ - -\title{What's New in Python 2.0} -\release{1.02} -\author{A.M. Kuchling and Moshe Zadka} -\authoraddress{ - \strong{Python Software Foundation}\\ - Email: \email{amk@amk.ca}, \email{moshez@twistedmatrix.com} -} -\begin{document} -\maketitle\tableofcontents - -\section{Introduction} - -A new release of Python, version 2.0, was released on October 16, 2000. This -article covers the exciting new features in 2.0, highlights some other -useful changes, and points out a few incompatible changes that may require -rewriting code. - -Python's development never completely stops between releases, and a -steady flow of bug fixes and improvements are always being submitted. -A host of minor fixes, a few optimizations, additional docstrings, and -better error messages went into 2.0; to list them all would be -impossible, but they're certainly significant. Consult the -publicly-available CVS logs if you want to see the full list. This -progress is due to the five developers working for -PythonLabs are now getting paid to spend their days fixing bugs, -and also due to the improved communication resulting -from moving to SourceForge. - -% ====================================================================== -\section{What About Python 1.6?} - -Python 1.6 can be thought of as the Contractual Obligations Python -release. After the core development team left CNRI in May 2000, CNRI -requested that a 1.6 release be created, containing all the work on -Python that had been performed at CNRI. Python 1.6 therefore -represents the state of the CVS tree as of May 2000, with the most -significant new feature being Unicode support. Development continued -after May, of course, so the 1.6 tree received a few fixes to ensure -that it's forward-compatible with Python 2.0. 
1.6 is therefore part -of Python's evolution, and not a side branch. - -So, should you take much interest in Python 1.6? Probably not. The -1.6final and 2.0beta1 releases were made on the same day (September 5, -2000), the plan being to finalize Python 2.0 within a month or so. If -you have applications to maintain, there seems little point in -breaking things by moving to 1.6, fixing them, and then having another -round of breakage within a month by moving to 2.0; you're better off -just going straight to 2.0. Most of the really interesting features -described in this document are only in 2.0, because a lot of work was -done between May and September. - -% ====================================================================== -\section{New Development Process} - -The most important change in Python 2.0 may not be to the code at all, -but to how Python is developed: in May 2000 the Python developers -began using the tools made available by SourceForge for storing -source code, tracking bug reports, and managing the queue of patch -submissions. To report bugs or submit patches for Python 2.0, use the -bug tracking and patch manager tools available from Python's project -page, located at \url{http://sourceforge.net/projects/python/}. - -The most important of the services now hosted at SourceForge is the -Python CVS tree, the version-controlled repository containing the -source code for Python. Previously, there were roughly 7 or so people -who had write access to the CVS tree, and all patches had to be -inspected and checked in by one of the people on this short list. -Obviously, this wasn't very scalable. By moving the CVS tree to -SourceForge, it became possible to grant write access to more people; -as of September 2000 there were 27 people able to check in changes, a -fourfold increase. This makes possible large-scale changes that -wouldn't be attempted if they'd have to be filtered through the small -group of core developers. 
For example, one day Peter Schneider-Kamp -took it into his head to drop K\&R C compatibility and convert the C -source for Python to ANSI C. After getting approval on the python-dev -mailing list, he launched into a flurry of checkins that lasted about -a week, other developers joined in to help, and the job was done. If -there were only 5 people with write access, probably that task would -have been viewed as ``nice, but not worth the time and effort needed'' -and it would never have gotten done. - -The shift to using SourceForge's services has resulted in a remarkable -increase in the speed of development. Patches now get submitted, -commented on, revised by people other than the original submitter, and -bounced back and forth between people until the patch is deemed worth -checking in. Bugs are tracked in one central location and can be -assigned to a specific person for fixing, and we can count the number -of open bugs to measure progress. This didn't come without a cost: -developers now have more e-mail to deal with, more mailing lists to -follow, and special tools had to be written for the new environment. -For example, SourceForge sends default patch and bug notification -e-mail messages that are completely unhelpful, so Ka-Ping Yee wrote an -HTML screen-scraper that sends more useful messages. - -The ease of adding code caused a few initial growing pains, such as -code was checked in before it was ready or without getting clear -agreement from the developer group. The approval process that has -emerged is somewhat similar to that used by the Apache group. -Developers can vote +1, +0, -0, or -1 on a patch; +1 and -1 denote -acceptance or rejection, while +0 and -0 mean the developer is mostly -indifferent to the change, though with a slight positive or negative -slant. 
The most significant change from the Apache model is that the -voting is essentially advisory, letting Guido van Rossum, who has -Benevolent Dictator For Life status, know what the general opinion is. -He can still ignore the result of a vote, and approve or -reject a change even if the community disagrees with him. - -Producing an actual patch is the last step in adding a new feature, -and is usually easy compared to the earlier task of coming up with a -good design. Discussions of new features can often explode into -lengthy mailing list threads, making the discussion hard to follow, -and no one can read every posting to python-dev. Therefore, a -relatively formal process has been set up to write Python Enhancement -Proposals (PEPs), modelled on the Internet RFC process. PEPs are -draft documents that describe a proposed new feature, and are -continually revised until the community reaches a consensus, either -accepting or rejecting the proposal. Quoting from the introduction to -PEP 1, ``PEP Purpose and Guidelines'': - -\begin{quotation} - PEP stands for Python Enhancement Proposal. A PEP is a design - document providing information to the Python community, or - describing a new feature for Python. The PEP should provide a - concise technical specification of the feature and a rationale for - the feature. - - We intend PEPs to be the primary mechanisms for proposing new - features, for collecting community input on an issue, and for - documenting the design decisions that have gone into Python. The - PEP author is responsible for building consensus within the - community and documenting dissenting opinions. -\end{quotation} - -Read the rest of PEP 1 for the details of the PEP editorial process, -style, and format. PEPs are kept in the Python CVS tree on -SourceForge, though they're not part of the Python 2.0 distribution, -and are also available in HTML form from -\url{http://www.python.org/peps/}. 
As of September 2000, -there are 25 PEPS, ranging from PEP 201, ``Lockstep Iteration'', to -PEP 225, ``Elementwise/Objectwise Operators''. - -% ====================================================================== -\section{Unicode} - -The largest new feature in Python 2.0 is a new fundamental data type: -Unicode strings. Unicode uses 16-bit numbers to represent characters -instead of the 8-bit number used by ASCII, meaning that 65,536 -distinct characters can be supported. - -The final interface for Unicode support was arrived at through -countless often-stormy discussions on the python-dev mailing list, and -mostly implemented by Marc-Andr\'e Lemburg, based on a Unicode string -type implementation by Fredrik Lundh. A detailed explanation of the -interface was written up as \pep{100}, ``Python Unicode Integration''. -This article will simply cover the most significant points about the -Unicode interfaces. - -In Python source code, Unicode strings are written as -\code{u"string"}. Arbitrary Unicode characters can be written using a -new escape sequence, \code{\e u\var{HHHH}}, where \var{HHHH} is a -4-digit hexadecimal number from 0000 to FFFF. The existing -\code{\e x\var{HHHH}} escape sequence can also be used, and octal -escapes can be used for characters up to U+01FF, which is represented -by \code{\e 777}. - -Unicode strings, just like regular strings, are an immutable sequence -type. They can be indexed and sliced, but not modified in place. -Unicode strings have an \method{encode( \optional{encoding} )} method -that returns an 8-bit string in the desired encoding. Encodings are -named by strings, such as \code{'ascii'}, \code{'utf-8'}, -\code{'iso-8859-1'}, or whatever. A codec API is defined for -implementing and registering new encodings that are then available -throughout a Python program. 
If an encoding isn't specified, the -default encoding is usually 7-bit ASCII, though it can be changed for -your Python installation by calling the -\function{sys.setdefaultencoding(\var{encoding})} function in a -customised version of \file{site.py}. - -Combining 8-bit and Unicode strings always coerces to Unicode, using -the default ASCII encoding; the result of \code{'a' + u'bc'} is -\code{u'abc'}. - -New built-in functions have been added, and existing built-ins -modified to support Unicode: - -\begin{itemize} -\item \code{unichr(\var{ch})} returns a Unicode string 1 character -long, containing the character \var{ch}. - -\item \code{ord(\var{u})}, where \var{u} is a 1-character regular or Unicode string, returns the number of the character as an integer. - -\item \code{unicode(\var{string} \optional{, \var{encoding}} -\optional{, \var{errors}} ) } creates a Unicode string from an 8-bit -string. \code{encoding} is a string naming the encoding to use. -The \code{errors} parameter specifies the treatment of characters that -are invalid for the current encoding; passing \code{'strict'} as the -value causes an exception to be raised on any encoding error, while -\code{'ignore'} causes errors to be silently ignored and -\code{'replace'} uses U+FFFD, the official replacement character, in -case of any problems. - -\item The \keyword{exec} statement, and various built-ins such as -\code{eval()}, \code{getattr()}, and \code{setattr()} will also -accept Unicode strings as well as regular strings. (It's possible -that the process of fixing this missed some built-ins; if you find a -built-in function that accepts strings but doesn't accept Unicode -strings at all, please report it as a bug.) - -\end{itemize} - -A new module, \module{unicodedata}, provides an interface to Unicode -character properties. For example, \code{unicodedata.category(u'A')} -returns the 2-character string 'Lu', the 'L' denoting it's a letter, -and 'u' meaning that it's uppercase. 
-\code{unicodedata.bidirectional(u'\e u0660')} returns 'AN', meaning that U+0660 is -an Arabic number. - -The \module{codecs} module contains functions to look up existing encodings -and register new ones. Unless you want to implement a -new encoding, you'll most often use the -\function{codecs.lookup(\var{encoding})} function, which returns a -4-element tuple: \code{(\var{encode_func}, -\var{decode_func}, \var{stream_reader}, \var{stream_writer})}. - -\begin{itemize} -\item \var{encode_func} is a function that takes a Unicode string, and -returns a 2-tuple \code{(\var{string}, \var{length})}. \var{string} -is an 8-bit string containing a portion (perhaps all) of the Unicode -string converted into the given encoding, and \var{length} tells you -how much of the Unicode string was converted. - -\item \var{decode_func} is the opposite of \var{encode_func}, taking -an 8-bit string and returning a 2-tuple \code{(\var{ustring}, -\var{length})}, consisting of the resulting Unicode string -\var{ustring} and the integer \var{length} telling how much of the -8-bit string was consumed. - -\item \var{stream_reader} is a class that supports decoding input from -a stream. \var{stream_reader(\var{file_obj})} returns an object that -supports the \method{read()}, \method{readline()}, and -\method{readlines()} methods. These methods will all translate from -the given encoding and return Unicode strings. - -\item \var{stream_writer}, similarly, is a class that supports -encoding output to a stream. \var{stream_writer(\var{file_obj})} -returns an object that supports the \method{write()} and -\method{writelines()} methods. These methods expect Unicode strings, -translating them to the given encoding on output. -\end{itemize} - -For example, the following code writes a Unicode string into a file, -encoding it as UTF-8: - -\begin{verbatim} -import codecs - -unistr = u'\u0660\u2000ab ...' 
- -(UTF8_encode, UTF8_decode, - UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8') - -output = UTF8_streamwriter( open( '/tmp/output', 'wb') ) -output.write( unistr ) -output.close() -\end{verbatim} - -The following code would then read UTF-8 input from the file: - -\begin{verbatim} -input = UTF8_streamreader( open( '/tmp/output', 'rb') ) -print repr(input.read()) -input.close() -\end{verbatim} - -Unicode-aware regular expressions are available through the -\module{re} module, which has a new underlying implementation called -SRE written by Fredrik Lundh of Secret Labs AB. - -A \code{-U} command line option was added which causes the Python -compiler to interpret all string literals as Unicode string literals. -This is intended to be used in testing and future-proofing your Python -code, since some future version of Python may drop support for 8-bit -strings and provide only Unicode strings. - -% ====================================================================== -\section{List Comprehensions} - -Lists are a workhorse data type in Python, and many programs -manipulate a list at some point. Two common operations on lists are -to loop over them, and either pick out the elements that meet a -certain criterion, or apply some function to each element. For -example, given a list of strings, you might want to pull out all the -strings containing a given substring, or strip off trailing whitespace -from each line. - -The existing \function{map()} and \function{filter()} functions can be -used for this purpose, but they require a function as one of their -arguments. This is fine if there's an existing built-in function that -can be passed directly, but if there isn't, you have to create a -little function to do the required work, and Python's scoping rules -make the result ugly if the little function needs additional -information. Take the first example in the previous paragraph, -finding all the strings in the list containing a given substring. 
You -could write the following to do it: - -\begin{verbatim} -# Given the list L, make a list of all strings -# containing the substring S. -sublist = filter( lambda s, substring=S: - string.find(s, substring) != -1, - L) -\end{verbatim} - -Because of Python's scoping rules, a default argument is used so that -the anonymous function created by the \keyword{lambda} statement knows -what substring is being searched for. List comprehensions make this -cleaner: - -\begin{verbatim} -sublist = [ s for s in L if string.find(s, S) != -1 ] -\end{verbatim} - -List comprehensions have the form: - -\begin{verbatim} -[ expression for expr in sequence1 - for expr2 in sequence2 ... - for exprN in sequenceN - if condition ] -\end{verbatim} - -The \keyword{for}...\keyword{in} clauses contain the sequences to be -iterated over. The sequences do not have to be the same length, -because they are \emph{not} iterated over in parallel, but -from left to right; this is explained more clearly in the following -paragraphs. The elements of the generated list will be the successive -values of \var{expression}. The final \keyword{if} clause is -optional; if present, \var{expression} is only evaluated and added to -the result if \var{condition} is true. - -To make the semantics very clear, a list comprehension is equivalent -to the following Python code: - -\begin{verbatim} -for expr1 in sequence1: - for expr2 in sequence2: - ... - for exprN in sequenceN: - if (condition): - # Append the value of - # the expression to the - # resulting list. -\end{verbatim} - -This means that when there are multiple \keyword{for}...\keyword{in} clauses, -the resulting list will be equal to the product of the lengths of all -the sequences. 
If you have two lists of length 3, the output list is -9 elements long: - -\begin{verbatim} -seq1 = 'abc' -seq2 = (1,2,3) ->>> [ (x,y) for x in seq1 for y in seq2] -[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1), -('c', 2), ('c', 3)] -\end{verbatim} - -To avoid introducing an ambiguity into Python's grammar, if -\var{expression} is creating a tuple, it must be surrounded with -parentheses. The first list comprehension below is a syntax error, -while the second one is correct: - -\begin{verbatim} -# Syntax error -[ x,y for x in seq1 for y in seq2] -# Correct -[ (x,y) for x in seq1 for y in seq2] -\end{verbatim} - -The idea of list comprehensions originally comes from the functional -programming language Haskell (\url{http://www.haskell.org}). Greg -Ewing argued most effectively for adding them to Python and wrote the -initial list comprehension patch, which was then discussed for a -seemingly endless time on the python-dev mailing list and kept -up-to-date by Skip Montanaro. - -% ====================================================================== -\section{Augmented Assignment} - -Augmented assignment operators, another long-requested feature, have -been added to Python 2.0. Augmented assignment operators include -\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the -statement \code{a += 2} increments the value of the variable -\code{a} by 2, equivalent to the slightly lengthier \code{a = a + 2}. - -% The empty groups below prevent conversion to guillemets. -The full list of supported assignment operators is \code{+=}, -\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=}, -\code{|=}, \verb|^=|, \code{>>=}, and \code{<<=}. Python classes can -override the augmented assignment operators by defining methods named -\method{__iadd__}, \method{__isub__}, etc. For example, the following -\class{Number} class stores a number and supports using += to create a -new instance with an incremented value. 
- -\begin{verbatim} -class Number: - def __init__(self, value): - self.value = value - def __iadd__(self, increment): - return Number( self.value + increment) - -n = Number(5) -n += 3 -print n.value -\end{verbatim} - -The \method{__iadd__} special method is called with the value of the -increment, and should return a new instance with an appropriately -modified value; this return value is bound as the new value of the -variable on the left-hand side. - -Augmented assignment operators were first introduced in the C -programming language, and most C-derived languages, such as -\program{awk}, \Cpp, Java, Perl, and PHP also support them. The augmented -assignment patch was implemented by Thomas Wouters. - -% ====================================================================== -\section{String Methods} - -Until now string-manipulation functionality was in the \module{string} -module, which was usually a front-end for the \module{strop} -module written in C. The addition of Unicode posed a difficulty for -the \module{strop} module, because the functions would all need to be -rewritten in order to accept either 8-bit or Unicode strings. For -functions such as \function{string.replace()}, which takes 3 string -arguments, that means eight possible permutations, and correspondingly -complicated code. - -Instead, Python 2.0 pushes the problem onto the string type, making -string manipulation functionality available through methods on both -8-bit strings and Unicode strings. - -\begin{verbatim} ->>> 'andrew'.capitalize() -'Andrew' ->>> 'hostname'.replace('os', 'linux') -'hlinuxtname' ->>> 'moshe'.find('sh') -2 -\end{verbatim} - -One thing that hasn't changed, a noteworthy April Fools' joke -notwithstanding, is that Python strings are immutable. Thus, the -string methods return new strings, and do not modify the string on -which they operate. 
- -The old \module{string} module is still around for backwards -compatibility, but it mostly acts as a front-end to the new string -methods. - -Two methods which have no parallel in pre-2.0 versions, although they -did exist in JPython for quite some time, are \method{startswith()} -and \method{endswith}. \code{s.startswith(t)} is equivalent to \code{s[:len(t)] -== t}, while \code{s.endswith(t)} is equivalent to \code{s[-len(t):] == t}. - -One other method which deserves special mention is \method{join}. The -\method{join} method of a string receives one parameter, a sequence of -strings, and is equivalent to the \function{string.join} function from -the old \module{string} module, with the arguments reversed. In other -words, \code{s.join(seq)} is equivalent to the old -\code{string.join(seq, s)}. - -% ====================================================================== -\section{Garbage Collection of Cycles} - -The C implementation of Python uses reference counting to implement -garbage collection. Every Python object maintains a count of the -number of references pointing to itself, and adjusts the count as -references are created or destroyed. Once the reference count reaches -zero, the object is no longer accessible, since you need to have a -reference to an object to access it, and if the count is zero, no -references exist any longer. - -Reference counting has some pleasant properties: it's easy to -understand and implement, and the resulting implementation is -portable, fairly fast, and reacts well with other libraries that -implement their own memory handling schemes. The major problem with -reference counting is that it sometimes doesn't realise that objects -are no longer accessible, resulting in a memory leak. This happens -when there are cycles of references. 
- -Consider the simplest possible cycle, -a class instance which has a reference to itself: - -\begin{verbatim} -instance = SomeClass() -instance.myself = instance -\end{verbatim} - -After the above two lines of code have been executed, the reference -count of \code{instance} is 2; one reference is from the variable -named \samp{'instance'}, and the other is from the \samp{myself} -attribute of the instance. - -If the next line of code is \code{del instance}, what happens? The -reference count of \code{instance} is decreased by 1, so it has a -reference count of 1; the reference in the \samp{myself} attribute -still exists. Yet the instance is no longer accessible through Python -code, and it could be deleted. Several objects can participate in a -cycle if they have references to each other, causing all of the -objects to be leaked. - -Python 2.0 fixes this problem by periodically executing a cycle -detection algorithm which looks for inaccessible cycles and deletes -the objects involved. A new \module{gc} module provides functions to -perform a garbage collection, obtain debugging statistics, and tuning -the collector's parameters. - -Running the cycle detection algorithm takes some time, and therefore -will result in some additional overhead. It is hoped that after we've -gotten experience with the cycle collection from using 2.0, Python 2.1 -will be able to minimize the overhead with careful tuning. It's not -yet obvious how much performance is lost, because benchmarking this is -tricky and depends crucially on how often the program creates and -destroys objects. The detection of cycles can be disabled when Python -is compiled, if you can't afford even a tiny speed penalty or suspect -that the cycle collection is buggy, by specifying the -\longprogramopt{without-cycle-gc} switch when running the -\program{configure} script. - -Several people tackled this problem and contributed to a solution. 
An -early implementation of the cycle detection approach was written by -Toby Kelsey. The current algorithm was suggested by Eric Tiedemann -during a visit to CNRI, and Guido van Rossum and Neil Schemenauer -wrote two different implementations, which were later integrated by -Neil. Lots of other people offered suggestions along the way; the -March 2000 archives of the python-dev mailing list contain most of the -relevant discussion, especially in the threads titled ``Reference -cycle collection for Python'' and ``Finalization again''. - -% ====================================================================== -\section{Other Core Changes} - -Various minor changes have been made to Python's syntax and built-in -functions. None of the changes are very far-reaching, but they're -handy conveniences. - -\subsection{Minor Language Changes} - -A new syntax makes it more convenient to call a given function -with a tuple of arguments and/or a dictionary of keyword arguments. -In Python 1.5 and earlier, you'd use the \function{apply()} -built-in function: \code{apply(f, \var{args}, \var{kw})} calls the -function \function{f()} with the argument tuple \var{args} and the -keyword arguments in the dictionary \var{kw}. \function{apply()} -is the same in 2.0, but thanks to a patch from -Greg Ewing, \code{f(*\var{args}, **\var{kw})} as a shorter -and clearer way to achieve the same effect. This syntax is -symmetrical with the syntax for defining functions: - -\begin{verbatim} -def f(*args, **kw): - # args is a tuple of positional args, - # kw is a dictionary of keyword args - ... -\end{verbatim} - -The \keyword{print} statement can now have its output directed to a -file-like object by following the \keyword{print} with -\verb|>> file|, similar to the redirection operator in \UNIX{} shells. 
-Previously you'd either have to use the \method{write()} method of the -file-like object, which lacks the convenience and simplicity of -\keyword{print}, or you could assign a new value to -\code{sys.stdout} and then restore the old value. For sending output to standard error, -it's much easier to write this: - -\begin{verbatim} -print >> sys.stderr, "Warning: action field not supplied" -\end{verbatim} - -Modules can now be renamed on importing them, using the syntax -\code{import \var{module} as \var{name}} or \code{from \var{module} -import \var{name} as \var{othername}}. The patch was submitted by -Thomas Wouters. - -A new format style is available when using the \code{\%} operator; -'\%r' will insert the \function{repr()} of its argument. This was -also added from symmetry considerations, this time for symmetry with -the existing '\%s' format style, which inserts the \function{str()} of -its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a -string containing \verb|'abc' abc|. - -Previously there was no way to implement a class that overrode -Python's built-in \keyword{in} operator and implemented a custom -version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is -present in the sequence \var{seq}; Python computes this by simply -trying every index of the sequence until either \var{obj} is found or -an \exception{IndexError} is encountered. Moshe Zadka contributed a -patch which adds a \method{__contains__} magic method for providing a -custom implementation for \keyword{in}. Additionally, new built-in -objects written in C can define what \keyword{in} means for them via a -new slot in the sequence protocol. - -Earlier versions of Python used a recursive algorithm for deleting -objects. Deeply nested data structures could cause the interpreter to -fill up the C stack and crash; Christian Tismer rewrote the deletion -logic to fix this problem. 
On a related note, comparing recursive -objects recursed infinitely and crashed; Jeremy Hylton rewrote the -code to no longer crash, producing a useful result instead. For -example, after this code: - -\begin{verbatim} -a = [] -b = [] -a.append(a) -b.append(b) -\end{verbatim} - -The comparison \code{a==b} returns true, because the two recursive -data structures are isomorphic. See the thread ``trashcan -and PR\#7'' in the April 2000 archives of the python-dev mailing list -for the discussion leading up to this implementation, and some useful -relevant links. -% Starting URL: -% http://www.python.org/pipermail/python-dev/2000-April/004834.html - -Note that comparisons can now also raise exceptions. In earlier -versions of Python, a comparison operation such as \code{cmp(a,b)} -would always produce an answer, even if a user-defined -\method{__cmp__} method encountered an error, since the resulting -exception would simply be silently swallowed. - -Work has been done on porting Python to 64-bit Windows on the Itanium -processor, mostly by Trent Mick of ActiveState. (Confusingly, -\code{sys.platform} is still \code{'win32'} on Win64 because it seems -that for ease of porting, MS Visual \Cpp{} treats code as 32 bit on Itanium.) -PythonWin also supports Windows CE; see the Python CE page at -\url{http://starship.python.net/crew/mhammond/ce/} for more -information. - -Another new platform is Darwin/MacOS X; initial support for it is in -Python 2.0. Dynamic loading works, if you specify ``configure ---with-dyld --with-suffix=.x''. Consult the README in the Python -source distribution for more instructions. - -An attempt has been made to alleviate one of Python's warts, the -often-confusing \exception{NameError} exception when code refers to a -local variable before the variable has been assigned a value. 
For -example, the following code raises an exception on the \keyword{print} -statement in both 1.5.2 and 2.0; in 1.5.2 a \exception{NameError} -exception is raised, while 2.0 raises a new -\exception{UnboundLocalError} exception. -\exception{UnboundLocalError} is a subclass of \exception{NameError}, -so any existing code that expects \exception{NameError} to be raised -should still work. - -\begin{verbatim} -def f(): - print "i=",i - i = i + 1 -f() -\end{verbatim} - -Two new exceptions, \exception{TabError} and -\exception{IndentationError}, have been introduced. They're both -subclasses of \exception{SyntaxError}, and are raised when Python code -is found to be improperly indented. - -\subsection{Changes to Built-in Functions} - -A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been -added. \function{zip()} returns a list of tuples where each tuple -contains the i-th element from each of the argument sequences. The -difference between \function{zip()} and \code{map(None, \var{seq1}, -\var{seq2})} is that \function{map()} pads the sequences with -\code{None} if the sequences aren't all of the same length, while -\function{zip()} truncates the returned list to the length of the -shortest argument sequence. - -The \function{int()} and \function{long()} functions now accept an -optional ``base'' parameter when the first argument is a string. -\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns -291. \code{int(123, 16)} raises a \exception{TypeError} exception -with the message ``can't convert non-string with explicit base''. - -A new variable holding more detailed version information has been -added to the \module{sys} module. \code{sys.version_info} is a tuple -\code{(\var{major}, \var{minor}, \var{micro}, \var{level}, -\var{serial})}. For example, in a hypothetical 2.0.1beta1, -\code{sys.version_info} would be \code{(2, 0, 1, 'beta', 1)}. 
-\var{level} is a string such as \code{"alpha"}, \code{"beta"}, or -\code{"final"} for a final release. - -Dictionaries have an odd new method, \method{setdefault(\var{key}, -\var{default})}, which behaves similarly to the existing -\method{get()} method. However, if the key is missing, -\method{setdefault()} both returns the value of \var{default} as -\method{get()} would do, and also inserts it into the dictionary as -the value for \var{key}. Thus, the following lines of code: - -\begin{verbatim} -if dict.has_key( key ): return dict[key] -else: - dict[key] = [] - return dict[key] -\end{verbatim} - -can be reduced to a single \code{return dict.setdefault(key, [])} statement. - -The interpreter sets a maximum recursion depth in order to catch -runaway recursion before filling the C stack and causing a core dump -or GPF. Previously this limit was fixed when you compiled Python, -but in 2.0 the maximum recursion depth can be read and modified using -\function{sys.getrecursionlimit} and \function{sys.setrecursionlimit}. -The default value is 1000, and a rough maximum value for a given -platform can be found by running a new script, -\file{Misc/find_recursionlimit.py}. - -% ====================================================================== -\section{Porting to 2.0} - -New Python releases try hard to be compatible with previous releases, -and the record has been pretty good. However, some changes are -considered useful enough, usually because they fix initial design decisions that -turned out to be actively mistaken, that breaking backward compatibility -can't always be avoided. This section lists the changes in Python 2.0 -that may cause old Python code to break. - -The change which will probably break the most code is tightening up -the arguments accepted by some methods. Some methods would take -multiple arguments and treat them as a tuple, particularly various -list methods such as \method{.append()} and \method{.insert()}. 
-In earlier versions of Python, if \code{L} is a list, \code{L.append( -1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this -causes a \exception{TypeError} exception to be raised, with the -message: 'append requires exactly 1 argument; 2 given'. The fix is to -simply add an extra set of parentheses to pass both values as a tuple: -\code{L.append( (1,2) )}. - -The earlier versions of these methods were more forgiving because they -used an old function in Python's C interface to parse their arguments; -2.0 modernizes them to use \function{PyArg_ParseTuple}, the current -argument parsing function, which provides more helpful error messages -and treats multi-argument calls as errors. If you absolutely must use -2.0 but can't fix your code, you can edit \file{Objects/listobject.c} -and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to -preserve the old behaviour; this isn't recommended. - -Some of the functions in the \module{socket} module are still -forgiving in this way. For example, \function{socket.connect( -('hostname', 25) )} is the correct form, passing a tuple representing -an IP address, but \function{socket.connect( 'hostname', 25 )} also -works. \function{socket.connect_ex()} and \function{socket.bind()} are -similarly easy-going. 2.0alpha1 tightened these functions up, but -because the documentation actually used the erroneous multiple -argument form, many people wrote code which would break with the -stricter checking. GvR backed out the changes in the face of public -reaction, so for the \module{socket} module, the documentation was -fixed and the multiple argument form is simply marked as deprecated; -it \emph{will} be tightened up again in a future Python version. - -The \code{\e x} escape in string literals now takes exactly 2 hex -digits. Previously it would consume all the hex digits following the -'x' and take the lowest 8 bits of the result, so \code{\e x123456} was -equivalent to \code{\e x56}. 
- -The \exception{AttributeError} and \exception{NameError} exceptions -have a more friendly error message, whose text will be something like -\code{'Spam' instance has no attribute 'eggs'} or \code{name 'eggs' is -not defined}. Previously the error message was just the missing -attribute name \code{eggs}, and code written to take advantage of this -fact will break in 2.0. - -Some work has been done to make integers and long integers a bit more -interchangeable. In 1.5.2, large-file support was added for Solaris, -to allow reading files larger than 2~GiB; this made the \method{tell()} -method of file objects return a long integer instead of a regular -integer. Some code would subtract two file offsets and attempt to use -the result to multiply a sequence or slice a string, but this raised a -\exception{TypeError}. In 2.0, long integers can be used to multiply -or slice a sequence, and it'll behave as you'd intuitively expect it -to; \code{3L * 'abc'} produces 'abcabcabc', and \code{ -(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in -various contexts where previously only integers were accepted, such -as in the \method{seek()} method of file objects, and in the formats -supported by the \verb|%| operator (\verb|%d|, \verb|%i|, \verb|%x|, -etc.). For example, \code{"\%d" \% 2L**64} will produce the string -\samp{18446744073709551616}. - -The subtlest long integer change of all is that the \function{str()} -of a long integer no longer has a trailing 'L' character, though -\function{repr()} still includes it. The 'L' annoyed many people who -wanted to print long integers that looked just like regular integers, -since they had to go out of their way to chop off the character. This -is no longer a problem in 2.0, but code which does \code{str(longval)[:-1]} and assumes the 'L' is there, will now lose -the final digit. - -Taking the \function{repr()} of a float now uses a different -formatting precision than \function{str()}. 
\function{repr()} uses the -\code{\%.17g} format string for C's \function{sprintf()}, while -\function{str()} uses \code{\%.12g} as before. The effect is that -\function{repr()} may occasionally show more decimal places than -\function{str()}, for certain numbers. -For example, the number 8.1 can't be represented exactly in binary, so -\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is -\code{'8.1'}. - -The \code{-X} command-line option, which turned all standard -exceptions into strings instead of classes, has been removed; the -standard exceptions will now always be classes. The -\module{exceptions} module containing the standard exceptions was -translated from Python to a built-in C module, written by Barry Warsaw -and Fredrik Lundh. - -% Commented out for now -- I don't think anyone will care. -%The pattern and match objects provided by SRE are C types, not Python -%class instances as in 1.5. This means you can no longer inherit from -%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much -%of a problem since no one should have been doing that in the first -%place. - -% ====================================================================== -\section{Extending/Embedding Changes} - -Some of the changes are under the covers, and will only be apparent to -people writing C extension modules or embedding a Python interpreter -in a larger application. If you aren't dealing with Python's C API, -you can safely skip this section. - -The version number of the Python C API was incremented, so C -extensions compiled for 1.5.2 must be recompiled in order to work with -2.0. On Windows, it's not possible for Python 2.0 to import a third -party extension built for Python 1.5.x due to how Windows DLLs work, -so Python will raise an exception and the import will fail. 
- -Users of Jim Fulton's ExtensionClass module will be pleased to find -out that hooks have been added so that ExtensionClasses are now -supported by \function{isinstance()} and \function{issubclass()}. -This means you no longer have to remember to write code such as -\code{if type(obj) == myExtensionClass}, but can use the more natural -\code{if isinstance(obj, myExtensionClass)}. - -The \file{Python/importdl.c} file, which was a mass of \#ifdefs to -support dynamic loading on many different platforms, was cleaned up -and reorganised by Greg Stein. \file{importdl.c} is now quite small, -and platform-specific code has been moved into a bunch of -\file{Python/dynload_*.c} files. Another cleanup: there were also a -number of \file{my*.h} files in the Include/ directory that held -various portability hacks; they've been merged into a single file, -\file{Include/pyport.h}. - -Vladimir Marangozov's long-awaited malloc restructuring was completed, -to make it easy to have the Python interpreter use a custom allocator -instead of C's standard \function{malloc()}. For documentation, read -the comments in \file{Include/pymem.h} and -\file{Include/objimpl.h}. For the lengthy discussions during which -the interface was hammered out, see the Web archives of the 'patches' -and 'python-dev' lists at python.org. - -Recent versions of the GUSI development environment for MacOS support -POSIX threads. Therefore, Python's POSIX threading support now works -on the Macintosh. Threading support using the user-space GNU \texttt{pth} -library was also contributed. - -Threading support on Windows was enhanced, too. Windows supports -thread locks that use kernel objects only in case of contention; in -the common case when there's no contention, they use simpler functions -which are an order of magnitude faster. A threaded version of Python -1.5.2 on NT is twice as slow as an unthreaded version; with the 2.0 -changes, the difference is only 10\%. 
These improvements were -contributed by Yakov Markovitch. - -Python 2.0's source now uses only ANSI C prototypes, so compiling Python now -requires an ANSI C compiler, and can no longer be done using a compiler that -only supports K\&R C. - -Previously the Python virtual machine used 16-bit numbers in its -bytecode, limiting the size of source files. In particular, this -affected the maximum size of literal lists and dictionaries in Python -source; occasionally people who are generating Python code would run -into this limit. A patch by Charles G. Waldman raises the limit from -\verb|2^16| to \verb|2^{32}|. - -Three new convenience functions intended for adding constants to a -module's dictionary at module initialization time were added: -\function{PyModule_AddObject()}, \function{PyModule_AddIntConstant()}, -and \function{PyModule_AddStringConstant()}. Each of these functions -takes a module object, a null-terminated C string containing the name -to be added, and a third argument for the value to be assigned to the -name. This third argument is, respectively, a Python object, a C -long, or a C string. - -A wrapper API was added for \UNIX-style signal handlers. -\function{PyOS_getsig()} gets a signal handler and -\function{PyOS_setsig()} will set a new handler. - -% ====================================================================== -\section{Distutils: Making Modules Easy to Install} - -Before Python 2.0, installing modules was a tedious affair -- there -was no way to figure out automatically where Python is installed, or -what compiler options to use for extension modules. Software authors -had to go through an arduous ritual of editing Makefiles and -configuration files, which only really work on \UNIX{} and leave Windows -and MacOS unsupported. Python users faced wildly differing -installation instructions which varied between different extension -packages, which made administering a Python installation something of -a chore. 
- -The SIG for distribution utilities, shepherded by Greg Ward, has -created the Distutils, a system to make package installation much -easier. They form the \module{distutils} package, a new part of -Python's standard library. In the best case, installing a Python -module from source will require the same steps: first you simply -unpack the tarball or zip archive, and then run ``\code{python setup.py -install}''. The platform will be automatically detected, the compiler -will be recognized, C extension modules will be compiled, and the -distribution installed into the proper directory. Optional -command-line arguments provide more control over the installation -process; the distutils package offers many places to override defaults --- separating the build from the install, building or installing in -non-default directories, and more. - -In order to use the Distutils, you need to write a \file{setup.py} -script. For the simple case, when the software contains only .py -files, a minimal \file{setup.py} can be just a few lines long: - -\begin{verbatim} -from distutils.core import setup -setup (name = "foo", version = "1.0", - py_modules = ["module1", "module2"]) -\end{verbatim} - -The \file{setup.py} file isn't much more complicated if the software -consists of a few packages: - -\begin{verbatim} -from distutils.core import setup -setup (name = "foo", version = "1.0", - packages = ["package", "package.subpackage"]) -\end{verbatim} - -A C extension can be the most complicated case; here's an example taken from -the PyXML package: - - -\begin{verbatim} -from distutils.core import setup, Extension - -expat_extension = Extension('xml.parsers.pyexpat', - define_macros = [('XML_NS', None)], - include_dirs = [ 'extensions/expat/xmltok', - 'extensions/expat/xmlparse' ], - sources = [ 'extensions/pyexpat.c', - 'extensions/expat/xmltok/xmltok.c', - 'extensions/expat/xmltok/xmlrole.c', - ] - ) -setup (name = "PyXML", version = "0.5.4", - ext_modules =[ expat_extension ] ) 
-\end{verbatim} - -The Distutils can also take care of creating source and binary -distributions. The ``sdist'' command, run by ``\code{python setup.py -sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}. -Adding new commands isn't difficult; ``bdist_rpm'' and -``bdist_wininst'' commands have already been contributed to create an -RPM distribution and a Windows installer for the software, -respectively. Commands to create other distribution formats such as -Debian packages and Solaris \file{.pkg} files are in various stages of -development. - -All this is documented in a new manual, \textit{Distributing Python -Modules}, that joins the basic set of Python documentation. - -% ====================================================================== -\section{XML Modules} - -Python 1.5.2 included a simple XML parser in the form of the -\module{xmllib} module, contributed by Sjoerd Mullender. Since -1.5.2's release, two different interfaces for processing XML have -become common: SAX2 (version 2 of the Simple API for XML) provides an -event-driven interface with some similarities to \module{xmllib}, and -the DOM (Document Object Model) provides a tree-based interface, -transforming an XML document into a tree of nodes that can be -traversed and modified. Python 2.0 includes a SAX2 interface and a -stripped-down DOM interface as part of the \module{xml} package. -Here we will give a brief overview of these new interfaces; consult -the Python documentation or the source code for complete details. -The Python XML SIG is also working on improved documentation. - -\subsection{SAX2 Support} - -SAX defines an event-driven interface for parsing XML. To use SAX, -you must write a SAX handler class. Handler classes inherit from -various classes provided by SAX, and override various methods that -will then be called by the XML parser. 
For example, the -\method{startElement} and \method{endElement} methods are called for -every starting and end tag encountered by the parser, the -\method{characters()} method is called for every chunk of character -data, and so forth. - -The advantage of the event-driven approach is that the whole -document doesn't have to be resident in memory at any one time, which -matters if you are processing really huge documents. However, writing -the SAX handler class can get very complicated if you're trying to -modify the document structure in some elaborate way. - -For example, this little example program defines a handler that prints -a message for every starting and ending tag, and then parses the file -\file{hamlet.xml} using it: - -\begin{verbatim} -from xml import sax - -class SimpleHandler(sax.ContentHandler): - def startElement(self, name, attrs): - print 'Start of element:', name, attrs.keys() - - def endElement(self, name): - print 'End of element:', name - -# Create a parser object -parser = sax.make_parser() - -# Tell it what handler to use -handler = SimpleHandler() -parser.setContentHandler( handler ) - -# Parse a file! -parser.parse( 'hamlet.xml' ) -\end{verbatim} - -For more information, consult the Python documentation, or the XML -HOWTO at \url{http://pyxml.sourceforge.net/topics/howto/xml-howto.html}. - -\subsection{DOM Support} - -The Document Object Model is a tree-based representation for an XML -document. A top-level \class{Document} instance is the root of the -tree, and has a single child which is the top-level \class{Element} -instance. This \class{Element} has children nodes representing -character data and any sub-elements, which may have further children -of their own, and so forth. Using the DOM you can traverse the -resulting tree any way you like, access element and attribute values, -insert and delete nodes, and convert the tree back into XML. 
- -The DOM is useful for modifying XML documents, because you can create -a DOM tree, modify it by adding new nodes or rearranging subtrees, and -then produce a new XML document as output. You can also construct a -DOM tree manually and convert it to XML, which can be a more flexible -way of producing XML output than simply writing -\code{<tag1>}...\code{</tag1>} to a file. - -The DOM implementation included with Python lives in the -\module{xml.dom.minidom} module. It's a lightweight implementation of -the Level 1 DOM with support for XML namespaces. The -\function{parse()} and \function{parseString()} convenience -functions are provided for generating a DOM tree: - -\begin{verbatim} -from xml.dom import minidom -doc = minidom.parse('hamlet.xml') -\end{verbatim} - -\code{doc} is a \class{Document} instance. \class{Document}, like all -the other DOM classes such as \class{Element} and \class{Text}, is a -subclass of the \class{Node} base class. All the nodes in a DOM tree -therefore support certain common methods, such as \method{toxml()} -which returns a string containing the XML representation of the node -and its children. Each class also has special methods of its own; for -example, \class{Element} and \class{Document} instances have a method -to find all child elements with a given tag name. Continuing from the -previous 2-line example: - -\begin{verbatim} -perslist = doc.getElementsByTagName( 'PERSONA' ) -print perslist[0].toxml() -print perslist[1].toxml() -\end{verbatim} - -For the \textit{Hamlet} XML file, the above few lines output: - -\begin{verbatim} -<PERSONA>CLAUDIUS, king of Denmark. 
</PERSONA> -<PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA> -\end{verbatim} - -The root element of the document is available as -\code{doc.documentElement}, and its children can be easily modified -by deleting, adding, or moving nodes: - -\begin{verbatim} -root = doc.documentElement - -# Remove the first child -root.removeChild( root.childNodes[0] ) - -# Move the new first child to the end -root.appendChild( root.childNodes[0] ) - -# Insert the new first child (originally, -# the third child) before the 20th child. -root.insertBefore( root.childNodes[0], root.childNodes[20] ) -\end{verbatim} - -Again, I will refer you to the Python documentation for a complete -listing of the different \class{Node} classes and their various methods. - -\subsection{Relationship to PyXML} - -The XML Special Interest Group has been working on XML-related Python -code for a while. Its code distribution, called PyXML, is available -from the SIG's Web pages at \url{http://www.python.org/sigs/xml-sig/}. -The PyXML distribution also used the package name \samp{xml}. If -you've written programs that used PyXML, you're probably wondering -about its compatibility with the 2.0 \module{xml} package. - -The answer is that Python 2.0's \module{xml} package isn't compatible -with PyXML, but can be made compatible by installing a recent version of -PyXML. Many applications can get by with the XML support that is -included with Python 2.0, but more complicated applications will -require that the full PyXML package be installed. When -installed, PyXML versions 0.6.0 or greater will replace the -\module{xml} package shipped with Python, and will be a strict -superset of the standard package, adding a bunch of additional -features. Some of the additional features in PyXML include: - -\begin{itemize} -\item 4DOM, a full DOM implementation -from FourThought, Inc. -\item The xmlproc validating parser, written by Lars Marius Garshol. 
-\item The \module{sgmlop} parser accelerator module, written by Fredrik Lundh. -\end{itemize} - -% ====================================================================== -\section{Module changes} - -Lots of improvements and bugfixes were made to Python's extensive -standard library; some of the affected modules include -\module{readline}, \module{ConfigParser}, \module{cgi}, -\module{calendar}, \module{posix}, \module{xmllib}, -\module{aifc}, \module{chunk}, \module{wave}, \module{random}, \module{shelve}, -and \module{nntplib}. Consult the CVS logs for the exact -patch-by-patch details. - -Brian Gallew contributed OpenSSL support for the \module{socket} -module. OpenSSL is an implementation of the Secure Socket Layer, -which encrypts the data being sent over a socket. When compiling -Python, you can edit \file{Modules/Setup} to include SSL support, -which adds an additional function to the \module{socket} module: -\function{socket.ssl(\var{socket}, \var{keyfile}, \var{certfile})}, -which takes a socket object and returns an SSL socket. The -\module{httplib} and \module{urllib} modules were also changed to -support ``https://'' URLs, though no one has implemented FTP or SMTP -over SSL. - -The \module{httplib} module has been rewritten by Greg Stein to -support HTTP/1.1. Backward compatibility with the 1.5 version of -\module{httplib} is provided, though using HTTP/1.1 features such as -pipelining will require rewriting code to use a different set of -interfaces. - -The \module{Tkinter} module now supports Tcl/Tk version 8.1, 8.2, or -8.3, and support for the older 7.x versions has been dropped. The -Tkinter module now supports displaying Unicode strings in Tk widgets. -Also, Fredrik Lundh contributed an optimization which makes operations -like \code{create_line} and \code{create_polygon} much faster, -especially when using lots of coordinates. 
- -The \module{curses} module has been greatly extended, starting from -Oliver Andrich's enhanced version, to provide many additional -functions from ncurses and SYSV curses, such as colour, alternative -character set support, pads, and mouse support. This means the module -is no longer compatible with operating systems that only have BSD -curses, but there don't seem to be any currently maintained OSes that -fall into this category. - -As mentioned in the earlier discussion of 2.0's Unicode support, the -underlying implementation of the regular expressions provided by the -\module{re} module has been changed. SRE, a new regular expression -engine written by Fredrik Lundh and partially funded by Hewlett -Packard, supports matching against both 8-bit strings and Unicode -strings. - -% ====================================================================== -\section{New modules} - -A number of new modules were added. We'll simply list them with brief -descriptions; consult the 2.0 documentation for the details of a -particular module. - -\begin{itemize} - -\item{\module{atexit}}: -For registering functions to be called before the Python interpreter exits. -Code that currently sets -\code{sys.exitfunc} directly should be changed to -use the \module{atexit} module instead, importing \module{atexit} -and calling \function{atexit.register()} with -the function to be called on exit. -(Contributed by Skip Montanaro.) - -\item{\module{codecs}, \module{encodings}, \module{unicodedata}:} Added as part of the new Unicode support. - -\item{\module{filecmp}:} Supersedes the old \module{cmp}, \module{cmpcache} and -\module{dircmp} modules, which have now become deprecated. -(Contributed by Gordon MacMillan and Moshe Zadka.) - -\item{\module{gettext}:} This module provides internationalization -(I18N) and localization (L10N) support for Python programs by -providing an interface to the GNU gettext message catalog library. 
-(Integrated by Barry Warsaw, from separate contributions by Martin -von~L\"owis, Peter Funk, and James Henstridge.) - -\item{\module{linuxaudiodev}:} Support for the \file{/dev/audio} -device on Linux, a twin to the existing \module{sunaudiodev} module. -(Contributed by Peter Bosch, with fixes by Jeremy Hylton.) - -\item{\module{mmap}:} An interface to memory-mapped files on both -Windows and \UNIX. A file's contents can be mapped directly into -memory, at which point it behaves like a mutable string, so its -contents can be read and modified. They can even be passed to -functions that expect ordinary strings, such as the \module{re} -module. (Contributed by Sam Rushing, with some extensions by -A.M. Kuchling.) - -\item{\module{pyexpat}:} An interface to the Expat XML parser. -(Contributed by Paul Prescod.) - -\item{\module{robotparser}:} Parse a \file{robots.txt} file, which is -used for writing Web spiders that politely avoid certain areas of a -Web site. The parser accepts the contents of a \file{robots.txt} file, -builds a set of rules from it, and can then answer questions about -the fetchability of a given URL. (Contributed by Skip Montanaro.) - -\item{\module{tabnanny}:} A module/script to -check Python source code for ambiguous indentation. -(Contributed by Tim Peters.) - -\item{\module{UserString}:} A base class useful for deriving objects that behave like strings. - -\item{\module{webbrowser}:} A module that provides a platform independent -way to launch a web browser on a specific URL. For each platform, various -browsers are tried in a specific order. The user can alter which browser -is launched by setting the \var{BROWSER} environment variable. -(Originally inspired by Eric S. Raymond's patch to \module{urllib} -which added similar functionality, but -the final module comes from code originally -implemented by Fred Drake as \file{Tools/idle/BrowserControl.py}, -and adapted for the standard library by Fred.) 
- -\item{\module{_winreg}:} An interface to the -Windows registry. \module{_winreg} is an adaptation of functions that -have been part of PythonWin since 1995, but has now been added to the core -distribution, and enhanced to support Unicode. -\module{_winreg} was written by Bill Tutt and Mark Hammond. - -\item{\module{zipfile}:} A module for reading and writing ZIP-format -archives. These are archives produced by \program{PKZIP} on -DOS/Windows or \program{zip} on \UNIX, not to be confused with -\program{gzip}-format files (which are supported by the \module{gzip} -module) -(Contributed by James C. Ahlstrom.) - -\item{\module{imputil}:} A module that provides a simpler way for -writing customised import hooks, in comparison to the existing -\module{ihooks} module. (Implemented by Greg Stein, with much -discussion on python-dev along the way.) - -\end{itemize} - -% ====================================================================== -\section{IDLE Improvements} - -IDLE is the official Python cross-platform IDE, written using Tkinter. -Python 2.0 includes IDLE 0.6, which adds a number of new features and -improvements. A partial list: - -\begin{itemize} -\item UI improvements and optimizations, -especially in the area of syntax highlighting and auto-indentation. - -\item The class browser now shows more information, such as the top -level functions in a module. - -\item Tab width is now a user settable option. When opening an existing Python -file, IDLE automatically detects the indentation conventions, and adapts. - -\item There is now support for calling browsers on various platforms, -used to open the Python documentation in a browser. - -\item IDLE now has a command line, which is largely similar to -the vanilla Python interpreter. - -\item Call tips were added in many places. - -\item IDLE can now be installed as a package. - -\item In the editor window, there is now a line/column bar at the bottom. 
- -\item Three new keystroke commands: Check module (Alt-F5), Import -module (F5) and Run script (Ctrl-F5). - -\end{itemize} - -% ====================================================================== -\section{Deleted and Deprecated Modules} - -A few modules have been dropped because they're obsolete, or because -there are now better ways to do the same thing. The \module{stdwin} -module is gone; it was for a platform-independent windowing toolkit -that's no longer developed. - -A number of modules have been moved to the -\file{lib-old} subdirectory: -\module{cmp}, \module{cmpcache}, \module{dircmp}, \module{dump}, -\module{find}, \module{grep}, \module{packmail}, -\module{poly}, \module{util}, \module{whatsound}, \module{zmod}. -If you have code which relies on a module that's been moved to -\file{lib-old}, you can simply add that directory to \code{sys.path} -to get them back, but you're encouraged to update any code that uses -these modules. - -\section{Acknowledgements} - -The authors would like to thank the following people for offering -suggestions on various drafts of this article: David Bolen, Mark -Hammond, Gregg Hauser, Jeremy Hylton, Fredrik Lundh, Detlef Lannert, -Aahz Maruch, Skip Montanaro, Vladimir Marangozov, Tobias Polzin, Guido -van Rossum, Neil Schemenauer, and Russ Schmidt. - -\end{document} diff --git a/Doc/whatsnew/whatsnew21.tex b/Doc/whatsnew/whatsnew21.tex deleted file mode 100644 index 67cbbe4..0000000 --- a/Doc/whatsnew/whatsnew21.tex +++ /dev/null @@ -1,868 +0,0 @@ -\documentclass{howto} - -\usepackage{distutils} - -% $Id$ - -\title{What's New in Python 2.1} -\release{1.01} -\author{A.M. Kuchling} -\authoraddress{ - \strong{Python Software Foundation}\\ - Email: \email{amk@amk.ca} -} -\begin{document} -\maketitle\tableofcontents - -\section{Introduction} - -This article explains the new features in Python 2.1. While there aren't as -many changes in 2.1 as there were in Python 2.0, there are still some -pleasant surprises in store. 
2.1 is the first release to be steered -through the use of Python Enhancement Proposals, or PEPs, so most of -the sizable changes have accompanying PEPs that provide more complete -documentation and a design rationale for the change. This article -doesn't attempt to document the new features completely, but simply -provides an overview of the new features for Python programmers. -Refer to the Python 2.1 documentation, or to the specific PEP, for -more details about any new feature that particularly interests you. - -One recent goal of the Python development team has been to accelerate -the pace of new releases, with a new release coming every 6 to 9 -months. 2.1 is the first release to come out at this faster pace, with -the first alpha appearing in January, 3 months after the final version -of 2.0 was released. - -The final release of Python 2.1 was made on April 17, 2001. - -%====================================================================== -\section{PEP 227: Nested Scopes} - -The largest change in Python 2.1 is to Python's scoping rules. In -Python 2.0, at any given time there are at most three namespaces used -to look up variable names: local, module-level, and the built-in -namespace. This often surprised people because it didn't match their -intuitive expectations. For example, a nested recursive function -definition doesn't work: - -\begin{verbatim} -def f(): - ... - def g(value): - ... - return g(value-1) + 1 - ... -\end{verbatim} - -The function \function{g()} will always raise a \exception{NameError} -exception, because the binding of the name \samp{g} isn't in either -its local namespace or in the module-level namespace. This isn't much -of a problem in practice (how often do you recursively define interior -functions like this?), but this also made using the \keyword{lambda} -statement clumsier, and this was a problem in practice. 
In code which -uses \keyword{lambda} you can often find local variables being copied -by passing them as the default values of arguments. - -\begin{verbatim} -def find(self, name): - "Return list of any entries equal to 'name'" - L = filter(lambda x, name=name: x == name, - self.list_attribute) - return L -\end{verbatim} - -The readability of Python code written in a strongly functional style -suffers greatly as a result. - -The most significant change to Python 2.1 is that static scoping has -been added to the language to fix this problem. As a first effect, -the \code{name=name} default argument is now unnecessary in the above -example. Put simply, when a given variable name is not assigned a -value within a function (by an assignment, or the \keyword{def}, -\keyword{class}, or \keyword{import} statements), references to the -variable will be looked up in the local namespace of the enclosing -scope. A more detailed explanation of the rules, and a dissection of -the implementation, can be found in the PEP. - -This change may cause some compatibility problems for code where the -same variable name is used both at the module level and as a local -variable within a function that contains further function definitions. -This seems rather unlikely though, since such code would have been -pretty confusing to read in the first place. - -One side effect of the change is that the \code{from \var{module} -import *} and \keyword{exec} statements have been made illegal inside -a function scope under certain conditions. The Python reference -manual has said all along that \code{from \var{module} import *} is -only legal at the top level of a module, but the CPython interpreter -has never enforced this before. As part of the implementation of -nested scopes, the compiler which turns Python source into bytecodes -has to generate different code to access variables in a containing -scope. 
\code{from \var{module} import *} and \keyword{exec} make it -impossible for the compiler to figure this out, because they add names -to the local namespace that are unknowable at compile time. -Therefore, if a function contains function definitions or -\keyword{lambda} expressions with free variables, the compiler will -flag this by raising a \exception{SyntaxError} exception. - -To make the preceding explanation a bit clearer, here's an example: - -\begin{verbatim} -x = 1 -def f(): - # The next line is a syntax error - exec 'x=2' - def g(): - return x -\end{verbatim} - -Line 4 containing the \keyword{exec} statement is a syntax error, -since \keyword{exec} would define a new local variable named \samp{x} -whose value should be accessed by \function{g()}. - -This shouldn't be much of a limitation, since \keyword{exec} is rarely -used in most Python code (and when it is used, it's often a sign of a -poor design anyway). - -Compatibility concerns have led to nested scopes being introduced -gradually; in Python 2.1, they aren't enabled by default, but can be -turned on within a module by using a future statement as described in -PEP 236. (See the following section for further discussion of PEP -236.) In Python 2.2, nested scopes will become the default and there -will be no way to turn them off, but users will have had all of 2.1's -lifetime to fix any breakage resulting from their introduction. - -\begin{seealso} - -\seepep{227}{Statically Nested Scopes}{Written and implemented by -Jeremy Hylton.} - -\end{seealso} - - -%====================================================================== -\section{PEP 236: __future__ Directives} - -The reaction to nested scopes was widespread concern about the dangers -of breaking code with the 2.1 release, and it was strong enough to -make the Pythoneers take a more conservative approach. 
This approach -consists of introducing a convention for enabling optional -functionality in release N that will become compulsory in release N+1. - -The syntax uses a \code{from...import} statement using the reserved -module name \module{__future__}. Nested scopes can be enabled by the -following statement: - -\begin{verbatim} -from __future__ import nested_scopes -\end{verbatim} - -While it looks like a normal \keyword{import} statement, it's not; -there are strict rules on where such a future statement can be put. -They can only be at the top of a module, and must precede any Python -code or regular \keyword{import} statements. This is because such -statements can affect how the Python bytecode compiler parses code and -generates bytecode, so they must precede any statement that will -result in bytecodes being produced. - -\begin{seealso} - -\seepep{236}{Back to the \module{__future__}}{Written by Tim Peters, -and primarily implemented by Jeremy Hylton.} - -\end{seealso} - -%====================================================================== -\section{PEP 207: Rich Comparisons} - -In earlier versions, Python's support for implementing comparisons on -user-defined classes and extension types was quite simple. Classes -could implement a \method{__cmp__} method that was given two instances -of a class, and could only return 0 if they were equal or +1 or -1 if -they weren't; the method couldn't raise an exception or return -anything other than a Boolean value. Users of Numeric Python often -found this model too weak and restrictive, because in the -number-crunching programs that numeric Python is used for, it would be -more useful to be able to perform elementwise comparisons of two -matrices, returning a matrix containing the results of a given -comparison for each element. If the two matrices are of different -sizes, then the compare has to be able to raise an exception to signal -the error. 
- -In Python 2.1, rich comparisons were added in order to support this -need. Python classes can now individually overload each of the -\code{<}, \code{<=}, \code{>}, \code{>=}, \code{==}, and \code{!=} -operations. The new magic method names are: - -\begin{tableii}{c|l}{code}{Operation}{Method name} - \lineii{<}{\method{__lt__}} \lineii{<=}{\method{__le__}} - \lineii{>}{\method{__gt__}} \lineii{>=}{\method{__ge__}} - \lineii{==}{\method{__eq__}} \lineii{!=}{\method{__ne__}} - \end{tableii} - -(The magic methods are named after the corresponding Fortran operators -\code{.LT.}, \code{.LE.}, \&c. Numeric programmers are almost -certainly quite familiar with these names and will find them easy to -remember.) - -Each of these magic methods is of the form \code{\var{method}(self, -other)}, where \code{self} will be the object on the left-hand side of -the operator, while \code{other} will be the object on the right-hand -side. For example, the expression \code{A < B} will cause -\code{A.__lt__(B)} to be called. - -Each of these magic methods can return anything at all: a Boolean, a -matrix, a list, or any other Python object. Alternatively they can -raise an exception if the comparison is impossible, inconsistent, or -otherwise meaningless. - -The built-in \function{cmp(A,B)} function can use the rich comparison -machinery, and now accepts an optional argument specifying which -comparison operation to use; this is given as one of the strings -\code{"<"}, \code{"<="}, \code{">"}, \code{">="}, \code{"=="}, or -\code{"!="}. If called without the optional third argument, -\function{cmp()} will only return -1, 0, or +1 as in previous versions -of Python; otherwise it will call the appropriate method and can -return any Python object. - -There are also corresponding changes of interest to C programmers; -there's a new slot \code{tp_richcompare} in type objects and an API for -performing a given rich comparison. 
I won't cover the C API here, but -will refer you to PEP 207, or to 2.1's C API documentation, for the -full list of related functions. - -\begin{seealso} - -\seepep{207}{Rich Comparisons}{Written by Guido van Rossum, heavily -based on earlier work by David Ascher, and implemented by Guido van -Rossum.} - -\end{seealso} - -%====================================================================== -\section{PEP 230: Warning Framework} - -Over its 10 years of existence, Python has accumulated a certain -number of obsolete modules and features along the way. It's difficult -to know when a feature is safe to remove, since there's no way of -knowing how much code uses it --- perhaps no programs depend on the -feature, or perhaps many do. To enable removing old features in a -more structured way, a warning framework was added. When the Python -developers want to get rid of a feature, it will first trigger a -warning in the next version of Python. The following Python version -can then drop the feature, and users will have had a full release -cycle to remove uses of the old feature. - -Python 2.1 adds the warning framework to be used in this scheme. It -adds a \module{warnings} module that provides functions to issue -warnings, and to filter out warnings that you don't want to be -displayed. Third-party modules can also use this framework to -deprecate old features that they no longer wish to support. - -For example, in Python 2.1 the \module{regex} module is deprecated, so -importing it causes a warning to be printed: - -\begin{verbatim} ->>> import regex -__main__:1: DeprecationWarning: the regex module - is deprecated; please use the re module ->>> -\end{verbatim} - -Warnings can be issued by calling the \function{warnings.warn} -function: - -\begin{verbatim} -warnings.warn("feature X no longer supported") -\end{verbatim} - -The first parameter is the warning message; an additional optional -parameter can be used to specify a particular warning category. 
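
As a rough illustration of the category parameter, the sketch below has a hypothetical function, \function{old_api()}, issue a \class{DeprecationWarning}. The function name and its message are invented for this example. \function{warnings.catch_warnings()} was added to the module in a later Python release than 2.1 and is used here only to capture the warning so that it can be inspected; the code is written for a current Python interpreter.

```python
import warnings

def old_api():
    # Hypothetical deprecated function, invented for this example.
    # The second argument selects the warning category; stacklevel=2
    # attributes the warning to the caller rather than to old_api().
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42

# Record warnings instead of printing them, so they can be examined.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # make sure nothing is filtered out
    result = old_api()

# Each recorded entry carries the category and the message object.
for w in caught:
    print(w.category.__name__, "-", w.message)
```

Passing the category explicitly is what lets a filter single out deprecation warnings without having to match on the message text.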
- -Filters can be added to disable certain warnings; a regular expression -pattern can be applied to the message or to the module name in order -to suppress a warning. For example, you may have a program that uses -the \module{regex} module and not want to spare the time to convert it -to use the \module{re} module right now. The warning can be -suppressed by calling - -\begin{verbatim} -import warnings -warnings.filterwarnings(action = 'ignore', - message='.*regex module is deprecated', - category=DeprecationWarning, - module = '__main__') -\end{verbatim} - -This adds a filter that will apply only to warnings of the class -\class{DeprecationWarning} triggered in the \module{__main__} module, -and applies a regular expression to only match the message about the -\module{regex} module being deprecated, and will cause such warnings -to be ignored. Warnings can also be printed only once, printed every -time the offending code is executed, or turned into exceptions that -will cause the program to stop (unless the exceptions are caught in -the usual way, of course). - -Functions were also added to Python's C API for issuing warnings; -refer to PEP 230 or to Python's API documentation for the details. - -\begin{seealso} - -\seepep{5}{Guidelines for Language Evolution}{Written -by Paul Prescod, to specify procedures to be followed when removing -old features from Python. The policy described in this PEP hasn't -been officially adopted, but the eventual policy probably won't be too -different from Prescod's proposal.} - -\seepep{230}{Warning Framework}{Written and implemented by Guido van -Rossum.} - -\end{seealso} - -%====================================================================== -\section{PEP 229: New Build System} - -When compiling Python, the user had to go in and edit the -\file{Modules/Setup} file in order to enable various additional -modules; the default set is relatively small and limited to modules -that compile on most \UNIX{} platforms. 
This means that on \UNIX{} -platforms with many more features, most notably Linux, Python -installations often don't contain all the useful modules they could. - -Python 2.0 added the Distutils, a set of modules for distributing and -installing extensions. In Python 2.1, the Distutils are used to -compile much of the standard library of extension modules, -autodetecting which ones are supported on the current machine. It's -hoped that this will make Python installations easier and more -featureful. - -Instead of having to edit the \file{Modules/Setup} file in order to -enable modules, a \file{setup.py} script in the top directory of the -Python source distribution is run at build time, and attempts to -discover which modules can be enabled by examining the modules and -header files on the system. If a module is configured in -\file{Modules/Setup}, the \file{setup.py} script won't attempt to -compile that module and will defer to the \file{Modules/Setup} file's -contents. This provides a way to specify any strange command-line -flags or libraries that are required for a specific platform. - -In another far-reaching change to the build mechanism, Neil -Schemenauer restructured things so Python now uses a single makefile -that isn't recursive, instead of makefiles in the top directory and in -each of the \file{Python/}, \file{Parser/}, \file{Objects/}, and -\file{Modules/} subdirectories. This makes building Python faster -and also makes hacking the Makefiles clearer and simpler. - -\begin{seealso} - -\seepep{229}{Using Distutils to Build Python}{Written -and implemented by A.M. Kuchling.} - -\end{seealso} - -%====================================================================== -\section{PEP 205: Weak References} - -Weak references, available through the \module{weakref} module, are a -minor but useful new data type in the Python programmer's toolbox. 
- -Storing a reference to an object (say, in a dictionary or a list) has -the side effect of keeping that object alive forever. There are a few -specific cases where this behaviour is undesirable, object caches -being the most common one, and another being circular references in -data structures such as trees. - -For example, consider a memoizing function that caches the results of -another function \function{f(\var{x})} by storing the function's -argument and its result in a dictionary: - -\begin{verbatim} -_cache = {} -def memoize(x): - if _cache.has_key(x): - return _cache[x] - - retval = f(x) - - # Cache the returned object - _cache[x] = retval - - return retval -\end{verbatim} - -This version works for simple things such as integers, but it has a -side effect; the \code{_cache} dictionary holds a reference to the -return values, so they'll never be deallocated until the Python -process exits and cleans up. This isn't very noticeable for integers, -but if \function{f()} returns an object, or a data structure that -takes up a lot of memory, this can be a problem. - -Weak references provide a way to implement a cache that won't keep -objects alive beyond their time. If an object is only accessible -through weak references, the object will be deallocated and the weak -references will now indicate that the object they referred to no longer -exists. A weak reference to an object \var{obj} is created by calling -\code{wr = weakref.ref(\var{obj})}. The object being referred to is -returned by calling the weak reference as if it were a function: -\code{wr()}. It will return the referenced object, or \code{None} if -the object no longer exists. - -This makes it possible to write a \function{memoize()} function whose -cache doesn't keep objects alive, by storing weak references in the -cache. 
- -\begin{verbatim} -_cache = {} -def memoize(x): - if _cache.has_key(x): - obj = _cache[x]() - # If weak reference object still exists, - # return it - if obj is not None: return obj - - retval = f(x) - - # Cache a weak reference - _cache[x] = weakref.ref(retval) - - return retval -\end{verbatim} - -The \module{weakref} module also allows creating proxy objects which -behave like weak references --- an object referenced only by proxy -objects is deallocated -- but instead of requiring an explicit call to -retrieve the object, the proxy transparently forwards all operations -to the object as long as the object still exists. If the object is -deallocated, attempting to use a proxy will cause a -\exception{weakref.ReferenceError} exception to be raised. - -\begin{verbatim} -proxy = weakref.proxy(obj) -proxy.attr # Equivalent to obj.attr -proxy.meth() # Equivalent to obj.meth() -del obj -proxy.attr # raises weakref.ReferenceError -\end{verbatim} - -\begin{seealso} - -\seepep{205}{Weak References}{Written and implemented by -Fred~L. Drake,~Jr.} - -\end{seealso} - -%====================================================================== -\section{PEP 232: Function Attributes} - -In Python 2.1, functions can now have arbitrary information attached -to them. People were often using docstrings to hold information about -functions and methods, because the \code{__doc__} attribute was the -only way of attaching any information to a function. For example, in -the Zope Web application server, functions are marked as safe for -public access by having a docstring, and in John Aycock's SPARK -parsing framework, docstrings hold parts of the BNF grammar to be -parsed. This overloading is unfortunate, since docstrings are really -intended to hold a function's documentation; for example, it means you -can't properly document functions intended for private use in Zope. 
- -Arbitrary attributes can now be set and retrieved on functions using the -regular Python syntax: - -\begin{verbatim} -def f(): pass - -f.publish = 1 -f.secure = 1 -f.grammar = "A ::= B (C D)*" -\end{verbatim} - -The dictionary containing attributes can be accessed as the function's -\member{__dict__}. Unlike the \member{__dict__} attribute of class -instances, in functions you can actually assign a new dictionary to -\member{__dict__}, though the new value is restricted to a regular -Python dictionary; you \emph{can't} be tricky and set it to a -\class{UserDict} instance, or any other random object that behaves -like a mapping. - -\begin{seealso} - -\seepep{232}{Function Attributes}{Written and implemented by Barry -Warsaw.} - -\end{seealso} - - -%====================================================================== - -\section{PEP 235: Importing Modules on Case-Insensitive Platforms} - -Some operating systems have filesystems that are case-insensitive, -MacOS and Windows being the primary examples; on these systems, it's -impossible to distinguish the filenames \samp{FILE.PY} and -\samp{file.py}, even though they do store the file's name -in its original case (they're case-preserving, too). - -In Python 2.1, the \keyword{import} statement will work to simulate -case-sensitivity on case-insensitive platforms. Python will now -search for the first case-sensitive match by default, raising an -\exception{ImportError} if no such file is found, so \code{import file} -will not import a module named \samp{FILE.PY}. Case-insensitive -matching can be requested by setting the \envvar{PYTHONCASEOK} environment -variable before starting the Python interpreter. - -%====================================================================== -\section{PEP 217: Interactive Display Hook} - -When using the Python interpreter interactively, the output of -commands is displayed using the built-in \function{repr()} function. 
-In Python 2.1, the variable \function{sys.displayhook} can be set to a -callable object which will be called instead of \function{repr()}. -For example, you can set it to a special pretty-printing function: - -\begin{verbatim} ->>> # Create a recursive data structure -... L = [1,2,3] ->>> L.append(L) ->>> L # Show Python's default output -[1, 2, 3, [...]] ->>> # Use pprint.pprint() as the display function -... import sys, pprint ->>> sys.displayhook = pprint.pprint ->>> L -[1, 2, 3, <Recursion on list with id=135143996>] ->>> -\end{verbatim} - -\begin{seealso} - -\seepep{217}{Display Hook for Interactive Use}{Written and implemented -by Moshe Zadka.} - -\end{seealso} - -%====================================================================== -\section{PEP 208: New Coercion Model} - -How numeric coercion is done at the C level was significantly -modified. This will only affect the authors of C extensions to -Python, allowing them more flexibility in writing extension types that -support numeric operations. - -Extension types can now set the type flag \code{Py_TPFLAGS_CHECKTYPES} -in their \code{PyTypeObject} structure to indicate that they support -the new coercion model. In such extension types, the numeric slot -functions can no longer assume that they'll be passed two arguments of -the same type; instead they may be passed two arguments of differing -types, and can then perform their own internal coercion. If the slot -function is passed a type it can't handle, it can indicate the failure -by returning a reference to the \code{Py_NotImplemented} singleton -value. The numeric functions of the other type will then be tried, -and perhaps they can handle the operation; if the other type also -returns \code{Py_NotImplemented}, then a \exception{TypeError} will be -raised. 
Numeric methods written in Python can also return -\code{NotImplemented}, causing the interpreter to act as if the -method did not exist (perhaps raising a \exception{TypeError}, perhaps -trying another object's numeric methods). - -\begin{seealso} - -\seepep{208}{Reworking the Coercion Model}{Written and implemented by -Neil Schemenauer, heavily based upon earlier work by Marc-Andr\'e -Lemburg. Read this to understand the fine points of how numeric -operations will now be processed at the C level.} - -\end{seealso} - -%====================================================================== -\section{PEP 241: Metadata in Python Packages} - -A common complaint from Python users is that there's no single catalog -of all the Python modules in existence. T.~Middleton's Vaults of -Parnassus at \url{http://www.vex.net/parnassus/} are the largest -catalog of Python modules, but registering software at the Vaults is -optional, and many people don't bother. - -As a first small step toward fixing the problem, Python software -packaged using the Distutils \command{sdist} command will include a -file named \file{PKG-INFO} containing information about the package -such as its name, version, and author (metadata, in cataloguing -terminology). PEP 241 contains the full list of fields that can be -present in the \file{PKG-INFO} file. As people begin to package their -software using Python 2.1, more and more packages will include -metadata, making it possible to build automated cataloguing systems -and experiment with them. With the resulting experience, perhaps it'll -be possible to design a really good catalog and then build support for -it into Python 2.2. For example, the Distutils \command{sdist} -and \command{bdist_*} commands could support an \option{upload} option -that would automatically upload your package to a catalog server. 
- -You can start creating packages containing \file{PKG-INFO} even if -you're not using Python 2.1, since a new release of the Distutils will -be made for users of earlier Python versions. Version 1.0.2 of the -Distutils includes the changes described in PEP 241, as well as -various bugfixes and enhancements. It will be available from -the Distutils SIG at \url{http://www.python.org/sigs/distutils-sig/}. - -\begin{seealso} - -\seepep{241}{Metadata for Python Software Packages}{Written and -implemented by A.M. Kuchling.} - -\seepep{243}{Module Repository Upload Mechanism}{Written by Sean -Reifschneider, this draft PEP describes a proposed mechanism for uploading -Python packages to a central server. -} - -\end{seealso} - -%====================================================================== -\section{New and Improved Modules} - -\begin{itemize} - -\item Ka-Ping Yee contributed two new modules: \module{inspect.py}, a -module for getting information about live Python code, and -\module{pydoc.py}, a module for interactively converting docstrings to -HTML or text. As a bonus, \file{Tools/scripts/pydoc}, which is now -automatically installed, uses \module{pydoc.py} to display -documentation given a Python module, package, or class name. For -example, \samp{pydoc xml.dom} displays the following: - -\begin{verbatim} -Python Library Documentation: package xml.dom in xml - -NAME - xml.dom - W3C Document Object Model implementation for Python. - -FILE - /usr/local/lib/python2.1/xml/dom/__init__.pyc - -DESCRIPTION - The Python mapping of the Document Object Model is documented in the - Python Library Reference in the section on the xml.dom package. - - This package contains the following modules: - ... -\end{verbatim} - -\file{pydoc} also includes a Tk-based interactive help browser. -\file{pydoc} quickly becomes addictive; try it out! - -\item Two different modules for unit testing were added to the -standard library. 
The \module{doctest} module, contributed by Tim -Peters, provides a testing framework based on running embedded -examples in docstrings and comparing the results against the expected -output. PyUnit, contributed by Steve Purcell, is a unit testing -framework inspired by JUnit, which was in turn an adaptation of Kent -Beck's Smalltalk testing framework. See -\url{http://pyunit.sourceforge.net/} for more information about -PyUnit. - -\item The \module{difflib} module contains a class, -\class{SequenceMatcher}, which compares two sequences and computes the -changes required to transform one sequence into the other. For -example, this module can be used to write a tool similar to the \UNIX{} -\program{diff} program, and in fact the sample program -\file{Tools/scripts/ndiff.py} demonstrates how to write such a script. - -\item \module{curses.panel}, a wrapper for the panel library, part of -ncurses and of SYSV curses, was contributed by Thomas Gellekum. The -panel library provides windows with the additional feature of depth. -Windows can be moved higher or lower in the depth ordering, and the -panel library figures out where panels overlap and which sections are -visible. - -\item The PyXML package has gone through a few releases since Python -2.0, and Python 2.1 includes an updated version of the \module{xml} -package. Some of the noteworthy changes include support for Expat 1.2 -and later versions, the ability for Expat parsers to handle files in -any encoding supported by Python, and various bugfixes for SAX, DOM, -and the \module{minidom} module. - -\item Ping also contributed another hook for handling uncaught -exceptions. \function{sys.excepthook} can be set to a callable -object. When an exception isn't caught by any -\keyword{try}...\keyword{except} blocks, the exception will be passed -to \function{sys.excepthook}, which can then do whatever it likes. 
At -the Ninth Python Conference, Ping demonstrated an application for this -hook: printing an extended traceback that not only lists the stack -frames, but also lists the function arguments and the local variables -for each frame. - -\item Various functions in the \module{time} module, such as -\function{asctime()} and \function{localtime()}, require a floating -point argument containing the time in seconds since the epoch. The -most common use of these functions is to work with the current time, -so the floating point argument has been made optional; when a value -isn't provided, the current time will be used. For example, log file -entries usually need a string containing the current time; in Python -2.1, \code{time.asctime()} can be used, instead of the lengthier -\code{time.asctime(time.localtime(time.time()))} that was previously -required. - -This change was proposed and implemented by Thomas Wouters. - -\item The \module{ftplib} module now defaults to retrieving files in -passive mode, because passive mode is more likely to work from behind -a firewall. This request came from the Debian bug tracking system, -since other Debian packages use \module{ftplib} to retrieve files and -then don't work from behind a firewall. It's deemed unlikely that -this will cause problems for anyone, because Netscape defaults to -passive mode and few people complain, but if passive mode is -unsuitable for your application or network setup, call -\method{set_pasv(0)} on FTP objects to disable passive mode. - -\item Support for raw socket access has been added to the -\module{socket} module, contributed by Grant Edwards. - -\item The \module{pstats} module now contains a simple interactive -statistics browser for displaying timing profiles for Python programs, -invoked when the module is run as a script. Contributed by -Eric S.\ Raymond. 
- -\item A new implementation-dependent function, \function{sys._getframe(\optional{depth})}, -has been added to return a given frame object from the current call stack. -\function{sys._getframe()} returns the frame at the top of the call stack; -if the optional integer argument \var{depth} is supplied, the function returns the frame -that is \var{depth} calls below the top of the stack. For example, \code{sys._getframe(1)} -returns the caller's frame object. - -This function is only present in CPython, not in Jython or the .NET -implementation. Use it for debugging, and resist the temptation to -put it into production code. - - - -\end{itemize} - -%====================================================================== -\section{Other Changes and Fixes} - -There were relatively few smaller changes made in Python 2.1 due to -the shorter release cycle. A search through the CVS change logs turns -up 117 patches applied, and 136 bugs fixed; both figures are likely to -be underestimates. Some of the more notable changes are: - -\begin{itemize} - - -\item A specialized object allocator is now optionally available, that -should be faster than the system \function{malloc()} and have less -memory overhead. The allocator uses C's \function{malloc()} function -to get large pools of memory, and then fulfills smaller memory -requests from these pools. It can be enabled by providing the -\longprogramopt{with-pymalloc} option to the \program{configure} script; see -\file{Objects/obmalloc.c} for the implementation details. - -Authors of C extension modules should test their code with the object -allocator enabled, because some incorrect code may break, causing core -dumps at runtime. There are a bunch of memory allocation functions in -Python's C API that have previously been just aliases for the C -library's \function{malloc()} and \function{free()}, meaning that if -you accidentally called mismatched functions, the error wouldn't be -noticeable. 
When the object allocator is enabled, these functions -aren't aliases of \function{malloc()} and \function{free()} any more, -and calling the wrong function to free memory will get you a core -dump. For example, if memory was allocated using -\function{PyMem_New()}, it has to be freed using -\function{PyMem_Del()}, not \function{free()}. A few modules included -with Python fell afoul of this and had to be fixed; doubtless there -are more third-party modules that will have the same problem. - -The object allocator was contributed by Vladimir Marangozov. - -\item The speed of line-oriented file I/O has been improved because -people often complain about its lack of speed, and because it's often -been used as a na\"ive benchmark. The \method{readline()} method of -file objects has therefore been rewritten to be much faster. The -exact amount of the speedup will vary from platform to platform -depending on how slow the C library's \function{getc()} was, but is -around 66\%, and potentially much faster on some particular operating -systems. Tim Peters did much of the benchmarking and coding for this -change, motivated by a discussion in comp.lang.python. - -A new module and method for file objects was also added, contributed -by Jeff Epler. The new method, \method{xreadlines()}, is similar to -the existing \function{xrange()} built-in. \function{xreadlines()} -returns an opaque sequence object that only supports being iterated -over, reading a line on every iteration but not reading the entire -file into memory as the existing \method{readlines()} method does. -You'd use it like this: - -\begin{verbatim} -for line in sys.stdin.xreadlines(): - # ... do something for each line ... - ... -\end{verbatim} - -For a fuller discussion of the line I/O changes, see the python-dev -summary for January 1-15, 2001 at -\url{http://www.python.org/dev/summary/2001-01-1.html}. 
- -\item A new method, \method{popitem()}, was added to dictionaries to -enable destructively iterating through the contents of a dictionary; -this can be faster for large dictionaries because there's no need to -construct a list containing all the keys or values. -\code{D.popitem()} removes a random \code{(\var{key}, \var{value})} -pair from the dictionary~\code{D} and returns it as a 2-tuple. This -was implemented mostly by Tim Peters and Guido van Rossum, after a -suggestion and preliminary patch by Moshe Zadka. - -\item Modules can now control which names are imported when \code{from -\var{module} import *} is used, by defining an \code{__all__} -attribute containing a list of names that will be imported. One -common complaint is that if the module imports other modules such as -\module{sys} or \module{string}, \code{from \var{module} import *} -will add them to the importing module's namespace. To fix this, -simply list the public names in \code{__all__}: - -\begin{verbatim} -# List public names -__all__ = ['Database', 'open'] -\end{verbatim} - -A stricter version of this patch was first suggested and implemented -by Ben Wolfson, but after some python-dev discussion, a weaker final -version was checked in. - -\item Applying \function{repr()} to strings previously used octal -escapes for non-printable characters; for example, a newline was -\code{'\e 012'}. This was a vestigial trace of Python's C ancestry, but -today octal is of very little practical use. Ka-Ping Yee suggested -using hex escapes instead of octal ones, and using the \code{\e n}, -\code{\e t}, \code{\e r} escapes for the appropriate characters, and -implemented this new formatting. - -\item Syntax errors detected at compile-time can now raise exceptions -containing the filename and line number of the error, a pleasant side -effect of the compiler reorganization done by Jeremy Hylton. 
- -\item C extensions which import other modules have been changed to use -\function{PyImport_ImportModule()}, which means that they will use any -import hooks that have been installed. This is also encouraged for -third-party extensions that need to import some other module from C -code. - -\item The size of the Unicode character database was shrunk by another -340K thanks to Fredrik Lundh. - -\item Some new ports were contributed: MacOS X (by Steven Majewski), -Cygwin (by Jason Tishler); RISCOS (by Dietmar Schwertberger); Unixware~7 -(by Billy G. Allie). - -\end{itemize} - -And there's the usual list of minor bugfixes, minor memory leaks, -docstring edits, and other tweaks, too lengthy to be worth itemizing; -see the CVS logs for the full details if you want them. - - -%====================================================================== -\section{Acknowledgements} - -The author would like to thank the following people for offering -suggestions on various drafts of this article: Graeme Cross, David -Goodger, Jay Graves, Michael Hudson, Marc-Andr\'e Lemburg, Fredrik -Lundh, Neil Schemenauer, Thomas Wouters. - -\end{document} diff --git a/Doc/whatsnew/whatsnew22.tex b/Doc/whatsnew/whatsnew22.tex deleted file mode 100644 index 82ff061..0000000 --- a/Doc/whatsnew/whatsnew22.tex +++ /dev/null @@ -1,1466 +0,0 @@ -\documentclass{howto} - -% $Id$ - -\title{What's New in Python 2.2} -\release{1.02} -\author{A.M. Kuchling} -\authoraddress{ - \strong{Python Software Foundation}\\ - Email: \email{amk@amk.ca} -} -\begin{document} -\maketitle\tableofcontents - -\section{Introduction} - -This article explains the new features in Python 2.2.2, released on -October 14, 2002. Python 2.2.2 is a bugfix release of Python 2.2, -originally released on December 21, 2001. - -Python 2.2 can be thought of as the "cleanup release". 
There are some -features such as generators and iterators that are completely new, but -most of the changes, significant and far-reaching though they may be, -are aimed at cleaning up irregularities and dark corners of the -language design. - -This article doesn't attempt to provide a complete specification of -the new features, but instead provides a convenient overview. For -full details, you should refer to the documentation for Python 2.2, -such as the -\citetitle[http://www.python.org/doc/2.2/lib/lib.html]{Python -Library Reference} and the -\citetitle[http://www.python.org/doc/2.2/ref/ref.html]{Python -Reference Manual}. If you want to understand the complete -implementation and design rationale for a change, refer to the PEP for -a particular new feature. - -\begin{seealso} - -\seeurl{http://www.unixreview.com/documents/s=1356/urm0109h/0109h.htm} -{``What's So Special About Python 2.2?'' is also about the new 2.2 -features, and was written by Cameron Laird and Kathryn Soraiz.} - -\end{seealso} - - -%====================================================================== -\section{PEPs 252 and 253: Type and Class Changes} - -The largest and most far-reaching changes in Python 2.2 are to -Python's model of objects and classes. The changes should be backward -compatible, so it's likely that your code will continue to run -unchanged, but the changes provide some amazing new capabilities. -Before beginning this, the longest and most complicated section of -this article, I'll provide an overview of the changes and offer some -comments. - -A long time ago I wrote a Web page -(\url{http://www.amk.ca/python/writing/warts.html}) listing flaws in -Python's design. One of the most significant flaws was that it's -impossible to subclass Python types implemented in C. In particular, -it's not possible to subclass built-in types, so you can't just -subclass, say, lists in order to add a single useful method to them. 
-The \module{UserList} module provides a class that supports all of the -methods of lists and that can be subclassed further, but there's lots -of C code that expects a regular Python list and won't accept a -\class{UserList} instance. - -Python 2.2 fixes this, and in the process adds some exciting new -capabilities. A brief summary: - -\begin{itemize} - -\item You can subclass built-in types such as lists and even integers, -and your subclasses should work in every place that requires the -original type. - -\item It's now possible to define static and class methods, in addition -to the instance methods available in previous versions of Python. - -\item It's also possible to automatically call methods on accessing or -setting an instance attribute by using a new mechanism called -\dfn{properties}. Many uses of \method{__getattr__} can be rewritten -to use properties instead, making the resulting code simpler and -faster. As a small side benefit, attributes can now have docstrings, -too. - -\item The list of legal attributes for an instance can be limited to a -particular set using \dfn{slots}, making it possible to safeguard -against typos and perhaps make more optimizations possible in future -versions of Python. - -\end{itemize} - -Some users have voiced concern about all these changes. Sure, they -say, the new features are neat and lend themselves to all sorts of -tricks that weren't possible in previous versions of Python, but -they also make the language more complicated. Some people have said -that they've always recommended Python for its simplicity, and feel -that its simplicity is being lost. - -Personally, I think there's no need to worry. Many of the new -features are quite esoteric, and you can write a lot of Python code -without ever needing to be aware of them. Writing a simple class is no -more difficult than it ever was, so you don't need to bother learning -or teaching them unless they're actually needed. 
Some very -complicated tasks that were previously only possible from C will now -be possible in pure Python, and to my mind that's all for the better. - -I'm not going to attempt to cover every single corner case and small -change that were required to make the new features work. Instead this -section will paint only the broad strokes. See section~\ref{sect-rellinks}, -``Related Links'', for further sources of information about Python 2.2's new -object model. - - -\subsection{Old and New Classes} - -First, you should know that Python 2.2 really has two kinds of -classes: classic or old-style classes, and new-style classes. The -old-style class model is exactly the same as the class model in -earlier versions of Python. All the new features described in this -section apply only to new-style classes. This divergence isn't -intended to last forever; eventually old-style classes will be -dropped, possibly in Python 3.0. - -So how do you define a new-style class? You do it by subclassing an -existing new-style class. Most of Python's built-in types, such as -integers, lists, dictionaries, and even files, are new-style classes -now. A new-style class named \class{object}, the base class for all -built-in types, has also been added so if no built-in type is -suitable, you can just subclass \class{object}: - -\begin{verbatim} -class C(object): - def __init__ (self): - ... - ... -\end{verbatim} - -This means that \keyword{class} statements that don't have any base -classes are always classic classes in Python 2.2. (Actually you can -also change this by setting a module-level variable named -\member{__metaclass__} --- see \pep{253} for the details --- but it's -easier to just subclass \class{object}.) - -The type objects for the built-in types are available as built-ins, -named using a clever trick. Python has always had built-in functions -named \function{int()}, \function{float()}, and \function{str()}. 
In -2.2, they aren't functions any more, but type objects that behave as -factories when called. - -\begin{verbatim} ->>> int -<type 'int'> ->>> int('123') -123 -\end{verbatim} - -To make the set of types complete, new type objects such as -\function{dict} and \function{file} have been added. Here's a -more interesting example, adding a \method{lock()} method to file -objects: - -\begin{verbatim} -class LockableFile(file): - def lock (self, operation, length=0, start=0, whence=0): - import fcntl - return fcntl.lockf(self.fileno(), operation, - length, start, whence) -\end{verbatim} - -The now-obsolete \module{posixfile} module contained a class that -emulated all of a file object's methods and also added a -\method{lock()} method, but this class couldn't be passed to internal -functions that expected a built-in file, something which is possible -with our new \class{LockableFile}. - - -\subsection{Descriptors} - -In previous versions of Python, there was no consistent way to -discover what attributes and methods were supported by an object. -There were some informal conventions, such as defining -\member{__members__} and \member{__methods__} attributes that were -lists of names, but often the author of an extension type or a class -wouldn't bother to define them. You could fall back on inspecting the -\member{__dict__} of an object, but when class inheritance or an -arbitrary \method{__getattr__} hook were in use this could still be -inaccurate. - -The one big idea underlying the new class model is that an API for -describing the attributes of an object using \dfn{descriptors} has -been formalized. Descriptors specify the value of an attribute, -stating whether it's a method or a field. With the descriptor API, -static methods and class methods become possible, as well as more -exotic constructs. 
- -Attribute descriptors are objects that live inside class objects, and -have a few attributes of their own: - -\begin{itemize} - -\item \member{__name__} is the attribute's name. - -\item \member{__doc__} is the attribute's docstring. - -\item \method{__get__(\var{object})} is a method that retrieves the -attribute value from \var{object}. - -\item \method{__set__(\var{object}, \var{value})} sets the attribute -on \var{object} to \var{value}. - -\item \method{__delete__(\var{object})} deletes the attribute -from \var{object}. -\end{itemize} - -For example, when you write \code{obj.x}, the steps that Python -actually performs are: - -\begin{verbatim} -descriptor = obj.__class__.x -descriptor.__get__(obj) -\end{verbatim} - -For methods, \method{descriptor.__get__} returns a temporary object that's -callable, and wraps up the instance and the method to be called on it. -This is also why static methods and class methods are now possible; -they have descriptors that wrap up just the method, or the method and -the class. As a brief explanation of these new kinds of methods, -static methods aren't passed the instance, and therefore resemble -regular functions. Class methods are passed the class of the object, -but not the object itself. Static and class methods are defined like -this: - -\begin{verbatim} -class C(object): - def f(arg1, arg2): - ... - f = staticmethod(f) - - def g(cls, arg1, arg2): - ... - g = classmethod(g) -\end{verbatim} - -The \function{staticmethod()} function takes the function -\function{f}, and returns it wrapped up in a descriptor so it can be -stored in the class object. You might expect there to be special -syntax for creating such methods (\code{def static f()}, -\code{defstatic f()}, or something like that) but no such syntax has -been defined yet; that's been left for future versions of Python. 
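To make the difference between the two kinds of methods concrete, here is a minimal sketch of how they behave when called. The method bodies and the subclass \class{Sub} are made up for illustration and are not part of the original example; the sketch only assumes the \function{staticmethod()} and \function{classmethod()} built-ins described above.

```python
class C(object):
    def f(arg1, arg2):
        # Static method: no instance or class is passed in.
        return arg1 + arg2
    f = staticmethod(f)

    def g(cls, arg1, arg2):
        # Class method: the class is passed as the first argument.
        return (cls.__name__, arg1 + arg2)
    g = classmethod(g)


class Sub(C):
    # Hypothetical subclass, used only to show which class g() receives.
    pass


# Both kinds of methods can be called on the class or on an instance.
static_on_class = C.f(1, 2)
static_on_instance = C().f(1, 2)
class_on_base = C.g(1, 2)
class_on_subclass = Sub.g(1, 2)
class_on_sub_instance = Sub().g(1, 2)
```

Note that \code{Sub.g()} receives \class{Sub}, not \class{C}, which is what makes class methods handy for writing alternate constructors that keep working in subclasses.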
- -More new features, such as slots and properties, are also implemented -as new kinds of descriptors, and it's not difficult to write a -descriptor class that does something novel. For example, it would be -possible to write a descriptor class that made it possible to write -Eiffel-style preconditions and postconditions for a method. A class -that used this feature might be defined like this: - -\begin{verbatim} -from eiffel import eiffelmethod - -class C(object): - def f(self, arg1, arg2): - # The actual function - ... - def pre_f(self): - # Check preconditions - ... - def post_f(self): - # Check postconditions - ... - - f = eiffelmethod(f, pre_f, post_f) -\end{verbatim} - -Note that a person using the new \function{eiffelmethod()} doesn't -have to understand anything about descriptors. This is why I think -the new features don't increase the basic complexity of the language. -There will be a few wizards who need to know about it in order to -write \function{eiffelmethod()} or the ZODB or whatever, but most -users will just write code on top of the resulting libraries and -ignore the implementation details. - - -\subsection{Multiple Inheritance: The Diamond Rule} - -Multiple inheritance has also been made more useful through changing -the rules under which names are resolved. Consider this set of classes -(diagram taken from \pep{253} by Guido van Rossum): - -\begin{verbatim} - class A: - ^ ^ def save(self): ... - / \ - / \ - / \ - / \ - class B class C: - ^ ^ def save(self): ... - \ / - \ / - \ / - \ / - class D -\end{verbatim} - -The lookup rule for classic classes is simple but not very smart; the -base classes are searched depth-first, going from left to right. A -reference to \method{D.save} will search the classes \class{D}, -\class{B}, and then \class{A}, where \method{save()} would be found -and returned. \method{C.save()} would never be found at all. 
This is -bad, because if \class{C}'s \method{save()} method is saving some -internal state specific to \class{C}, not calling it will result in -that state never getting saved. - -New-style classes follow a different algorithm that's a bit more -complicated to explain, but does the right thing in this situation. -(Note that Python 2.3 changes this algorithm to one that produces the -same results in most cases, but produces more useful results for -really complicated inheritance graphs.) - -\begin{enumerate} - -\item List all the base classes, following the classic lookup rule and -include a class multiple times if it's visited repeatedly. In the -above example, the list of visited classes is [\class{D}, \class{B}, -\class{A}, \class{C}, \class{A}]. - -\item Scan the list for duplicated classes. If any are found, remove -all but one occurrence, leaving the \emph{last} one in the list. In -the above example, the list becomes [\class{D}, \class{B}, \class{C}, -\class{A}] after dropping duplicates. - -\end{enumerate} - -Following this rule, referring to \method{D.save()} will return -\method{C.save()}, which is the behaviour we're after. This lookup -rule is the same as the one followed by Common Lisp. A new built-in -function, \function{super()}, provides a way to get at a class's -superclasses without having to reimplement Python's algorithm. -The most commonly used form will be -\function{super(\var{class}, \var{obj})}, which returns -a bound superclass object (not the actual class object). This form -will be used in methods to call a method in the superclass; for -example, \class{D}'s \method{save()} method would look like this: - -\begin{verbatim} -class D (B,C): - def save (self): - # Call superclass .save() - super(D, self).save() - # Save D's private information here - ... 
-\end{verbatim} - -\function{super()} can also return unbound superclass objects -when called as \function{super(\var{class})} or -\function{super(\var{class1}, \var{class2})}, but this probably won't -often be useful. - - -\subsection{Attribute Access} - -A fair number of sophisticated Python classes define hooks for -attribute access using \method{__getattr__}; most commonly this is -done for convenience, to make code more readable by automatically -mapping an attribute access such as \code{obj.parent} into a method -call such as \code{obj.get_parent()}. Python 2.2 adds some new ways -of controlling attribute access. - -First, \method{__getattr__(\var{attr_name})} is still supported by -new-style classes, and nothing about it has changed. As before, it -will be called when an attempt is made to access \code{obj.foo} and no -attribute named \samp{foo} is found in the instance's dictionary. - -New-style classes also support a new method, -\method{__getattribute__(\var{attr_name})}. The difference between -the two methods is that \method{__getattribute__} is \emph{always} -called whenever any attribute is accessed, while the old -\method{__getattr__} is only called if \samp{foo} isn't found in the -instance's dictionary. - -However, Python 2.2's support for \dfn{properties} will often be a -simpler way to trap attribute references. Writing a -\method{__getattr__} method is complicated because to avoid recursion -you can't use regular attribute accesses inside them, and instead have -to mess around with the contents of \member{__dict__}. -\method{__getattr__} methods also end up being called by Python when -it checks for other methods such as \method{__repr__} or -\method{__coerce__}, and so have to be written with this in mind. -Finally, calling a function on every attribute access results in a -sizable performance loss. - -\class{property} is a new built-in type that packages up three -functions that get, set, or delete an attribute, and a docstring. 
For -example, if you want to define a \member{size} attribute that's -computed, but also settable, you could write: - -\begin{verbatim} -class C(object): - def get_size (self): - result = ... computation ... - return result - def set_size (self, size): - ... compute something based on the size - and set internal state appropriately ... - - # Define a property. The 'delete this attribute' - # method is defined as None, so the attribute - # can't be deleted. - size = property(get_size, set_size, - None, - "Storage size of this instance") -\end{verbatim} - -That is certainly clearer and easier to write than a pair of -\method{__getattr__}/\method{__setattr__} methods that check for the -\member{size} attribute and handle it specially while retrieving all -other attributes from the instance's \member{__dict__}. Accesses to -\member{size} are also the only ones which have to perform the work of -calling a function, so references to other attributes run at -their usual speed. - -Finally, it's possible to constrain the list of attributes that can be -referenced on an object using the new \member{__slots__} class attribute. -Python objects are usually very dynamic; at any time it's possible to -define a new attribute on an instance by just doing -\code{obj.new_attr=1}. A new-style class can define a class attribute named -\member{__slots__} to limit the legal attributes -to a particular set of names. An example will make this clear: - -\begin{verbatim} ->>> class C(object): -... __slots__ = ('template', 'name') -... ->>> obj = C() ->>> print obj.template -None ->>> obj.template = 'Test' ->>> print obj.template -Test ->>> obj.newattr = None -Traceback (most recent call last): - File "<stdin>", line 1, in ? -AttributeError: 'C' object has no attribute 'newattr' -\end{verbatim} - -Note how you get an \exception{AttributeError} on the attempt to -assign to an attribute not listed in \member{__slots__}. 
- - - -\subsection{Related Links} -\label{sect-rellinks} - -This section has just been a quick overview of the new features, -giving enough of an explanation to start you programming, but many -details have been simplified or ignored. Where should you go to get a -more complete picture? - -\url{http://www.python.org/2.2/descrintro.html} is a lengthy tutorial -introduction to the descriptor features, written by Guido van Rossum. -If my description has whetted your appetite, go read this tutorial -next, because it goes into much more detail about the new features -while still remaining quite easy to read. - -Next, there are two relevant PEPs, \pep{252} and \pep{253}. \pep{252} -is titled "Making Types Look More Like Classes", and covers the -descriptor API. \pep{253} is titled "Subtyping Built-in Types", and -describes the changes to type objects that make it possible to subtype -built-in objects. \pep{253} is the more complicated PEP of the two, -and at a few points the necessary explanations of types and meta-types -may cause your head to explode. Both PEPs were written and -implemented by Guido van Rossum, with substantial assistance from the -rest of the Zope Corp. team. - -Finally, there's the ultimate authority: the source code. Most of the -machinery for the type handling is in \file{Objects/typeobject.c}, but -you should only resort to it after all other avenues have been -exhausted, including posting a question to python-list or python-dev. - - -%====================================================================== -\section{PEP 234: Iterators} - -Another significant addition to 2.2 is an iteration interface at both -the C and Python levels. Objects can define how they can be looped -over by callers. 
- -In Python versions up to 2.1, the usual way to make \code{for item in -obj} work is to define a \method{__getitem__()} method that looks -something like this: - -\begin{verbatim} - def __getitem__(self, index): - return <next item> -\end{verbatim} - -\method{__getitem__()} is more properly used to define an indexing -operation on an object so that you can write \code{obj[5]} to retrieve -the sixth element. It's a bit misleading when you're using this only -to support \keyword{for} loops. Consider some file-like object that -wants to be looped over; the \var{index} parameter is essentially -meaningless, as the class probably assumes that a series of -\method{__getitem__()} calls will be made with \var{index} -incrementing by one each time. In other words, the presence of the -\method{__getitem__()} method doesn't mean that using \code{file[5]} -to randomly access the sixth element will work, though it really should. - -In Python 2.2, iteration can be implemented separately, and -\method{__getitem__()} methods can be limited to classes that really -do support random access. The basic idea of iterators is -simple. A new built-in function, \function{iter(obj)} or -\code{iter(\var{C}, \var{sentinel})}, is used to get an iterator. -\function{iter(obj)} returns an iterator for the object \var{obj}, -while \code{iter(\var{C}, \var{sentinel})} returns an iterator that -will invoke the callable object \var{C} until it returns -\var{sentinel} to signal that the iterator is done. - -Python classes can define an \method{__iter__()} method, which should -create and return a new iterator for the object; if the object is its -own iterator, this method can just return \code{self}. In particular, -iterators will usually be their own iterators. Extension types -implemented in C can implement a \member{tp_iter} function in order to -return an iterator, and extension types that want to behave as -iterators can define a \member{tp_iternext} function. 
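The two ways of obtaining an iterator described above can be sketched like this. The \class{Deck} class, the \function{take()} function, and the sample data are assumptions made up for the illustration; only \function{iter()} and the \method{__iter__()} hook come from the text.

```python
class Deck(object):
    # Hypothetical container that supports looping but not indexing.
    def __init__(self, cards):
        self.cards = cards

    def __iter__(self):
        # Return a new iterator; here we reuse the list's own iterator.
        return iter(self.cards)


deck_cards = []
for card in Deck(['ace', 'king']):
    deck_cards.append(card)

# The two-argument form: call take() repeatedly until it returns 0.
pending = [3, 1, 4, 0, 9]

def take():
    # Hypothetical callable that hands out the next pending value.
    return pending.pop(0)

collected = []
for value in iter(take, 0):
    collected.append(value)
```

The sentinel value itself is consumed but never handed to the loop body, so anything after it (here the 9) is left untouched in \code{pending}.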
- -So, after all this, what do iterators actually do? They have one -required method, \method{next()}, which takes no arguments and returns -the next value. When there are no more values to be returned, calling -\method{next()} should raise the \exception{StopIteration} exception. - -\begin{verbatim} ->>> L = [1,2,3] ->>> i = iter(L) ->>> print i -<iterator object at 0x8116870> ->>> i.next() -1 ->>> i.next() -2 ->>> i.next() -3 ->>> i.next() -Traceback (most recent call last): - File "<stdin>", line 1, in ? -StopIteration ->>> -\end{verbatim} - -In 2.2, Python's \keyword{for} statement no longer expects a sequence; -it expects something for which \function{iter()} will return an iterator. -For backward compatibility and convenience, an iterator is -automatically constructed for sequences that don't implement -\method{__iter__()} or a \member{tp_iter} slot, so \code{for i in -[1,2,3]} will still work. Wherever the Python interpreter loops over -a sequence, it's been changed to use the iterator protocol. This -means you can do things like this: - -\begin{verbatim} ->>> L = [1,2,3] ->>> i = iter(L) ->>> a,b,c = i ->>> a,b,c -(1, 2, 3) -\end{verbatim} - -Iterator support has been added to some of Python's basic types. -Calling \function{iter()} on a dictionary will return an iterator -which loops over its keys: - -\begin{verbatim} ->>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6, -... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12} ->>> for key in m: print key, m[key] -... -Mar 3 -Feb 2 -Aug 8 -Sep 9 -May 5 -Jun 6 -Jul 7 -Jan 1 -Apr 4 -Nov 11 -Dec 12 -Oct 10 -\end{verbatim} - -That's just the default behaviour. If you want to iterate over keys, -values, or key/value pairs, you can explicitly call the -\method{iterkeys()}, \method{itervalues()}, or \method{iteritems()} -methods to get an appropriate iterator. 
In a minor related change, -the \keyword{in} operator now works on dictionaries, so -\code{\var{key} in dict} is now equivalent to -\code{dict.has_key(\var{key})}. - -Files also provide an iterator, which calls the \method{readline()} -method until there are no more lines in the file. This means you can -now read each line of a file using code like this: - -\begin{verbatim} -for line in file: - # do something for each line - ... -\end{verbatim} - -Note that you can only go forward in an iterator; there's no way to -get the previous element, reset the iterator, or make a copy of it. -An iterator object could provide such additional capabilities, but the -iterator protocol only requires a \method{next()} method. - -\begin{seealso} - -\seepep{234}{Iterators}{Written by Ka-Ping Yee and GvR; implemented -by the Python Labs crew, mostly by GvR and Tim Peters.} - -\end{seealso} - - -%====================================================================== -\section{PEP 255: Simple Generators} - -Generators are another new feature, one that interacts with the -introduction of iterators. - -You're doubtless familiar with how function calls work in Python or -C. When you call a function, it gets a private namespace where its local -variables are created. When the function reaches a \keyword{return} -statement, the local variables are destroyed and the resulting value -is returned to the caller. A later call to the same function will get -a fresh new set of local variables. But, what if the local variables -weren't thrown away on exiting a function? What if you could later -resume the function where it left off? This is what generators -provide; they can be thought of as resumable functions. - -Here's the simplest example of a generator function: - -\begin{verbatim} -def generate_ints(N): - for i in range(N): - yield i -\end{verbatim} - -A new keyword, \keyword{yield}, was introduced for generators. 
Any -function containing a \keyword{yield} statement is a generator -function; this is detected by Python's bytecode compiler which -compiles the function specially as a result. Because a new keyword was -introduced, generators must be explicitly enabled in a module by -including a \code{from __future__ import generators} statement near -the top of the module's source code. In Python 2.3 this statement -will become unnecessary. - -When you call a generator function, it doesn't return a single value; -instead it returns a generator object that supports the iterator -protocol. On executing the \keyword{yield} statement, the generator -outputs the value of \code{i}, similar to a \keyword{return} -statement. The big difference between \keyword{yield} and a -\keyword{return} statement is that on reaching a \keyword{yield} the -generator's state of execution is suspended and local variables are -preserved. On the next call to the generator's \code{next()} method, -the function will resume executing immediately after the -\keyword{yield} statement. (For complicated reasons, the -\keyword{yield} statement isn't allowed inside the \keyword{try} block -of a \keyword{try}...\keyword{finally} statement; read \pep{255} for a full -explanation of the interaction between \keyword{yield} and -exceptions.) - -Here's a sample usage of the \function{generate_ints} generator: - -\begin{verbatim} ->>> gen = generate_ints(3) ->>> gen -<generator object at 0x8117f90> ->>> gen.next() -0 ->>> gen.next() -1 ->>> gen.next() -2 ->>> gen.next() -Traceback (most recent call last): - File "<stdin>", line 1, in ? - File "<stdin>", line 2, in generate_ints -StopIteration -\end{verbatim} - -You could equally write \code{for i in generate_ints(5)}, or -\code{a,b,c = generate_ints(3)}. - -Inside a generator function, the \keyword{return} statement can only -be used without a value, and signals the end of the procession of -values; afterwards the generator cannot return any further values. 
-\keyword{return} with a value, such as \code{return 5}, is a syntax -error inside a generator function. The end of the generator's results -can also be indicated by raising \exception{StopIteration} manually, -or by just letting the flow of execution fall off the bottom of the -function. - -You could achieve the effect of generators manually by writing your -own class and storing all the local variables of the generator as -instance variables. For example, returning a list of integers could -be done by setting \code{self.count} to 0, and having the -\method{next()} method increment \code{self.count} and return it. -However, for a moderately complicated generator, writing a -corresponding class would be much messier. -\file{Lib/test/test_generators.py} contains a number of more -interesting examples. The simplest one implements an in-order -traversal of a tree using generators recursively. - -\begin{verbatim} -# A recursive generator that generates Tree leaves in in-order. -def inorder(t): - if t: - for x in inorder(t.left): - yield x - yield t.label - for x in inorder(t.right): - yield x -\end{verbatim} - -Two other examples in \file{Lib/test/test_generators.py} produce -solutions for the N-Queens problem (placing $N$ queens on an $N \times N$ -chessboard so that no queen threatens another) and the Knight's Tour -(a route that takes a knight to every square of an $N \times N$ chessboard -without visiting any square twice). - -The idea of generators comes from other programming languages, -especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the -idea of generators is central. In Icon, every -expression and function call behaves like a generator. 
One example -from ``An Overview of the Icon Programming Language'' at -\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of -what this looks like: - -\begin{verbatim} -sentence := "Store it in the neighboring harbor" -if (i := find("or", sentence)) > 5 then write(i) -\end{verbatim} - -In Icon the \function{find()} function returns the indexes at which the -substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement, -\code{i} is first assigned a value of 3, but 3 is less than 5, so the -comparison fails, and Icon retries it with the second value of 23. 23 -is greater than 5, so the comparison now succeeds, and the code prints -the value 23 to the screen. - -Python doesn't go nearly as far as Icon in adopting generators as a -central concept. Generators are considered a new part of the core -Python language, but learning or using them isn't compulsory; if they -don't solve any problems that you have, feel free to ignore them. -One novel feature of Python's interface as compared to -Icon's is that a generator's state is represented as a concrete object -(the iterator) that can be passed around to other functions or stored -in a data structure. - -\begin{seealso} - -\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim -Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer -and Tim Peters, with other fixes from the Python Labs crew.} - -\end{seealso} - - -%====================================================================== -\section{PEP 237: Unifying Long Integers and Integers} - -In recent versions, the distinction between regular integers, which -are 32-bit values on most machines, and long integers, which can be of -arbitrary size, was becoming an annoyance. For example, on platforms -that support files larger than \code{2**32} bytes, the -\method{tell()} method of file objects has to return a long integer. 
-However, there were various bits of Python that expected plain -integers and would raise an error if a long integer was provided -instead. For example, in Python 1.5, only regular integers -could be used as a slice index, and \code{'abc'[1L:]} would raise a -\exception{TypeError} exception with the message 'slice index must be -int'. - -Python 2.2 will shift values from short to long integers as required. -The 'L' suffix is no longer needed to indicate a long integer literal, -as now the compiler will choose the appropriate type. (Using the 'L' -suffix will be discouraged in future 2.x versions of Python, -triggering a warning in Python 2.4, and probably dropped in Python -3.0.) Many operations that used to raise an \exception{OverflowError} -will now return a long integer as their result. For example: - -\begin{verbatim} ->>> 1234567890123 -1234567890123L ->>> 2 ** 64 -18446744073709551616L -\end{verbatim} - -In most cases, integers and long integers will now be treated -identically. You can still distinguish them with the -\function{type()} built-in function, but that's rarely needed. - -\begin{seealso} - -\seepep{237}{Unifying Long Integers and Integers}{Written by -Moshe Zadka and Guido van Rossum. Implemented mostly by Guido van -Rossum.} - -\end{seealso} - - -%====================================================================== -\section{PEP 238: Changing the Division Operator} - -The most controversial change in Python 2.2 heralds the start of an effort -to fix an old design flaw that's been in Python from the beginning. -Currently Python's division operator, \code{/}, behaves like C's -division operator when presented with two integer arguments: it -returns an integer result that's truncated down when there would be -a fractional part. For example, \code{3/2} is 1, not 1.5, and -\code{(-1)/2} is -1, not -0.5. 
This means that the results of division -can vary unexpectedly depending on the type of the two operands, and -because Python is dynamically typed, it can be difficult to determine -the possible types of the operands. - -(The controversy is over whether this is \emph{really} a design flaw, -and whether it's worth breaking existing code to fix this. It's -caused endless discussions on python-dev, and in July 2001 erupted into a -storm of acidly sarcastic postings on \newsgroup{comp.lang.python}. I -won't argue for either side here and will stick to describing what's -implemented in 2.2. Read \pep{238} for a summary of arguments and -counter-arguments.) - -Because this change might break code, it's being introduced very -gradually. Python 2.2 begins the transition, but the switch won't be -complete until Python 3.0. - -First, I'll borrow some terminology from \pep{238}. ``True division'' is the -division that most non-programmers are familiar with: 3/2 is 1.5, 1/4 -is 0.25, and so forth. ``Floor division'' is what Python's \code{/} -operator currently does when given integer operands; the result is the -floor of the value returned by true division. ``Classic division'' is -the current mixed behaviour of \code{/}; it returns the result of -floor division when the operands are integers, and returns the result -of true division when one of the operands is a floating-point number. - -Here are the changes 2.2 introduces: - -\begin{itemize} - -\item A new operator, \code{//}, is the floor division operator. -(Yes, we know it looks like \Cpp's comment symbol.) \code{//} -\emph{always} performs floor division no matter what the types of -its operands are, so \code{1 // 2} is 0 and -\code{1.0 // 2.0} is 0.0. - -\code{//} is always available in Python 2.2; you don't need to enable -it using a \code{__future__} statement.
- -\item By including a \code{from __future__ import division} statement in a -module, the \code{/} operator will be changed to return the result of -true division, so \code{1/2} is 0.5. Without the \code{__future__} -statement, \code{/} still means classic division. The default meaning -of \code{/} will not change until Python 3.0. - -\item Classes can define methods called \method{__truediv__} and -\method{__floordiv__} to overload the two division operators. At the -C level, there are also slots in the \ctype{PyNumberMethods} structure -so extension types can define the two operators. - -\item Python 2.2 supports some command-line arguments for testing -whether code will work with the changed division semantics. Running -python with \programopt{-Q warn} will cause a warning to be issued -whenever division is applied to two integers. You can use this to -find code that's affected by the change and fix it. By default, -Python 2.2 will simply perform classic division without a warning; the -warning will be turned on by default in Python 2.3. - -\end{itemize} - -\begin{seealso} - -\seepep{238}{Changing the Division Operator}{Written by Moshe Zadka and -Guido van Rossum. Implemented by Guido van Rossum.} - -\end{seealso} - - -%====================================================================== -\section{Unicode Changes} - -Python's Unicode support has been enhanced a bit in 2.2. Unicode -strings are usually stored as UCS-2, as 16-bit unsigned integers. -Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned -integers, as its internal encoding by supplying -\longprogramopt{enable-unicode=ucs4} to the configure script. -(It's also possible to specify -\longprogramopt{disable-unicode} to completely disable Unicode -support.) - -When built to use UCS-4 (a ``wide Python''), the interpreter can -natively handle Unicode characters from U+000000 to U+10FFFF, so the -range of legal values for the \function{unichr()} function is expanded -accordingly.
Using an interpreter compiled to use UCS-2 (a ``narrow -Python''), values greater than 65535 will still cause -\function{unichr()} to raise a \exception{ValueError} exception. -This is all described in \pep{261}, ``Support for `wide' Unicode -characters''; consult it for further details. - -Another change is simpler to explain. Since their introduction, -Unicode strings have supported an \method{encode()} method to convert -the string to a selected encoding such as UTF-8 or Latin-1. A -symmetric \method{decode(\optional{\var{encoding}})} method has been -added to 8-bit strings (though not to Unicode strings) in 2.2. -\method{decode()} assumes that the string is in the specified encoding -and decodes it, returning whatever is returned by the codec. - -Using this new feature, codecs have been added for tasks not directly -related to Unicode. For example, codecs have been added for -uu-encoding, MIME's base64 encoding, and compression with the -\module{zlib} module: - -\begin{verbatim} ->>> s = """Here is a lengthy piece of redundant, overly verbose, -... and repetitive text. -... """ ->>> data = s.encode('zlib') ->>> data -'x\x9c\r\xc9\xc1\r\x80 \x10\x04\xc0?Ul...' ->>> data.decode('zlib') -'Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n' ->>> print s.encode('uu') -begin 666 <data> -M2&5R92!I<R!A(&QE;F=T:'D@<&EE8V4@;V8@<F5D=6YD86YT+"!O=F5R;'D@ ->=F5R8F]S92P*86YD(')E<&5T:71I=F4@=&5X="X* - -end ->>> "sheesh".encode('rot-13') -'furrfu' -\end{verbatim} - -To convert a class instance to Unicode, a \method{__unicode__} method -can be defined by a class, analogous to \method{__str__}. - -\method{encode()}, \method{decode()}, and \method{__unicode__} were -implemented by Marc-Andr\'e Lemburg. The changes to support using -UCS-4 internally were implemented by Fredrik Lundh and Martin von -L\"owis. 
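The session above relies on the Python 2.2 string methods. As a hedged aside for readers on a present-day interpreter: Python 3 strings no longer accept \samp{zlib} or \samp{uu} in \method{encode()}, and comparable round trips go through the \module{codecs} module instead. The sketch below assumes Python 3.4 or later; the helper name \function{roundtrip()} and the constant \code{TEXT} are hypothetical, introduced only for illustration.

```python
import codecs

# Sample data (an assumption for this sketch); bytes-to-bytes codecs need bytes.
TEXT = b"Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n"


def roundtrip(data, codec_name):
    """Encode bytes with a bytes-to-bytes codec, then decode them again."""
    encoded = codecs.encode(data, codec_name)
    decoded = codecs.decode(encoded, codec_name)
    return encoded, decoded


# zlib_codec and base64_codec are bytes-to-bytes codecs in Python 3.
compressed, restored = roundtrip(TEXT, "zlib_codec")
b64, restored_b64 = roundtrip(TEXT, "base64_codec")

# rot_13 is a text-to-text codec, so it takes and returns str.
scrambled = codecs.encode("sheesh", "rot_13")

print(restored == TEXT)
print(scrambled)
```

The design point is the same as in 2.2: a codec is a named pair of encode and decode functions, so tasks unrelated to Unicode can reuse the same lookup machinery.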
- -\begin{seealso} - -\seepep{261}{Support for `wide' Unicode characters}{Written by -Paul Prescod.} - -\end{seealso} - - -%====================================================================== -\section{PEP 227: Nested Scopes} - -In Python 2.1, statically nested scopes were added as an optional -feature, to be enabled by a \code{from __future__ import -nested_scopes} directive. In 2.2 nested scopes no longer need to be -specially enabled, and are now always present. The rest of this section -is a copy of the description of nested scopes from my ``What's New in -Python 2.1'' document; if you read it when 2.1 came out, you can skip -the rest of this section. - -The largest change introduced in Python 2.1, and made complete in 2.2, -is to Python's scoping rules. In Python 2.0, at any given time there -are at most three namespaces used to look up variable names: local, -module-level, and the built-in namespace. This often surprised people -because it didn't match their intuitive expectations. For example, a -nested recursive function definition doesn't work: - -\begin{verbatim} -def f(): - ... - def g(value): - ... - return g(value-1) + 1 - ... -\end{verbatim} - -The function \function{g()} will always raise a \exception{NameError} -exception, because the binding of the name \samp{g} isn't in either -its local namespace or in the module-level namespace. This isn't much -of a problem in practice (how often do you recursively define interior -functions like this?), but this also made using the \keyword{lambda} -statement clumsier, and this was a problem in practice. In code which -uses \keyword{lambda} you can often find local variables being copied -by passing them as the default values of arguments. 
- -\begin{verbatim} -def find(self, name): - "Return list of any entries equal to 'name'" - L = filter(lambda x, name=name: x == name, - self.list_attribute) - return L -\end{verbatim} - -The readability of Python code written in a strongly functional style -suffers greatly as a result. - -The most significant change to Python 2.2 is that static scoping has -been added to the language to fix this problem. As a first effect, -the \code{name=name} default argument is now unnecessary in the above -example. Put simply, when a given variable name is not assigned a -value within a function (by an assignment, or the \keyword{def}, -\keyword{class}, or \keyword{import} statements), references to the -variable will be looked up in the local namespace of the enclosing -scope. A more detailed explanation of the rules, and a dissection of -the implementation, can be found in the PEP. - -This change may cause some compatibility problems for code where the -same variable name is used both at the module level and as a local -variable within a function that contains further function definitions. -This seems rather unlikely though, since such code would have been -pretty confusing to read in the first place. - -One side effect of the change is that the \code{from \var{module} -import *} and \keyword{exec} statements have been made illegal inside -a function scope under certain conditions. The Python reference -manual has said all along that \code{from \var{module} import *} is -only legal at the top level of a module, but the CPython interpreter -has never enforced this before. As part of the implementation of -nested scopes, the compiler which turns Python source into bytecodes -has to generate different code to access variables in a containing -scope. \code{from \var{module} import *} and \keyword{exec} make it -impossible for the compiler to figure this out, because they add names -to the local namespace that are unknowable at compile time. 
-Therefore, if a function contains function definitions or -\keyword{lambda} expressions with free variables, the compiler will -flag this by raising a \exception{SyntaxError} exception. - -To make the preceding explanation a bit clearer, here's an example: - -\begin{verbatim} -x = 1 -def f(): - # The next line is a syntax error - exec 'x=2' - def g(): - return x -\end{verbatim} - -Line 4 containing the \keyword{exec} statement is a syntax error, -since \keyword{exec} would define a new local variable named \samp{x} -whose value should be accessed by \function{g()}. - -This shouldn't be much of a limitation, since \keyword{exec} is rarely -used in most Python code (and when it is used, it's often a sign of a -poor design anyway). - -\begin{seealso} - -\seepep{227}{Statically Nested Scopes}{Written and implemented by -Jeremy Hylton.} - -\end{seealso} - - -%====================================================================== -\section{New and Improved Modules} - -\begin{itemize} - - \item The \module{xmlrpclib} module was contributed to the standard - library by Fredrik Lundh, providing support for writing XML-RPC - clients. XML-RPC is a simple remote procedure call protocol built on - top of HTTP and XML. For example, the following snippet retrieves a - list of RSS channels from the O'Reilly Network, and then - lists the recent headlines for one channel: - -\begin{verbatim} -import xmlrpclib -s = xmlrpclib.Server( - 'http://www.oreillynet.com/meerkat/xml-rpc/server.php') -channels = s.meerkat.getChannels() -# channels is a list of dictionaries, like this: -# [{'id': 4, 'title': 'Freshmeat Daily News'} -# {'id': 190, 'title': '32Bits Online'}, -# {'id': 4549, 'title': '3DGamers'}, ... 
] - -# Get the items for one channel -items = s.meerkat.getItems( {'channel': 4} ) - -# 'items' is another list of dictionaries, like this: -# [{'link': 'http://freshmeat.net/releases/52719/', -# 'description': 'A utility which converts HTML to XSL FO.', -# 'title': 'html2fo 0.3 (Default)'}, ... ] -\end{verbatim} - -The \module{SimpleXMLRPCServer} module makes it easy to create -straightforward XML-RPC servers. See \url{http://www.xmlrpc.com/} for -more information about XML-RPC. - - \item The new \module{hmac} module implements the HMAC - algorithm described by \rfc{2104}. - (Contributed by Gerhard H\"aring.) - - \item Several functions that originally returned lengthy tuples now - return pseudo-sequences that still behave like tuples but also have - mnemonic attributes such as \member{st_mtime} or \member{tm_year}. - The enhanced functions include \function{stat()}, - \function{fstat()}, \function{statvfs()}, and \function{fstatvfs()} - in the \module{os} module, and \function{localtime()}, - \function{gmtime()}, and \function{strptime()} in the \module{time} - module. - - For example, to obtain a file's size using the old tuples, you'd end - up writing something like \code{file_size = - os.stat(filename)[stat.ST_SIZE]}, but now this can be written more - clearly as \code{file_size = os.stat(filename).st_size}. - - The original patch for this feature was contributed by Nick Mathewson. - - \item The Python profiler has been extensively reworked and various - errors in its output have been corrected. (Contributed by - Fred~L. Drake, Jr. and Tim Peters.) - - \item The \module{socket} module can be compiled to support IPv6; - specify the \longprogramopt{enable-ipv6} option to Python's configure - script. (Contributed by Jun-ichiro ``itojun'' Hagino.) - - \item Two new format characters were added to the \module{struct} - module for 64-bit integers on platforms that support the C - \ctype{long long} type.
\samp{q} is for a signed 64-bit integer, - and \samp{Q} is for an unsigned one. The value is returned in - Python's long integer type. (Contributed by Tim Peters.) - - \item In the interpreter's interactive mode, there's a new built-in - function \function{help()} that uses the \module{pydoc} module - introduced in Python 2.1 to provide interactive help. - \code{help(\var{object})} displays any available help text about - \var{object}. \function{help()} with no argument puts you in an online - help utility, where you can enter the names of functions, classes, - or modules to read their help text. - (Contributed by Guido van Rossum, using Ka-Ping Yee's \module{pydoc} module.) - - \item Various bugfixes and performance improvements have been made - to the SRE engine underlying the \module{re} module. For example, - the \function{re.sub()} and \function{re.split()} functions have - been rewritten in C. Another contributed patch speeds up matching of certain - Unicode character ranges by a factor of two, and a new \method{finditer()} - method returns an iterator over all the non-overlapping matches in - a given string. - (SRE is maintained by - Fredrik Lundh. The BIGCHARSET patch was contributed by Martin von - L\"owis.) - - \item The \module{smtplib} module now supports \rfc{2487}, ``Secure - SMTP over TLS'', so it's now possible to encrypt the SMTP traffic - between a Python program and the mail transport agent being handed a - message. \module{smtplib} also supports SMTP authentication. - (Contributed by Gerhard H\"aring.) - - \item The \module{imaplib} module, maintained by Piers Lauder, has - support for several new extensions: the NAMESPACE extension defined - in \rfc{2342}, SORT, GETACL and SETACL. (Contributed by Anthony - Baxter and Michel Pelletier.) - - \item The \module{rfc822} module's parsing of email addresses is now - compliant with \rfc{2822}, an update to \rfc{822}. (The module's - name is \emph{not} going to be changed to \samp{rfc2822}.)
A new - package, \module{email}, has also been added for parsing and - generating e-mail messages. (Contributed by Barry Warsaw, and - arising out of his work on Mailman.) - - \item The \module{difflib} module now contains a new \class{Differ} - class for producing human-readable lists of changes (a ``delta'') - between two sequences of lines of text. There are also two - generator functions, \function{ndiff()} and \function{restore()}, - which respectively return a delta from two sequences, or one of the - original sequences from a delta. (Grunt work contributed by David - Goodger, from ndiff.py code by Tim Peters who then did the - generatorization.) - - \item New constants \constant{ascii_letters}, - \constant{ascii_lowercase}, and \constant{ascii_uppercase} were - added to the \module{string} module. There were several modules in - the standard library that used \constant{string.letters} to mean the - ranges A-Za-z, but that assumption is incorrect when locales are in - use, because \constant{string.letters} varies depending on the set - of legal characters defined by the current locale. The buggy - modules have all been fixed to use \constant{ascii_letters} instead. - (Reported by an unknown person; fixed by Fred~L. Drake, Jr.) - - \item The \module{mimetypes} module now makes it easier to use - alternative MIME-type databases by the addition of a - \class{MimeTypes} class, which takes a list of filenames to be - parsed. (Contributed by Fred~L. Drake, Jr.) - - \item A \class{Timer} class was added to the \module{threading} - module that allows scheduling an activity to happen at some future - time. (Contributed by Itamar Shtull-Trauring.) 
- -\end{itemize} - - -%====================================================================== -\section{Interpreter Changes and Fixes} - -Some of the changes only affect people who deal with the Python -interpreter at the C level because they're writing Python extension modules, -embedding the interpreter, or just hacking on the interpreter itself. -If you only write Python code, none of the changes described here will -affect you very much. - -\begin{itemize} - - \item Profiling and tracing functions can now be implemented in C, - which can operate at much higher speeds than Python-based functions - and should reduce the overhead of profiling and tracing. This - will be of interest to authors of development environments for - Python. Two new C functions were added to Python's API, - \cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}. - The existing \function{sys.setprofile()} and - \function{sys.settrace()} functions still exist, and have simply - been changed to use the new C-level interface. (Contributed by Fred - L. Drake, Jr.) - - \item Another low-level API, primarily of interest to implementors - of Python debuggers and development tools, was added. - \cfunction{PyInterpreterState_Head()} and - \cfunction{PyInterpreterState_Next()} let a caller walk through all - the existing interpreter objects; - \cfunction{PyInterpreterState_ThreadHead()} and - \cfunction{PyThreadState_Next()} allow looping over all the thread - states for a given interpreter. (Contributed by David Beazley.) - -\item The C-level interface to the garbage collector has been changed -to make it easier to write extension types that support garbage -collection and to debug misuses of the functions. -Various functions have slightly different semantics, so a bunch of -functions had to be renamed. Extensions that use the old API will -still compile but will \emph{not} participate in garbage collection, -so updating them for 2.2 should be considered fairly high priority. 
- -To upgrade an extension module to the new API, perform the following -steps: - -\begin{itemize} - -\item Rename \cfunction{Py_TPFLAGS_GC} to \cfunction{Py_TPFLAGS_HAVE_GC}. - -\item Use \cfunction{PyObject_GC_New} or \cfunction{PyObject_GC_NewVar} to -allocate objects, and \cfunction{PyObject_GC_Del} to deallocate them. - -\item Rename \cfunction{PyObject_GC_Init} to \cfunction{PyObject_GC_Track} and -\cfunction{PyObject_GC_Fini} to \cfunction{PyObject_GC_UnTrack}. - -\item Remove \cfunction{PyGC_HEAD_SIZE} from object size calculations. - -\item Remove calls to \cfunction{PyObject_AS_GC} and \cfunction{PyObject_FROM_GC}. - -\end{itemize} - - \item A new \samp{et} format sequence was added to - \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and - an encoding name, and converts the parameter to the given encoding - if the parameter turns out to be a Unicode string, or leaves it - alone if it's an 8-bit string, assuming it to already be in the - desired encoding. This differs from the \samp{es} format character, - which assumes that 8-bit strings are in Python's default ASCII - encoding and converts them to the specified new encoding. - (Contributed by M.-A. Lemburg, and used for the MBCS support on - Windows described in the following section.) - - \item A different argument parsing function, - \cfunction{PyArg_UnpackTuple()}, has been added that's simpler and - presumably faster. Instead of specifying a format string, the - caller simply gives the minimum and maximum number of arguments - expected, and a set of pointers to \ctype{PyObject*} variables that - will be filled in with argument values. - - \item Two new flags \constant{METH_NOARGS} and \constant{METH_O} are - available in method definition tables to simplify implementation of - methods with no arguments or a single untyped argument. Calling - such methods is more efficient than calling a corresponding method - that uses \constant{METH_VARARGS}.
- Also, the old \constant{METH_OLDARGS} style of writing C methods is - now officially deprecated. - -\item - Two new wrapper functions, \cfunction{PyOS_snprintf()} and - \cfunction{PyOS_vsnprintf()} were added to provide - cross-platform implementations for the relatively new - \cfunction{snprintf()} and \cfunction{vsnprintf()} C lib APIs. In - contrast to the standard \cfunction{sprintf()} and - \cfunction{vsprintf()} functions, the Python versions check the - bounds of the buffer used to protect against buffer overruns. - (Contributed by M.-A. Lemburg.) - - \item The \cfunction{_PyTuple_Resize()} function has lost an unused - parameter, so now it takes 2 parameters instead of 3. The third - argument was never used, and can simply be discarded when porting - code from earlier versions to Python 2.2. - -\end{itemize} - - -%====================================================================== -\section{Other Changes and Fixes} - -As usual there were a bunch of other improvements and bugfixes -scattered throughout the source tree. A search through the CVS change -logs finds there were 527 patches applied and 683 bugs fixed between -Python 2.1 and 2.2; 2.2.1 applied 139 patches and fixed 143 bugs; -2.2.2 applied 106 patches and fixed 82 bugs. These figures are likely -to be underestimates. - -Some of the more notable changes are: - -\begin{itemize} - - \item The code for the MacOS port for Python, maintained by Jack - Jansen, is now kept in the main Python CVS tree, and many changes - have been made to support MacOS~X. - -The most significant change is the ability to build Python as a -framework, enabled by supplying the \longprogramopt{enable-framework} -option to the configure script when compiling Python. According to -Jack Jansen, ``This installs a self-contained Python installation plus -the OS~X framework "glue" into -\file{/Library/Frameworks/Python.framework} (or another location of -choice). 
For now there is little immediate added benefit to this -(actually, there is the disadvantage that you have to change your PATH -to be able to find Python), but it is the basis for creating a -full-blown Python application, porting the MacPython IDE, possibly -using Python as a standard OSA scripting language and much more.'' - -Most of the MacPython toolbox modules, which interface to MacOS APIs -such as windowing, QuickTime, scripting, etc. have been ported to OS~X, -but they've been left commented out in \file{setup.py}. People who want -to experiment with these modules can uncomment them manually. - -% Jack's original comments: -%The main change is the possibility to build Python as a -%framework. This installs a self-contained Python installation plus the -%OSX framework "glue" into /Library/Frameworks/Python.framework (or -%another location of choice). For now there is little immedeate added -%benefit to this (actually, there is the disadvantage that you have to -%change your PATH to be able to find Python), but it is the basis for -%creating a fullblown Python application, porting the MacPython IDE, -%possibly using Python as a standard OSA scripting language and much -%more. You enable this with "configure --enable-framework". - -%The other change is that most MacPython toolbox modules, which -%interface to all the MacOS APIs such as windowing, quicktime, -%scripting, etc. have been ported. Again, most of these are not of -%immedeate use, as they need a full application to be really useful, so -%they have been commented out in setup.py. People wanting to experiment -%can uncomment them. Gestalt and Internet Config modules are enabled by -%default. - - \item Keyword arguments passed to builtin functions that don't take them - now cause a \exception{TypeError} exception to be raised, with the - message "\var{function} takes no keyword arguments". 
- - \item Weak references, added in Python 2.1 as an extension module, - are now part of the core because they're used in the implementation - of new-style classes. The \exception{ReferenceError} exception has - therefore moved from the \module{weakref} module to become a - built-in exception. - - \item A new script, \file{Tools/scripts/cleanfuture.py} by Tim - Peters, automatically removes obsolete \code{__future__} statements - from Python source code. - - \item An additional \var{flags} argument has been added to the - built-in function \function{compile()}, so the behaviour of - \code{__future__} statements can now be correctly observed in - simulated shells, such as those presented by IDLE and other - development environments. This is described in \pep{264}. - (Contributed by Michael Hudson.) - - \item The new license introduced with Python 1.6 wasn't - GPL-compatible. This is fixed by some minor textual changes to the - 2.2 license, so it's now legal to embed Python inside a GPLed - program again. Note that Python itself is not GPLed, but instead is - under a license that's essentially equivalent to the BSD license, - same as it always was. The license changes were also applied to the - Python 2.0.1 and 2.1.1 releases. - - \item When presented with a Unicode filename on Windows, Python will - now convert it to an MBCS encoded string, as used by the Microsoft - file APIs. As MBCS is explicitly used by the file APIs, Python's - choice of ASCII as the default encoding turns out to be an - annoyance. On \UNIX, the locale's character set is used if - \function{locale.nl_langinfo(CODESET)} is available. (Windows - support was contributed by Mark Hammond with assistance from - Marc-Andr\'e Lemburg. \UNIX{} support was added by Martin von L\"owis.) - - \item Large file support is now enabled on Windows. (Contributed by - Tim Peters.) - - \item The \file{Tools/scripts/ftpmirror.py} script - now parses a \file{.netrc} file, if you have one. 
- (Contributed by Mike Romberg.) - - \item Some features of the object returned by the - \function{xrange()} function are now deprecated, and trigger - warnings when they're accessed; they'll disappear in Python 2.3. - \class{xrange} objects tried to pretend they were full sequence - types by supporting slicing, sequence multiplication, and the - \keyword{in} operator, but these features were rarely used and - therefore buggy. The \method{tolist()} method and the - \member{start}, \member{stop}, and \member{step} attributes are also - being deprecated. At the C level, the fourth argument to the - \cfunction{PyRange_New()} function, \samp{repeat}, has also been - deprecated. - - \item There were a bunch of patches to the dictionary - implementation, mostly to fix potential core dumps if a dictionary - contains objects that sneakily changed their hash value, or mutated - the dictionary they were contained in. For a while python-dev fell - into a gentle rhythm of Michael Hudson finding a case that dumped - core, Tim Peters fixing the bug, Michael finding another case, and round - and round it went. - - \item On Windows, Python can now be compiled with Borland C thanks - to a number of patches contributed by Stephen Hansen, though the - result isn't fully functional yet. (But this \emph{is} progress...) - - \item Another Windows enhancement: Wise Solutions generously offered - PythonLabs use of their InstallerMaster 8.1 system. Earlier - PythonLabs Windows installers used Wise 5.0a, which was beginning to - show its age. (Packaged up by Tim Peters.) - - \item Files ending in \samp{.pyw} can now be imported on Windows. - \samp{.pyw} is a Windows-only thing, used to indicate that a script - needs to be run using PYTHONW.EXE instead of PYTHON.EXE in order to - prevent a DOS console from popping up to display the output. This - patch makes it possible to import such scripts, in case they're also - usable as modules. (Implemented by David Bolen.) 
- - \item On platforms where Python uses the C \cfunction{dlopen()} function - to load extension modules, it's now possible to set the flags used - by \cfunction{dlopen()} using the \function{sys.getdlopenflags()} and - \function{sys.setdlopenflags()} functions. (Contributed by Bram Stolk.) - - \item The \function{pow()} built-in function no longer supports 3 - arguments when floating-point numbers are supplied. - \code{pow(\var{x}, \var{y}, \var{z})} returns \code{(x**y) \% z}, but - this is never useful for floating point numbers, and the final - result varies unpredictably depending on the platform. A call such - as \code{pow(2.0, 8.0, 7.0)} will now raise a \exception{TypeError} - exception. - -\end{itemize} - - -%====================================================================== -\section{Acknowledgements} - -The author would like to thank the following people for offering -suggestions, corrections and assistance with various drafts of this -article: Fred Bremmer, Keith Briggs, Andrew Dalke, Fred~L. Drake, Jr., -Carel Fellinger, David Goodger, Mark Hammond, Stephen Hansen, Michael -Hudson, Jack Jansen, Marc-Andr\'e Lemburg, Martin von L\"owis, Fredrik -Lundh, Michael McLay, Nick Mathewson, Paul Moore, Gustavo Niemeyer, -Don O'Donnell, Joonas Paalasma, Tim Peters, Jens Quade, Tom Reinhardt, Neil -Schemenauer, Guido van Rossum, Greg Ward, Edward Welbourne. - -\end{document} diff --git a/Doc/whatsnew/whatsnew23.tex b/Doc/whatsnew/whatsnew23.tex deleted file mode 100644 index 7c92be2..0000000 --- a/Doc/whatsnew/whatsnew23.tex +++ /dev/null @@ -1,2380 +0,0 @@ -\documentclass{howto} -\usepackage{distutils} -% $Id$ - -\title{What's New in Python 2.3} -\release{1.01} -\author{A.M.\ Kuchling} -\authoraddress{ - \strong{Python Software Foundation}\\ - Email: \email{amk@amk.ca} -} - -\begin{document} -\maketitle -\tableofcontents - -This article explains the new features in Python 2.3. Python 2.3 was -released on July 29, 2003. 
- -The main themes for Python 2.3 are polishing some of the features -added in 2.2, adding various small but useful enhancements to the core -language, and expanding the standard library. The new object model -introduced in the previous version has benefited from 18 months of -bugfixes and from optimization efforts that have improved the -performance of new-style classes. A few new built-in functions have -been added such as \function{sum()} and \function{enumerate()}. The -\keyword{in} operator can now be used for substring searches (e.g. -\code{"ab" in "abc"} returns \constant{True}). - -Some of the many new library features include Boolean, set, heap, and -date/time data types, the ability to import modules from ZIP-format -archives, metadata support for the long-awaited Python catalog, an -updated version of IDLE, and modules for logging messages, wrapping -text, parsing CSV files, processing command-line options, using BerkeleyDB -databases... the list of new and enhanced modules is lengthy. - -This article doesn't attempt to provide a complete specification of -the new features, but instead provides a convenient overview. For -full details, you should refer to the documentation for Python 2.3, -such as the \citetitle[../lib/lib.html]{Python Library Reference} and -the \citetitle[../ref/ref.html]{Python Reference Manual}. If you want -to understand the complete implementation and design rationale, -refer to the PEP for a particular new feature. - - -%====================================================================== -\section{PEP 218: A Standard Set Datatype} - -The new \module{sets} module contains an implementation of a set -datatype. The \class{Set} class is for mutable sets, sets that can -have members added and removed. The \class{ImmutableSet} class is for -sets that can't be modified, and instances of \class{ImmutableSet} can -therefore be used as dictionary keys. Sets are built on top of -dictionaries, so the elements within a set must be hashable. 
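The hashability rules in the paragraph above can be sketched briefly. This sketch assumes the built-in \class{set} and \class{frozenset} types found in later Python versions as stand-ins for \class{Set} and \class{ImmutableSet}; it does not use the \module{sets} module itself, and the variable names are purely illustrative.

```python
# Built-in stand-ins (an assumption of this sketch, not the sets module):
# set plays the role of sets.Set, frozenset the role of sets.ImmutableSet.
mutable = set([1, 2, 3])
frozen = frozenset([1, 2, 3])

# An immutable set is hashable, so it can serve as a dictionary key.
labels = {frozen: 'small integers'}

# A mutable set is not hashable and is rejected as a dictionary key.
try:
    {mutable: 'not allowed'}
    mutable_key_ok = True
except TypeError:
    mutable_key_ok = False

# Elements must be hashable as well, so a list cannot be a member.
try:
    set([[1, 2], [3, 4]])
    list_member_ok = True
except TypeError:
    list_member_ok = False
```

Looking up the key with an equal immutable set, such as \code{labels[frozenset([3, 2, 1])]}, finds the same entry, because equal sets hash alike regardless of the order in which their elements were supplied.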
- -Here's a simple example: - -\begin{verbatim} ->>> import sets ->>> S = sets.Set([1,2,3]) ->>> S -Set([1, 2, 3]) ->>> 1 in S -True ->>> 0 in S -False ->>> S.add(5) ->>> S.remove(3) ->>> S -Set([1, 2, 5]) ->>> -\end{verbatim} - -The union and intersection of sets can be computed with the -\method{union()} and \method{intersection()} methods; an alternative -notation uses the bitwise operators \code{\&} and \code{|}. -Mutable sets also have in-place versions of these methods, -\method{union_update()} and \method{intersection_update()}. - -\begin{verbatim} ->>> S1 = sets.Set([1,2,3]) ->>> S2 = sets.Set([4,5,6]) ->>> S1.union(S2) -Set([1, 2, 3, 4, 5, 6]) ->>> S1 | S2 # Alternative notation -Set([1, 2, 3, 4, 5, 6]) ->>> S1.intersection(S2) -Set([]) ->>> S1 & S2 # Alternative notation -Set([]) ->>> S1.union_update(S2) ->>> S1 -Set([1, 2, 3, 4, 5, 6]) ->>> -\end{verbatim} - -It's also possible to take the symmetric difference of two sets. This -is the set of all elements in the union that aren't in the -intersection. Another way of putting it is that the symmetric -difference contains all elements that are in exactly one -set. Again, there's an alternative notation (\code{\^}), and an -in-place version with the ungainly name -\method{symmetric_difference_update()}. - -\begin{verbatim} ->>> S1 = sets.Set([1,2,3,4]) ->>> S2 = sets.Set([3,4,5,6]) ->>> S1.symmetric_difference(S2) -Set([1, 2, 5, 6]) ->>> S1 ^ S2 -Set([1, 2, 5, 6]) ->>> -\end{verbatim} - -There are also \method{issubset()} and \method{issuperset()} methods -for checking whether one set is a subset or superset of another: - -\begin{verbatim} ->>> S1 = sets.Set([1,2,3]) ->>> S2 = sets.Set([2,3]) ->>> S2.issubset(S1) -True ->>> S1.issubset(S2) -False ->>> S1.issuperset(S2) -True ->>> -\end{verbatim} - - -\begin{seealso} - -\seepep{218}{Adding a Built-In Set Object Type}{PEP written by Greg V. Wilson. -Implemented by Greg V. 
Wilson, Alex Martelli, and GvR.} - -\end{seealso} - - - -%====================================================================== -\section{PEP 255: Simple Generators\label{section-generators}} - -In Python 2.2, generators were added as an optional feature, to be -enabled by a \code{from __future__ import generators} directive. In -2.3 generators no longer need to be specially enabled, and are now -always present; this means that \keyword{yield} is now always a -keyword. The rest of this section is a copy of the description of -generators from the ``What's New in Python 2.2'' document; if you read -it back when Python 2.2 came out, you can skip the rest of this section. - -You're doubtless familiar with how function calls work in Python or C. -When you call a function, it gets a private namespace where its local -variables are created. When the function reaches a \keyword{return} -statement, the local variables are destroyed and the resulting value -is returned to the caller. A later call to the same function will get -a fresh new set of local variables. But, what if the local variables -weren't thrown away on exiting a function? What if you could later -resume the function where it left off? This is what generators -provide; they can be thought of as resumable functions. - -Here's the simplest example of a generator function: - -\begin{verbatim} -def generate_ints(N): - for i in range(N): - yield i -\end{verbatim} - -A new keyword, \keyword{yield}, was introduced for generators. Any -function containing a \keyword{yield} statement is a generator -function; this is detected by Python's bytecode compiler which -compiles the function specially as a result. - -When you call a generator function, it doesn't return a single value; -instead it returns a generator object that supports the iterator -protocol. On executing the \keyword{yield} statement, the generator -outputs the value of \code{i}, similar to a \keyword{return} -statement. 
The big difference between \keyword{yield} and a -\keyword{return} statement is that on reaching a \keyword{yield} the -generator's state of execution is suspended and local variables are -preserved. On the next call to the generator's \code{.next()} method, -the function will resume executing immediately after the -\keyword{yield} statement. (For complicated reasons, the -\keyword{yield} statement isn't allowed inside the \keyword{try} block -of a \keyword{try}...\keyword{finally} statement; read \pep{255} for a full -explanation of the interaction between \keyword{yield} and -exceptions.) - -Here's a sample usage of the \function{generate_ints()} generator: - -\begin{verbatim} ->>> gen = generate_ints(3) ->>> gen -<generator object at 0x8117f90> ->>> gen.next() -0 ->>> gen.next() -1 ->>> gen.next() -2 ->>> gen.next() -Traceback (most recent call last): - File "stdin", line 1, in ? - File "stdin", line 2, in generate_ints -StopIteration -\end{verbatim} - -You could equally write \code{for i in generate_ints(5)}, or -\code{a,b,c = generate_ints(3)}. - -Inside a generator function, the \keyword{return} statement can only -be used without a value, and signals the end of the procession of -values; afterwards the generator cannot return any further values. -\keyword{return} with a value, such as \code{return 5}, is a syntax -error inside a generator function. The end of the generator's results -can also be indicated by raising \exception{StopIteration} manually, -or by just letting the flow of execution fall off the bottom of the -function. - -You could achieve the effect of generators manually by writing your -own class and storing all the local variables of the generator as -instance variables. For example, returning a list of integers could -be done by setting \code{self.count} to 0, and having the -\method{next()} method increment \code{self.count} and return it. -However, for a moderately complicated generator, writing a -corresponding class would be much messier. 
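As a rough illustration of the hand-written alternative just described, here is a minimal sketch of a class that yields the same values as \function{generate_ints()}. The class name \class{CountingIterator} is hypothetical, and the sketch assumes a later Python in which the iterator method is spelled \method{__next__()}; the \code{next} alias keeps the older spelling working too.

```python
class CountingIterator:
    """Hand-written counterpart of generate_ints(n) (hypothetical name)."""

    def __init__(self, n):
        self.n = n
        self.count = 0        # the generator's local state, kept by hand

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.n:
            raise StopIteration   # same signal a finished generator gives
        value = self.count
        self.count += 1
        return value

    next = __next__               # older spelling of the same method


def generate_ints(n):
    # The generator version: the loop variable is preserved automatically.
    for i in range(n):
        yield i


from_class = list(CountingIterator(3))
from_generator = list(generate_ints(3))
```

Both lists hold the same values, but the class has to track \code{self.count} explicitly, while the generator simply keeps \code{i} alive between calls.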
-\file{Lib/test/test_generators.py} contains a number of more -interesting examples. The simplest one implements an in-order -traversal of a tree using generators recursively. - -\begin{verbatim} -# A recursive generator that generates Tree leaves in in-order. -def inorder(t): - if t: - for x in inorder(t.left): - yield x - yield t.label - for x in inorder(t.right): - yield x -\end{verbatim} - -Two other examples in \file{Lib/test/test_generators.py} produce -solutions for the N-Queens problem (placing $N$ queens on an $NxN$ -chess board so that no queen threatens another) and the Knight's Tour -(a route that takes a knight to every square of an $NxN$ chessboard -without visiting any square twice). - -The idea of generators comes from other programming languages, -especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the -idea of generators is central. In Icon, every -expression and function call behaves like a generator. One example -from ``An Overview of the Icon Programming Language'' at -\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of -what this looks like: - -\begin{verbatim} -sentence := "Store it in the neighboring harbor" -if (i := find("or", sentence)) > 5 then write(i) -\end{verbatim} - -In Icon the \function{find()} function returns the indexes at which the -substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement, -\code{i} is first assigned a value of 3, but 3 is less than 5, so the -comparison fails, and Icon retries it with the second value of 23. 23 -is greater than 5, so the comparison now succeeds, and the code prints -the value 23 to the screen. - -Python doesn't go nearly as far as Icon in adopting generators as a -central concept. Generators are considered part of the core -Python language, but learning or using them isn't compulsory; if they -don't solve any problems that you have, feel free to ignore them. 
-One novel feature of Python's interface as compared to -Icon's is that a generator's state is represented as a concrete object -(the iterator) that can be passed around to other functions or stored -in a data structure. - -\begin{seealso} - -\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim -Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer -and Tim Peters, with other fixes from the Python Labs crew.} - -\end{seealso} - - -%====================================================================== -\section{PEP 263: Source Code Encodings \label{section-encodings}} - -Python source files can now be declared as being in different -character set encodings. Encodings are declared by including a -specially formatted comment in the first or second line of the source -file. For example, a UTF-8 file can be declared with: - -\begin{verbatim} -#!/usr/bin/env python -# -*- coding: UTF-8 -*- -\end{verbatim} - -Without such an encoding declaration, the default encoding used is -7-bit ASCII. Executing or importing modules that contain string -literals with 8-bit characters and have no encoding declaration will result -in a \exception{DeprecationWarning} being signalled by Python 2.3; in -2.4 this will be a syntax error. - -The encoding declaration only affects Unicode string literals, which -will be converted to Unicode using the specified encoding. Note that -Python identifiers are still restricted to ASCII characters, so you -can't have variable names that use characters outside of the usual -alphanumerics. - -\begin{seealso} - -\seepep{263}{Defining Python Source Code Encodings}{Written by -Marc-Andr\'e Lemburg and Martin von~L\"owis; implemented by Suzuki -Hisao and Martin von~L\"owis.} - -\end{seealso} - - -%====================================================================== -\section{PEP 273: Importing Modules from ZIP Archives} - -The new \module{zipimport} module adds support for importing -modules from a ZIP-format archive. 
You don't need to import the -module explicitly; it will be automatically imported if a ZIP -archive's filename is added to \code{sys.path}. For example: - -\begin{verbatim} -amk@nyman:~/src/python$ unzip -l /tmp/example.zip -Archive: /tmp/example.zip - Length Date Time Name - -------- ---- ---- ---- - 8467 11-26-02 22:30 jwzthreading.py - -------- ------- - 8467 1 file -amk@nyman:~/src/python$ ./python -Python 2.3 (#1, Aug 1 2003, 19:54:32) ->>> import sys ->>> sys.path.insert(0, '/tmp/example.zip') # Add .zip file to front of path ->>> import jwzthreading ->>> jwzthreading.__file__ -'/tmp/example.zip/jwzthreading.py' ->>> -\end{verbatim} - -An entry in \code{sys.path} can now be the filename of a ZIP archive. -The ZIP archive can contain any kind of files, but only files named -\file{*.py}, \file{*.pyc}, or \file{*.pyo} can be imported. If an -archive only contains \file{*.py} files, Python will not attempt to -modify the archive by adding the corresponding \file{*.pyc} file, meaning -that if a ZIP archive doesn't contain \file{*.pyc} files, importing may be -rather slow. - -A path within the archive can also be specified to only import from a -subdirectory; for example, the path \file{/tmp/example.zip/lib/} -would only import from the \file{lib/} subdirectory within the -archive. - -\begin{seealso} - -\seepep{273}{Import Modules from Zip Archives}{Written by James C. Ahlstrom, -who also provided an implementation. -Python 2.3 follows the specification in \pep{273}, -but uses an implementation written by Just van~Rossum -that uses the import hooks described in \pep{302}. -See section~\ref{section-pep302} for a description of the new import hooks. -} - -\end{seealso} - -%====================================================================== -\section{PEP 277: Unicode file name support for Windows NT} - -On Windows NT, 2000, and XP, the system stores file names as Unicode -strings. 
Traditionally, Python has represented file names as byte -strings, which is inadequate because it renders some file names -inaccessible. - -Python now allows using arbitrary Unicode strings (within the -limitations of the file system) for all functions that expect file -names, most notably the \function{open()} built-in function. If a Unicode -string is passed to \function{os.listdir()}, Python now returns a list -of Unicode strings. A new function, \function{os.getcwdu()}, returns -the current directory as a Unicode string. - -Byte strings still work as file names, and on Windows Python will -transparently convert them to Unicode using the \code{mbcs} encoding. - -Other systems also allow Unicode strings as file names but convert -them to byte strings before passing them to the system, which can -cause a \exception{UnicodeError} to be raised. Applications can test -whether arbitrary Unicode strings are supported as file names by -checking \member{os.path.supports_unicode_filenames}, a Boolean value. - -Under MacOS, \function{os.listdir()} may now return Unicode filenames. - -\begin{seealso} - -\seepep{277}{Unicode file name support for Windows NT}{Written by Neil -Hodgson; implemented by Neil Hodgson, Martin von~L\"owis, and Mark -Hammond.} - -\end{seealso} - - -%====================================================================== -\section{PEP 278: Universal Newline Support} - -The three major operating systems used today are Microsoft Windows, -Apple's Macintosh OS, and the various \UNIX\ derivatives. A minor -irritation of cross-platform work -is that these three platforms all use different characters -to mark the ends of lines in text files. \UNIX\ uses the linefeed -(ASCII character 10), MacOS uses the carriage return (ASCII -character 13), and Windows uses a two-character sequence of a -carriage return plus a newline. - -Python's file objects can now support end of line conventions other -than the one followed by the platform on which Python is running. 
-Opening a file with the mode \code{'U'} or \code{'rU'} will open a file -for reading in universal newline mode. All three line ending -conventions will be translated to a \character{\e n} in the strings -returned by the various file methods such as \method{read()} and -\method{readline()}. - -Universal newline support is also used when importing modules and when -executing a file with the \function{execfile()} function. This means -that Python modules can be shared between all three operating systems -without needing to convert the line-endings. - -This feature can be disabled when compiling Python by specifying -the \longprogramopt{without-universal-newlines} switch when running Python's -\program{configure} script. - -\begin{seealso} - -\seepep{278}{Universal Newline Support}{Written -and implemented by Jack Jansen.} - -\end{seealso} - - -%====================================================================== -\section{PEP 279: enumerate()\label{section-enumerate}} - -A new built-in function, \function{enumerate()}, will make -certain loops a bit clearer. \code{enumerate(thing)}, where -\var{thing} is either an iterator or a sequence, returns an iterator -that will return \code{(0, \var{thing}[0])}, \code{(1, -\var{thing}[1])}, \code{(2, \var{thing}[2])}, and so forth. - -A common idiom to change every element of a list looks like this: - -\begin{verbatim} -for i in range(len(L)): - item = L[i] - # ... compute some result based on item ... - L[i] = result -\end{verbatim} - -This can be rewritten using \function{enumerate()} as: - -\begin{verbatim} -for i, item in enumerate(L): - # ... compute some result based on item ... - L[i] = result -\end{verbatim} - - -\begin{seealso} - -\seepep{279}{The enumerate() built-in function}{Written -and implemented by Raymond D. 
Hettinger.} - -\end{seealso} - - -%====================================================================== -\section{PEP 282: The logging Package} - -A standard package for writing logs, \module{logging}, has been added -to Python 2.3. It provides a powerful and flexible mechanism for -generating logging output which can then be filtered and processed in -various ways. A configuration file written in a standard format can -be used to control the logging behavior of a program. Python -includes handlers that will write log records to -standard error or to a file or socket, send them to the system log, or -even e-mail them to a particular address; of course, it's also -possible to write your own handler classes. - -The \class{Logger} class is the primary class. -Most application code will deal with one or more \class{Logger} -objects, each one used by a particular subsystem of the application. -Each \class{Logger} is identified by a name, and names are organized -into a hierarchy using \samp{.} as the component separator. For -example, you might have \class{Logger} instances named \samp{server}, -\samp{server.auth} and \samp{server.network}. The latter two -instances are below \samp{server} in the hierarchy. This means that -if you turn up the verbosity for \samp{server} or direct \samp{server} -messages to a different handler, the changes will also apply to -records logged to \samp{server.auth} and \samp{server.network}. -There's also a root \class{Logger} that's the parent of all other -loggers. 
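The effect of the logger hierarchy can be sketched as follows. The \class{ListHandler} class is a hypothetical helper written for this illustration, not part of the \module{logging} package, and the logged message is invented; the sketch attaches a handler and a level only to \samp{server} and then logs through \samp{server.auth}.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that simply collects the records it receives."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.records = []

    def emit(self, record):
        self.records.append(record)


server = logging.getLogger('server')
auth = logging.getLogger('server.auth')

# Configure only the parent: raise its verbosity and give it a handler.
collector = ListHandler()
server.setLevel(logging.DEBUG)
server.addHandler(collector)

# The child has no level or handler of its own, so it inherits both.
auth.debug('login attempt from %s', 'alice')

inherited_level = auth.getEffectiveLevel()
first = collector.records[0]
```

Here \code{first.name} is \code{'server.auth'} even though the handler belongs to \samp{server}: the record was created by the child logger and then passed up the hierarchy.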
- -For simple uses, the \module{logging} package contains some -convenience functions that always use the root log: - -\begin{verbatim} -import logging - -logging.debug('Debugging information') -logging.info('Informational message') -logging.warning('Warning:config file %s not found', 'server.conf') -logging.error('Error occurred') -logging.critical('Critical error -- shutting down') -\end{verbatim} - -This produces the following output: - -\begin{verbatim} -WARNING:root:Warning:config file server.conf not found -ERROR:root:Error occurred -CRITICAL:root:Critical error -- shutting down -\end{verbatim} - -In the default configuration, informational and debugging messages are -suppressed and the output is sent to standard error. You can enable -the display of informational and debugging messages by calling the -\method{setLevel()} method on the root logger. - -Notice the \function{warning()} call's use of string formatting -operators; all of the functions for logging messages take the -arguments \code{(\var{msg}, \var{arg1}, \var{arg2}, ...)} and log the -string resulting from \code{\var{msg} \% (\var{arg1}, \var{arg2}, -...)}. - -There's also an \function{exception()} function that records the most -recent traceback. Any of the other functions will also record the -traceback if you specify a true value for the keyword argument -\var{exc_info}. - -\begin{verbatim} -def f(): - try: 1/0 - except: logging.exception('Problem recorded') - -f() -\end{verbatim} - -This produces the following output: - -\begin{verbatim} -ERROR:root:Problem recorded -Traceback (most recent call last): - File "t.py", line 6, in f - 1/0 -ZeroDivisionError: integer division or modulo by zero -\end{verbatim} - -Slightly more advanced programs will use a logger other than the root -logger. The \function{getLogger(\var{name})} function is used to get -a particular log, creating it if it doesn't exist yet. -\function{getLogger(None)} returns the root logger. 
- - -\begin{verbatim} -log = logging.getLogger('server') - ... -log.info('Listening on port %i', port) - ... -log.critical('Disk full') - ... -\end{verbatim} - -Log records are usually propagated up the hierarchy, so a message -logged to \samp{server.auth} is also seen by \samp{server} and -\samp{root}, but a \class{Logger} can prevent this by setting its -\member{propagate} attribute to \constant{False}. - -There are more classes provided by the \module{logging} package that -can be customized. When a \class{Logger} instance is told to log a -message, it creates a \class{LogRecord} instance that is sent to any -number of different \class{Handler} instances. Loggers and handlers -can also have an attached list of filters, and each filter can cause -the \class{LogRecord} to be ignored or can modify the record before -passing it along. When they're finally output, \class{LogRecord} -instances are converted to text by a \class{Formatter} class. All of -these classes can be replaced by your own specially-written classes. - -With all of these features the \module{logging} package should provide -enough flexibility for even the most complicated applications. This -is only an incomplete overview of its features, so please see the -\ulink{package's reference documentation}{../lib/module-logging.html} -for all of the details. Reading \pep{282} will also be helpful. - - -\begin{seealso} - -\seepep{282}{A Logging System}{Written by Vinay Sajip and Trent Mick; -implemented by Vinay Sajip.} - -\end{seealso} - - -%====================================================================== -\section{PEP 285: A Boolean Type\label{section-bool}} - -A Boolean type was added to Python 2.3. Two new constants were added -to the \module{__builtin__} module, \constant{True} and -\constant{False}. (\constant{True} and -\constant{False} constants were added to the built-ins -in Python 2.2.1, but the 2.2.1 versions are simply set to integer values of -1 and 0 and aren't a different type.) 
- -The type object for this new type is named -\class{bool}; the constructor for it takes any Python value and -converts it to \constant{True} or \constant{False}. - -\begin{verbatim} ->>> bool(1) -True ->>> bool(0) -False ->>> bool([]) -False ->>> bool( (1,) ) -True -\end{verbatim} - -Most of the standard library modules and built-in functions have been -changed to return Booleans. - -\begin{verbatim} ->>> obj = [] ->>> hasattr(obj, 'append') -True ->>> isinstance(obj, list) -True ->>> isinstance(obj, tuple) -False -\end{verbatim} - -Python's Booleans were added with the primary goal of making code -clearer. For example, if you're reading a function and encounter the -statement \code{return 1}, you might wonder whether the \code{1} -represents a Boolean truth value, an index, or a -coefficient that multiplies some other quantity. If the statement is -\code{return True}, however, the meaning of the return value is quite -clear. - -Python's Booleans were \emph{not} added for the sake of strict -type-checking. A very strict language such as Pascal would also -prevent you performing arithmetic with Booleans, and would require -that the expression in an \keyword{if} statement always evaluate to a -Boolean result. Python is not this strict and never will be, as -\pep{285} explicitly says. This means you can still use any -expression in an \keyword{if} statement, even ones that evaluate to a -list or tuple or some random object. The Boolean type is a -subclass of the \class{int} class so that arithmetic using a Boolean -still works. - -\begin{verbatim} ->>> True + 1 -2 ->>> False + 1 -1 ->>> False * 75 -0 ->>> True * 75 -75 -\end{verbatim} - -To sum up \constant{True} and \constant{False} in a sentence: they're -alternative ways to spell the integer values 1 and 0, with the single -difference that \function{str()} and \function{repr()} return the -strings \code{'True'} and \code{'False'} instead of \code{'1'} and -\code{'0'}. 
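A short sketch of what this one-sentence summary implies in practice; the variable names are illustrative only. Because \class{bool} is a subclass of \class{int}, Booleans can be counted, summed, and used as indexes, and only their string form differs from 1 and 0.

```python
# bool is a subclass of int, so every Boolean is also an integer.
is_int_subclass = issubclass(bool, int)

# Summing a list of Booleans counts the true entries.
answers = [True, False, True, True]
yes_count = sum(answers)

# True and False behave as the indexes 1 and 0.
label = ['no', 'yes'][True]

# The single visible difference: the string forms are names, not digits.
text = (str(True), repr(False))
```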
- -\begin{seealso} - -\seepep{285}{Adding a bool type}{Written and implemented by GvR.} - -\end{seealso} - - -%====================================================================== -\section{PEP 293: Codec Error Handling Callbacks} - -When encoding a Unicode string into a byte string, unencodable -characters may be encountered. So far, Python has allowed specifying -the error processing as either ``strict'' (raising -\exception{UnicodeError}), ``ignore'' (skipping the character), or -``replace'' (using a question mark in the output string), with -``strict'' being the default behavior. It may be desirable to specify -alternative processing of such errors, such as inserting an XML -character reference or HTML entity reference into the converted -string. - -Python now has a flexible framework to add different processing -strategies. New error handlers can be added with -\function{codecs.register_error}, and codecs then can access the error -handler with \function{codecs.lookup_error}. An equivalent C API has -been added for codecs written in C. The error handler gets the -necessary state information such as the string being converted, the -position in the string where the error was detected, and the target -encoding. The handler can then either raise an exception or return a -replacement string. - -Two additional error handlers have been implemented using this -framework: ``backslashreplace'' uses Python backslash quoting to -represent unencodable characters and ``xmlcharrefreplace'' emits -XML character references. - -\begin{seealso} - -\seepep{293}{Codec Error Handling Callbacks}{Written and implemented by -Walter D\"orwald.} - -\end{seealso} - - -%====================================================================== -\section{PEP 301: Package Index and Metadata for -Distutils\label{section-pep301}} - -Support for the long-requested Python catalog makes its first -appearance in 2.3. - -The heart of the catalog is the new Distutils \command{register} command. 
-Running \code{python setup.py register} will collect the metadata -describing a package, such as its name, version, maintainer, -description, and so on, and send it to a central catalog server. The -resulting catalog is available from \url{http://www.python.org/pypi}. - -To make the catalog a bit more useful, a new optional -\var{classifiers} keyword argument has been added to the Distutils -\function{setup()} function. A list of -\ulink{Trove}{http://catb.org/\textasciitilde esr/trove/}-style -strings can be supplied to help classify the software. - -Here's an example \file{setup.py} with classifiers, written to be compatible -with older versions of the Distutils: - -\begin{verbatim} -from distutils import core -kw = {'name': "Quixote", - 'version': "0.5.1", - 'description': "A highly Pythonic Web application framework", - # ... - } - -if (hasattr(core, 'setup_keywords') and - 'classifiers' in core.setup_keywords): - kw['classifiers'] = \ - ['Topic :: Internet :: WWW/HTTP :: Dynamic Content', - 'Environment :: No Input/Output (Daemon)', - 'Intended Audience :: Developers'] - -core.setup(**kw) -\end{verbatim} - -The full list of classifiers can be obtained by running -\verb|python setup.py register --list-classifiers|. - -\begin{seealso} - -\seepep{301}{Package Index and Metadata for Distutils}{Written and -implemented by Richard Jones.} - -\end{seealso} - - -%====================================================================== -\section{PEP 302: New Import Hooks \label{section-pep302}} - -While it's been possible to write custom import hooks ever since the -\module{ihooks} module was introduced in Python 1.3, no one has ever -been really happy with it because writing new import hooks is -difficult and messy. There have been various proposed alternatives -such as the \module{imputil} and \module{iu} modules, but none of them -has ever gained much acceptance, and none of them was easily usable -from \C{} code. 
- -\pep{302} borrows ideas from its predecessors, especially from -Gordon McMillan's \module{iu} module. Three new items -are added to the \module{sys} module: - -\begin{itemize} - \item \code{sys.path_hooks} is a list of callable objects; most - often they'll be classes. Each callable takes a string containing a - path and either returns an importer object that will handle imports - from this path or raises an \exception{ImportError} exception if it - can't handle this path. - - \item \code{sys.path_importer_cache} caches importer objects for - each path, so \code{sys.path_hooks} will only need to be traversed - once for each path. - - \item \code{sys.meta_path} is a list of importer objects that will - be traversed before \code{sys.path} is checked. This list is - initially empty, but user code can add objects to it. Additional - built-in and frozen modules can be imported by an object added to - this list. - -\end{itemize} - -Importer objects must have a single method, -\method{find_module(\var{fullname}, \var{path}=None)}. \var{fullname} -will be a module or package name, e.g. \samp{string} or -\samp{distutils.core}. \method{find_module()} must return a loader object -that has a single method, \method{load_module(\var{fullname})}, that -creates and returns the corresponding module object. - -Pseudo-code for Python's new import logic, therefore, looks something -like this (simplified a bit; see \pep{302} for the full details): - -\begin{verbatim} -for mp in sys.meta_path: - loader = mp.find_module(fullname) - if loader is not None: - <module> = loader.load_module(fullname) - -for path in sys.path: - for hook in sys.path_hooks: - try: - importer = hook(path) - except ImportError: - # ImportError, so try the other path hooks - pass - else: - loader = importer.find_module(fullname) - <module> = loader.load_module(fullname) - -# Not found! -raise ImportError -\end{verbatim} - -\begin{seealso} - -\seepep{302}{New Import Hooks}{Written by Just van~Rossum and Paul Moore. 
-Implemented by Just van~Rossum. -} - -\end{seealso} - - -%====================================================================== -\section{PEP 305: Comma-separated Files \label{section-pep305}} - -Comma-separated files are a format frequently used for exporting data -from databases and spreadsheets. Python 2.3 adds a parser for -comma-separated files. - -Comma-separated format is deceptively simple at first glance: - -\begin{verbatim} -Costs,150,200,3.95 -\end{verbatim} - -Read a line and call \code{line.split(',')}: what could be simpler? -But toss in string data that can contain commas, and things get more -complicated: - -\begin{verbatim} -"Costs",150,200,3.95,"Includes taxes, shipping, and sundry items" -\end{verbatim} - -A big ugly regular expression can parse this, but using the new -\module{csv} package is much simpler: - -\begin{verbatim} -import csv - -input = open('datafile', 'rb') -reader = csv.reader(input) -for line in reader: - print line -\end{verbatim} - -The \function{reader} function takes a number of different options. -The field separator isn't limited to the comma and can be changed to -any character, and so can the quoting and line-ending characters. - -Different dialects of comma-separated files can be defined and -registered; currently there are two dialects, both used by Microsoft Excel. -A separate \class{csv.writer} class will generate comma-separated files -from a succession of tuples or lists, quoting strings that contain the -delimiter. - -\begin{seealso} - -\seepep{305}{CSV File API}{Written and implemented -by Kevin Altis, Dave Cole, Andrew McNamara, Skip Montanaro, Cliff Wells. -} - -\end{seealso} - -%====================================================================== -\section{PEP 307: Pickle Enhancements \label{section-pep307}} - -The \module{pickle} and \module{cPickle} modules received some -attention during the 2.3 development cycle. 
In 2.2, new-style classes -could be pickled without difficulty, but they weren't pickled very -compactly; \pep{307} quotes a trivial example where a new-style class -results in a pickled string three times longer than that for a classic -class. - -The solution was to invent a new pickle protocol. The -\function{pickle.dumps()} function has supported a text-or-binary flag -for a long time. In 2.3, this flag is redefined from a Boolean to an -integer: 0 is the old text-mode pickle format, 1 is the old binary -format, and now 2 is a new 2.3-specific format. A new constant, -\constant{pickle.HIGHEST_PROTOCOL}, can be used to select the fanciest -protocol available. - -Unpickling is no longer considered a safe operation. 2.2's -\module{pickle} provided hooks for trying to prevent unsafe classes -from being unpickled (specifically, a -\member{__safe_for_unpickling__} attribute), but none of this code -was ever audited and therefore it's all been ripped out in 2.3. You -should not unpickle untrusted data in any version of Python. - -To reduce the pickling overhead for new-style classes, a new interface -for customizing pickling was added using three special methods: -\method{__getstate__}, \method{__setstate__}, and -\method{__getnewargs__}. Consult \pep{307} for the full semantics -of these methods. - -As a way to compress pickles yet further, it's now possible to use -integer codes instead of long strings to identify pickled classes. -The Python Software Foundation will maintain a list of standardized -codes; there's also a range of codes for private use. Currently no -codes have been specified. 
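The integer protocol argument described above can be tried out with a few lines of code. This is only a minimal sketch: the \code{record} dictionary is made-up sample data, and the snippet is written so that it should also run on later Python versions, where \constant{pickle.HIGHEST_PROTOCOL} may be larger than 2.

```python
import pickle

# Made-up sample data, used only for this illustration.
record = {'name': 'Quixote', 'version': (0, 5, 1), 'tags': ['web', 'framework']}

# Protocol 0 is the old text format; protocol 2 is the format added in 2.3.
text_pickle = pickle.dumps(record, 0)
proto2_pickle = pickle.dumps(record, 2)

# HIGHEST_PROTOCOL selects the newest protocol the running interpreter knows.
newest_pickle = pickle.dumps(record, pickle.HIGHEST_PROTOCOL)

# loads() detects the protocol by itself, so no argument is needed to read
# any of these back.  Only do this with data you trust.
restored = pickle.loads(proto2_pickle)
```

Whichever protocol is chosen, \function{pickle.loads()} should return an object equal to the original; the protocol only affects how compact the pickled string is.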
- -\begin{seealso} - -\seepep{307}{Extensions to the pickle protocol}{Written and implemented -by Guido van Rossum and Tim Peters.} - -\end{seealso} - -%====================================================================== -\section{Extended Slices\label{section-slices}} - -Ever since Python 1.4, the slicing syntax has supported an optional -third ``step'' or ``stride'' argument. For example, these are all -legal Python syntax: \code{L[1:10:2]}, \code{L[:-1:1]}, -\code{L[::-1]}. This was added to Python at the request of -the developers of Numerical Python, which uses the third argument -extensively. However, Python's built-in list, tuple, and string -sequence types have never supported this feature, raising a -\exception{TypeError} if you tried it. Michael Hudson contributed a -patch to fix this shortcoming. - -For example, you can now easily extract the elements of a list that -have even indexes: - -\begin{verbatim} ->>> L = range(10) ->>> L[::2] -[0, 2, 4, 6, 8] -\end{verbatim} - -Negative values also work to make a copy of the same list in reverse -order: - -\begin{verbatim} ->>> L[::-1] -[9, 8, 7, 6, 5, 4, 3, 2, 1, 0] -\end{verbatim} - -This also works for tuples, arrays, and strings: - -\begin{verbatim} ->>> s='abcd' ->>> s[::2] -'ac' ->>> s[::-1] -'dcba' -\end{verbatim} - -If you have a mutable sequence such as a list or an array you can -assign to or delete an extended slice, but there are some differences -between assignment to extended and regular slices. Assignment to a -regular slice can be used to change the length of the sequence: - -\begin{verbatim} ->>> a = range(3) ->>> a -[0, 1, 2] ->>> a[1:3] = [4, 5, 6] ->>> a -[0, 4, 5, 6] -\end{verbatim} - -Extended slices aren't this flexible. 
When assigning to an extended -slice, the list on the right-hand side of the statement must contain -the same number of items as the slice it is replacing: - -\begin{verbatim} ->>> a = range(4) ->>> a -[0, 1, 2, 3] ->>> a[::2] -[0, 2] ->>> a[::2] = [0, -1] ->>> a -[0, 1, -1, 3] ->>> a[::2] = [0,1,2] -Traceback (most recent call last): - File "<stdin>", line 1, in ? -ValueError: attempt to assign sequence of size 3 to extended slice of size 2 -\end{verbatim} - -Deletion is more straightforward: - -\begin{verbatim} ->>> a = range(4) ->>> a -[0, 1, 2, 3] ->>> a[::2] -[0, 2] ->>> del a[::2] ->>> a -[1, 3] -\end{verbatim} - -One can also now pass slice objects to the -\method{__getitem__} methods of the built-in sequences: - -\begin{verbatim} ->>> range(10).__getitem__(slice(0, 5, 2)) -[0, 2, 4] -\end{verbatim} - -Or use slice objects directly in subscripts: - -\begin{verbatim} ->>> range(10)[slice(0, 5, 2)] -[0, 2, 4] -\end{verbatim} - -To simplify implementing sequences that support extended slicing, -slice objects now have a method \method{indices(\var{length})} which, -given the length of a sequence, returns a \code{(\var{start}, -\var{stop}, \var{step})} tuple that can be passed directly to -\function{range()}. -\method{indices()} handles omitted and out-of-bounds indices in a -manner consistent with regular slices (and this innocuous phrase hides -a welter of confusing details!). The method is intended to be used -like this: - -\begin{verbatim} -class FakeSeq: - ... - def calc_item(self, i): - ... - def __getitem__(self, item): - if isinstance(item, slice): - indices = item.indices(len(self)) - return FakeSeq([self.calc_item(i) for i in range(*indices)]) - else: - return self.calc_item(item) -\end{verbatim} - -From this example you can also see that the built-in \class{slice} -object is now the type object for the slice type, and is no longer a -function. This is consistent with Python 2.2, where \class{int}, -\class{str}, etc., underwent the same change. 
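To make the \method{indices()} pattern concrete, here is a small self-contained sketch. The \class{Squares} class is hypothetical and exists only for this illustration; unlike \class{FakeSeq} above it returns a plain list for slices, which is a simplification rather than a requirement.

```python
class Squares(object):
    """Hypothetical read-only sequence of the squares 0, 1, 4, ..."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def calc_item(self, i):
        # Support negative indexes the same way lists do.
        if i < 0:
            i += self.n
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() clips omitted, negative and out-of-range values
            # to fit a sequence of this length.
            start, stop, step = item.indices(len(self))
            return [self.calc_item(i) for i in range(start, stop, step)]
        return self.calc_item(item)


sq = Squares(6)
first = sq[2]          # 4
evens = sq[::2]        # [0, 4, 16]
backwards = sq[::-1]   # [25, 16, 9, 4, 1, 0]
tail = sq[-2:]         # [16, 25]
```

Because \method{indices()} does all the clipping, \method{__getitem__} never has to deal with \code{None} or negative slice bounds itself.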
- - -%====================================================================== -\section{Other Language Changes} - -Here are all of the changes that Python 2.3 makes to the core Python -language. - -\begin{itemize} -\item The \keyword{yield} statement is now always a keyword, as -described in section~\ref{section-generators} of this document. - -\item A new built-in function \function{enumerate()} -was added, as described in section~\ref{section-enumerate} of this -document. - -\item Two new constants, \constant{True} and \constant{False} were -added along with the built-in \class{bool} type, as described in -section~\ref{section-bool} of this document. - -\item The \function{int()} type constructor will now return a long -integer instead of raising an \exception{OverflowError} when a string -or floating-point number is too large to fit into an integer. This -can lead to the paradoxical result that -\code{isinstance(int(\var{expression}), int)} is false, but that seems -unlikely to cause problems in practice. - -\item Built-in types now support the extended slicing syntax, -as described in section~\ref{section-slices} of this document. - -\item A new built-in function, \function{sum(\var{iterable}, \var{start}=0)}, -adds up the numeric items in the iterable object and returns their sum. -\function{sum()} only accepts numbers, meaning that you can't use it -to concatenate a bunch of strings. (Contributed by Alex -Martelli.) - -\item \code{list.insert(\var{pos}, \var{value})} used to -insert \var{value} at the front of the list when \var{pos} was -negative. The behaviour has now been changed to be consistent with -slice indexing, so when \var{pos} is -1 the value will be inserted -before the last element, and so forth. - -\item \code{list.index(\var{value})}, which searches for \var{value} -within the list and returns its index, now takes optional -\var{start} and \var{stop} arguments to limit the search to -only part of the list. 
- -\item Dictionaries have a new method, \method{pop(\var{key}\optional{, -\var{default}})}, that returns the value corresponding to \var{key} -and removes that key/value pair from the dictionary. If the requested -key isn't present in the dictionary, \var{default} is returned if it's -specified and \exception{KeyError} raised if it isn't. - -\begin{verbatim} ->>> d = {1:2} ->>> d -{1: 2} ->>> d.pop(4) -Traceback (most recent call last): - File "stdin", line 1, in ? -KeyError: 4 ->>> d.pop(1) -2 ->>> d.pop(1) -Traceback (most recent call last): - File "stdin", line 1, in ? -KeyError: 'pop(): dictionary is empty' ->>> d -{} ->>> -\end{verbatim} - -There's also a new class method, -\method{dict.fromkeys(\var{iterable}, \var{value})}, that -creates a dictionary with keys taken from the supplied iterator -\var{iterable} and all values set to \var{value}, defaulting to -\code{None}. - -(Patches contributed by Raymond Hettinger.) - -Also, the \function{dict()} constructor now accepts keyword arguments to -simplify creating small dictionaries: - -\begin{verbatim} ->>> dict(red=1, blue=2, green=3, black=4) -{'blue': 2, 'black': 4, 'green': 3, 'red': 1} -\end{verbatim} - -(Contributed by Just van~Rossum.) - -\item The \keyword{assert} statement no longer checks the \code{__debug__} -flag, so you can no longer disable assertions by assigning to \code{__debug__}. -Running Python with the \programopt{-O} switch will still generate -code that doesn't execute any assertions. - -\item Most type objects are now callable, so you can use them -to create new objects such as functions, classes, and modules. (This -means that the \module{new} module can be deprecated in a future -Python version, because you can now use the type objects available in -the \module{types} module.) -% XXX should new.py use PendingDeprecationWarning? 
-For example, you can create a new module object with the following code: - -\begin{verbatim} ->>> import types ->>> m = types.ModuleType('abc','docstring') ->>> m -<module 'abc' (built-in)> ->>> m.__doc__ -'docstring' -\end{verbatim} - -\item -A new warning, \exception{PendingDeprecationWarning} was added to -indicate features which are in the process of being -deprecated. The warning will \emph{not} be printed by default. To -check for use of features that will be deprecated in the future, -supply \programopt{-Walways::PendingDeprecationWarning::} on the -command line or use \function{warnings.filterwarnings()}. - -\item The process of deprecating string-based exceptions, as -in \code{raise "Error occurred"}, has begun. Raising a string will -now trigger \exception{PendingDeprecationWarning}. - -\item Using \code{None} as a variable name will now result in a -\exception{SyntaxWarning} warning. In a future version of Python, -\code{None} may finally become a keyword. - -\item The \method{xreadlines()} method of file objects, introduced in -Python 2.1, is no longer necessary because files now behave as their -own iterator. \method{xreadlines()} was originally introduced as a -faster way to loop over all the lines in a file, but now you can -simply write \code{for line in file_obj}. File objects also have a -new read-only \member{encoding} attribute that gives the encoding used -by the file; Unicode strings written to the file will be automatically -converted to bytes using the given encoding. - -\item The method resolution order used by new-style classes has -changed, though you'll only notice the difference if you have a really -complicated inheritance hierarchy. Classic classes are unaffected by -this change. Python 2.2 originally used a topological sort of a -class's ancestors, but 2.3 now uses the C3 algorithm as described in -the paper \ulink{``A Monotonic Superclass Linearization for -Dylan''}{http://www.webcom.com/haahr/dylan/linearization-oopsla96.html}. 
-To understand the motivation for this change, -read Michele Simionato's article -\ulink{``Python 2.3 Method Resolution Order''} - {http://www.python.org/2.3/mro.html}, or -read the thread on python-dev starting with the message at -\url{http://mail.python.org/pipermail/python-dev/2002-October/029035.html}. -Samuele Pedroni first pointed out the problem and also implemented the -fix by coding the C3 algorithm. - -\item Python runs multithreaded programs by switching between threads -after executing N bytecodes. The default value for N has been -increased from 10 to 100 bytecodes, speeding up single-threaded -applications by reducing the switching overhead. Some multithreaded -applications may suffer slower response time, but that's easily fixed -by setting the limit back to a lower number using -\function{sys.setcheckinterval(\var{N})}. -The limit can be retrieved with the new -\function{sys.getcheckinterval()} function. - -\item One minor but far-reaching change is that the names of extension -types defined by the modules included with Python now contain the -module and a \character{.} in front of the type name. For example, in -Python 2.2, if you created a socket and printed its -\member{__class__}, you'd get this output: - -\begin{verbatim} ->>> s = socket.socket() ->>> s.__class__ -<type 'socket'> -\end{verbatim} - -In 2.3, you get this: -\begin{verbatim} ->>> s.__class__ -<type '_socket.socket'> -\end{verbatim} - -\item One of the noted incompatibilities between old- and new-style - classes has been removed: you can now assign to the - \member{__name__} and \member{__bases__} attributes of new-style - classes. There are some restrictions on what can be assigned to - \member{__bases__} along the lines of those relating to assigning to - an instance's \member{__class__} attribute. 
- -\end{itemize} - - -%====================================================================== -\subsection{String Changes} - -\begin{itemize} - -\item The \keyword{in} operator now works differently for strings. -Previously, when evaluating \code{\var{X} in \var{Y}} where \var{X} -and \var{Y} are strings, \var{X} could only be a single character. -That's now changed; \var{X} can be a string of any length, and -\code{\var{X} in \var{Y}} will return \constant{True} if \var{X} is a -substring of \var{Y}. If \var{X} is the empty string, the result is -always \constant{True}. - -\begin{verbatim} ->>> 'ab' in 'abcd' -True ->>> 'ad' in 'abcd' -False ->>> '' in 'abcd' -True -\end{verbatim} - -Note that this doesn't tell you where the substring starts; if you -need that information, use the \method{find()} string method. - -\item The \method{strip()}, \method{lstrip()}, and \method{rstrip()} -string methods now have an optional argument for specifying the -characters to strip. The default is still to remove all whitespace -characters: - -\begin{verbatim} ->>> ' abc '.strip() -'abc' ->>> '><><abc<><><>'.strip('<>') -'abc' ->>> '><><abc<><><>\n'.strip('<>') -'abc<><><>\n' ->>> u'\u4000\u4001abc\u4000'.strip(u'\u4000') -u'\u4001abc' ->>> -\end{verbatim} - -(Suggested by Simon Brunning and implemented by Walter D\"orwald.) - -\item The \method{startswith()} and \method{endswith()} -string methods now accept negative numbers for the \var{start} and \var{end} -parameters. - -\item Another new string method is \method{zfill()}, originally a -function in the \module{string} module. \method{zfill()} pads a -numeric string with zeros on the left until it's the specified width. -Note that the \code{\%} operator is still more flexible and powerful -than \method{zfill()}. - -\begin{verbatim} ->>> '45'.zfill(4) -'0045' ->>> '12345'.zfill(4) -'12345' ->>> 'goofy'.zfill(6) -'0goofy' -\end{verbatim} - -(Contributed by Walter D\"orwald.) 
- -\item A new type object, \class{basestring}, has been added. - Both 8-bit strings and Unicode strings inherit from this type, so - \code{isinstance(obj, basestring)} will return \constant{True} for - either kind of string. It's a completely abstract type, so you - can't create \class{basestring} instances. - -\item Interned strings are no longer immortal and will now be -garbage-collected in the usual way when the only reference to them is -from the internal dictionary of interned strings. (Implemented by -Oren Tirosh.) - -\end{itemize} - - -%====================================================================== -\subsection{Optimizations} - -\begin{itemize} - -\item The creation of new-style class instances has been made much -faster; they're now faster than classic classes! - -\item The \method{sort()} method of list objects has been extensively -rewritten by Tim Peters, and the implementation is significantly -faster. - -\item Multiplication of large long integers is now much faster thanks -to an implementation of Karatsuba multiplication, an algorithm that -scales better than the O(n*n) required for the grade-school -multiplication algorithm. (Original patch by Christopher A. Craig, -and significantly reworked by Tim Peters.) - -\item The \code{SET_LINENO} opcode is now gone. This may provide a -small speed increase, depending on your compiler's idiosyncrasies. -See section~\ref{section-other} for a longer explanation. -(Removed by Michael Hudson.) - -\item \function{xrange()} objects now have their own iterator, making -\code{for i in xrange(n)} slightly faster than -\code{for i in range(n)}. (Patch by Raymond Hettinger.) - -\item A number of small rearrangements have been made in various -hotspots to improve performance, such as inlining a function or removing -some code. (Implemented mostly by GvR, but lots of people have -contributed single changes.) 
- -\end{itemize} - -The net result of the 2.3 optimizations is that Python 2.3 runs the -pystone benchmark around 25\% faster than Python 2.2. - - -%====================================================================== -\section{New, Improved, and Deprecated Modules} - -As usual, Python's standard library received a number of enhancements and -bug fixes. Here's a partial list of the most notable changes, sorted -alphabetically by module name. Consult the -\file{Misc/NEWS} file in the source tree for a more -complete list of changes, or look through the CVS logs for all the -details. - -\begin{itemize} - -\item The \module{array} module now supports arrays of Unicode -characters using the \character{u} format character. Arrays also now -support using the \code{+=} assignment operator to add another array's -contents, and the \code{*=} assignment operator to repeat an array. -(Contributed by Jason Orendorff.) - -\item The \module{bsddb} module has been replaced by version 4.1.6 -of the \ulink{PyBSDDB}{http://pybsddb.sourceforge.net} package, -providing a more complete interface to the transactional features of -the BerkeleyDB library. - -The old version of the module has been renamed to -\module{bsddb185} and is no longer built automatically; you'll -have to edit \file{Modules/Setup} to enable it. Note that the new -\module{bsddb} package is intended to be compatible with the -old module, so be sure to file bugs if you discover any -incompatibilities. When upgrading to Python 2.3, if the new interpreter is compiled -with a new version of -the underlying BerkeleyDB library, you will almost certainly have to -convert your database files to the new version. You can do this -fairly easily with the new scripts \file{db2pickle.py} and -\file{pickle2db.py} which you will find in the distribution's -\file{Tools/scripts} directory. 
If you've already been using the PyBSDDB -package and importing it as \module{bsddb3}, you will have to change your -\code{import} statements to import it as \module{bsddb}. - -\item The new \module{bz2} module is an interface to the bz2 data -compression library. bz2-compressed data is usually smaller than -corresponding \module{zlib}-compressed data. (Contributed by Gustavo Niemeyer.) - -\item A set of standard date/time types has been added in the new \module{datetime} -module. See the following section for more details. - -\item The Distutils \class{Extension} class now supports -an extra constructor argument named \var{depends} for listing -additional source files that an extension depends on. This lets -Distutils recompile the module if any of the dependency files are -modified. For example, if \file{sampmodule.c} includes the header -file \file{sample.h}, you would create the \class{Extension} object like -this: - -\begin{verbatim} -ext = Extension("samp", - sources=["sampmodule.c"], - depends=["sample.h"]) -\end{verbatim} - -Modifying \file{sample.h} would then cause the module to be recompiled. -(Contributed by Jeremy Hylton.) - -\item Other minor changes to Distutils: -it now checks for the \envvar{CC}, \envvar{CFLAGS}, \envvar{CPP}, -\envvar{LDFLAGS}, and \envvar{CPPFLAGS} environment variables, using -them to override the settings in Python's configuration (contributed -by Robert Weber). - -\item Previously the \module{doctest} module would only search the -docstrings of public methods and functions for test cases, but it now -examines private ones as well. The \function{DocTestSuite()} -function creates a \class{unittest.TestSuite} object from a set of -\module{doctest} tests. - -\item The new \function{gc.get_referents(\var{object})} function returns a -list of all the objects referenced by \var{object}. 
- -\item The \module{getopt} module gained a new function, -\function{gnu_getopt()}, that supports the same arguments as the existing -\function{getopt()} function but uses GNU-style scanning mode. -The existing \function{getopt()} stops processing options as soon as a -non-option argument is encountered, but in GNU-style mode processing -continues, meaning that options and arguments can be mixed. For -example: - -\begin{verbatim} ->>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v') -([('-f', 'filename')], ['output', '-v']) ->>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v') -([('-f', 'filename'), ('-v', '')], ['output']) -\end{verbatim} - -(Contributed by Peter \AA{strand}.) - -\item The \module{grp}, \module{pwd}, and \module{resource} modules -now return enhanced tuples: - -\begin{verbatim} ->>> import grp ->>> g = grp.getgrnam('amk') ->>> g.gr_name, g.gr_gid -('amk', 500) -\end{verbatim} - -\item The \module{gzip} module can now handle files exceeding 2~GiB. - -\item The new \module{heapq} module contains an implementation of a -heap queue algorithm. A heap is an array-like data structure that -keeps items in a partially sorted order such that, for every index -\var{k}, \code{heap[\var{k}] <= heap[2*\var{k}+1]} and -\code{heap[\var{k}] <= heap[2*\var{k}+2]}. This makes it quick to -remove the smallest item, and inserting a new item while maintaining -the heap property is O(lg~n). (See -\url{http://www.nist.gov/dads/HTML/priorityque.html} for more -information about the priority queue data structure.) - -The \module{heapq} module provides \function{heappush()} and -\function{heappop()} functions for adding and removing items while -maintaining the heap property on top of some other mutable Python -sequence type. Here's an example that uses a Python list: - -\begin{verbatim} ->>> import heapq ->>> heap = [] ->>> for item in [3, 7, 5, 11, 1]: -... heapq.heappush(heap, item) -... 
->>> heap -[1, 3, 5, 11, 7] ->>> heapq.heappop(heap) -1 ->>> heapq.heappop(heap) -3 ->>> heap -[5, 7, 11] -\end{verbatim} - -(Contributed by Kevin O'Connor.) - -\item The IDLE integrated development environment has been updated -using the code from the IDLEfork project -(\url{http://idlefork.sf.net}). The most notable feature is that the -code being developed is now executed in a subprocess, meaning that -there's no longer any need for manual \code{reload()} operations. -IDLE's core code has been incorporated into the standard library as the -\module{idlelib} package. - -\item The \module{imaplib} module now supports IMAP over SSL. -(Contributed by Piers Lauder and Tino Lange.) - -\item The new \module{itertools} module contains a number of useful functions for -use with iterators, inspired by various functions provided by the ML -and Haskell languages. For example, -\code{itertools.ifilter(predicate, iterator)} returns all elements in -the iterator for which the function \function{predicate()} returns -\constant{True}, and \code{itertools.repeat(obj, \var{N})} returns -\code{obj} \var{N} times. There are a number of other functions in -the module; see the \ulink{package's reference -documentation}{../lib/module-itertools.html} for details. -(Contributed by Raymond Hettinger.) - -\item Two new functions in the \module{math} module, -\function{degrees(\var{rads})} and \function{radians(\var{degs})}, -convert between radians and degrees. Other functions in the -\module{math} module such as \function{math.sin()} and -\function{math.cos()} have always required input values measured in -radians. Also, an optional \var{base} argument was added to -\function{math.log()} to make it easier to compute logarithms for -bases other than \code{e} and \code{10}. (Contributed by Raymond -Hettinger.) 
- -\item Several new POSIX functions (\function{getpgid()}, \function{killpg()}, -\function{lchown()}, \function{getloadavg()}, \function{major()}, \function{makedev()}, -\function{minor()}, and \function{mknod()}) were added to the -\module{posix} module that underlies the \module{os} module. -(Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S. Otkidach.) - -\item In the \module{os} module, the \function{*stat()} family of -functions can now report fractions of a second in a time stamp. Such -time stamps are represented as floats, similar to -the value returned by \function{time.time()}. - -During testing, it was found that some applications will break if time -stamps are floats. For compatibility, when using the tuple interface -of the \class{stat_result} object, time stamps will be represented as integers. -When using named fields (a feature first introduced in Python 2.2), -time stamps are still represented as integers, unless -\function{os.stat_float_times()} is invoked to enable float return -values: - -\begin{verbatim} ->>> os.stat("/tmp").st_mtime -1034791200 ->>> os.stat_float_times(True) ->>> os.stat("/tmp").st_mtime -1034791200.6335014 -\end{verbatim} - -In Python 2.4, the default will change to always returning floats. - -Application developers should enable this feature only if all their -libraries work properly when confronted with floating point time -stamps, or if they use the tuple API. If used, the feature should be -activated on an application level instead of trying to enable it on a -per-use basis. - -\item The \module{optparse} module contains a new parser for command-line arguments -that can convert option values to a particular Python type -and will automatically generate a usage message. See the following section for -more details. - -\item The old and never-documented \module{linuxaudiodev} module has -been deprecated, and a new version named \module{ossaudiodev} has been -added. 
The module was renamed because the OSS sound drivers can be -used on platforms other than Linux, and the interface has also been -tidied and brought up to date in various ways. (Contributed by Greg -Ward and Nicholas FitzRoy-Dale.) - -\item The new \module{platform} module contains a number of functions -that try to determine various properties of the platform you're -running on. There are functions for getting the architecture, CPU -type, the Windows OS version, and even the Linux distribution version. -(Contributed by Marc-Andr\'e Lemburg.) - -\item The parser objects provided by the \module{pyexpat} module -can now optionally buffer character data, resulting in fewer calls to -your character data handler and therefore faster performance. Setting -the parser object's \member{buffer_text} attribute to \constant{True} -will enable buffering. - -\item The \function{sample(\var{population}, \var{k})} function was -added to the \module{random} module. \var{population} is a sequence or -\class{xrange} object containing the elements of a population, and -\function{sample()} chooses \var{k} elements from the population without -replacing chosen elements. \var{k} can be any value up to -\code{len(\var{population})}. For example: - -\begin{verbatim} ->>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn'] ->>> random.sample(days, 3) # Choose 3 elements -['St', 'Sn', 'Th'] ->>> random.sample(days, 7) # Choose 7 elements -['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn'] ->>> random.sample(days, 7) # Choose 7 again -['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th'] ->>> random.sample(days, 8) # Can't choose eight -Traceback (most recent call last): - File "<stdin>", line 1, in ? - File "random.py", line 414, in sample - raise ValueError, "sample larger than population" -ValueError: sample larger than population ->>> random.sample(xrange(1,10000,2), 10) # Choose ten odd nos. 
under 10000 -[3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195] -\end{verbatim} - -The \module{random} module now uses a new algorithm, the Mersenne -Twister, implemented in C. It's faster and more extensively studied -than the previous algorithm. - -(All changes contributed by Raymond Hettinger.) - -\item The \module{readline} module also gained a number of new -functions: \function{get_history_item()}, -\function{get_current_history_length()}, and \function{redisplay()}. - -\item The \module{rexec} and \module{Bastion} modules have been -declared dead, and attempts to import them will fail with a -\exception{RuntimeError}. New-style classes provide new ways to break -out of the restricted execution environment provided by -\module{rexec}, and no one has the interest or the time to fix -them. If you have applications using \module{rexec}, rewrite them to -use something else. - -(Sticking with Python 2.2 or 2.1 will not make your applications any -safer because there are known bugs in the \module{rexec} module in -those versions. To repeat: if you're using \module{rexec}, stop using -it immediately.) - -\item The \module{rotor} module has been deprecated because the - algorithm it uses for encryption is not believed to be secure. If - you need encryption, use one of the several AES Python modules - that are available separately. - -\item The \module{shutil} module gained a \function{move(\var{src}, -\var{dest})} function that recursively moves a file or directory to a new -location. - -\item Support for more advanced POSIX signal handling was added -to the \module{signal} module but then removed again as it proved impossible -to make it work reliably across platforms. - -\item The \module{socket} module now supports timeouts. You -can call the \method{settimeout(\var{t})} method on a socket object to -set a timeout of \var{t} seconds. 
Subsequent socket operations that -take longer than \var{t} seconds to complete will abort and raise a -\exception{socket.timeout} exception. - -The original timeout implementation was by Tim O'Malley. Michael -Gilfix integrated it into the Python \module{socket} module and -shepherded it through a lengthy review. After the code was checked -in, Guido van~Rossum rewrote parts of it. (This is a good example of -a collaborative development process in action.) - -\item On Windows, the \module{socket} module now ships with Secure -Sockets Layer (SSL) support. - -\item The value of the C \constant{PYTHON_API_VERSION} macro is now -exposed at the Python level as \code{sys.api_version}. The current -exception can be cleared by calling the new \function{sys.exc_clear()} -function. - -\item The new \module{tarfile} module -allows reading from and writing to \program{tar}-format archive files. -(Contributed by Lars Gust\"abel.) - -\item The new \module{textwrap} module contains functions for wrapping -strings containing paragraphs of text. The \function{wrap(\var{text}, -\var{width})} function takes a string and returns a list containing -the text split into lines of no more than the chosen width. The -\function{fill(\var{text}, \var{width})} function returns a single -string, reformatted to fit into lines no longer than the chosen width. -(As you can guess, \function{fill()} is built on top of -\function{wrap()}.) For example: - -\begin{verbatim} ->>> import textwrap ->>> paragraph = "Not a whit, we defy augury: ... more text ..." ->>> textwrap.wrap(paragraph, 60) -["Not a whit, we defy augury: there's a special providence in", - "the fall of a sparrow. If it be now, 'tis not to come; if it", - ...] ->>> print textwrap.fill(paragraph, 35) -Not a whit, we defy augury: there's -a special providence in the fall of -a sparrow. If it be now, 'tis not -to come; if it be not to come, it -will be now; if it be not now, yet -it will come: the readiness is all. 
->>> -\end{verbatim} - -The module also contains a \class{TextWrapper} class that actually -implements the text wrapping strategy. Both the -\class{TextWrapper} class and the \function{wrap()} and -\function{fill()} functions support a number of additional keyword -arguments for fine-tuning the formatting; consult the \ulink{module's -documentation}{../lib/module-textwrap.html} for details. -(Contributed by Greg Ward.) - -\item The \module{thread} and \module{threading} modules now have -companion modules, \module{dummy_thread} and \module{dummy_threading}, -that provide a do-nothing implementation of the \module{thread} -module's interface for platforms where threads are not supported. The -intention is to simplify thread-aware modules (ones that \emph{don't} -rely on threads to run) by putting the following code at the top: - -\begin{verbatim} -try: - import threading as _threading -except ImportError: - import dummy_threading as _threading -\end{verbatim} - -In this example, \module{_threading} is used as the module name to make -it clear that the module being used is not necessarily the actual -\module{threading} module. Code can call functions and use classes in -\module{_threading} whether or not threads are supported, avoiding an -\keyword{if} statement and making the code slightly clearer. This -module will not magically make multithreaded code run without threads; -code that waits for another thread to return or to do something will -simply hang forever. - -\item The \module{time} module's \function{strptime()} function has -long been an annoyance because it uses the platform C library's -\function{strptime()} implementation, and different platforms -sometimes have odd bugs. Brett Cannon contributed a portable -implementation that's written in pure Python and should behave -identically on all platforms. - -\item The new \module{timeit} module helps measure how long snippets -of Python code take to execute. 
The \file{timeit.py} file can be run -directly from the command line, or the module's \class{Timer} class -can be imported and used directly. Here's a short example that -figures out whether it's faster to convert an 8-bit string to Unicode -by appending an empty Unicode string to it or by using the -\function{unicode()} function: - -\begin{verbatim} -import timeit - -timer1 = timeit.Timer('unicode("abc")') -timer2 = timeit.Timer('"abc" + u""') - -# Run three trials -print timer1.repeat(repeat=3, number=100000) -print timer2.repeat(repeat=3, number=100000) - -# On my laptop this outputs: -# [0.36831796169281006, 0.37441694736480713, 0.35304892063140869] -# [0.17574405670166016, 0.18193507194519043, 0.17565798759460449] -\end{verbatim} - -\item The \module{Tix} module has received various bug fixes and -updates for the current version of the Tix package. - -\item The \module{Tkinter} module now works with a thread-enabled -version of Tcl. Tcl's threading model requires that widgets only be -accessed from the thread in which they're created; accesses from -another thread can cause Tcl to panic. For certain Tcl interfaces, -\module{Tkinter} will now automatically avoid this -when a widget is accessed from a different thread by marshalling a -command, passing it to the correct thread, and waiting for the -results. Other interfaces can't be handled automatically but -\module{Tkinter} will now raise an exception on such an access so that -you can at least find out about the problem. See -\url{http://mail.python.org/pipermail/python-dev/2002-December/031107.html} % -for a more detailed explanation of this change. (Implemented by -Martin von~L\"owis.) - -\item Calling Tcl methods through \module{_tkinter} no longer -returns only strings. Instead, if Tcl returns other objects those -objects are converted to their Python equivalent, if one exists, or -wrapped with a \class{_tkinter.Tcl_Obj} object if no Python equivalent -exists. 
This behavior can be controlled through the -\method{wantobjects()} method of \class{tkapp} objects. - -When using \module{_tkinter} through the \module{Tkinter} module (as -most Tkinter applications will), this feature is always activated. It -should not cause compatibility problems, since Tkinter would always -convert string results to Python types where possible. - -If any incompatibilities are found, the old behavior can be restored -by setting the \member{wantobjects} variable in the \module{Tkinter} -module to false before creating the first \class{tkapp} object. - -\begin{verbatim} -import Tkinter -Tkinter.wantobjects = 0 -\end{verbatim} - -Any breakage caused by this change should be reported as a bug. - -\item The \module{UserDict} module has a new \class{DictMixin} class which -defines all dictionary methods for classes that already have a minimum -mapping interface. This greatly simplifies writing classes that need -to be substitutable for dictionaries, such as the classes in -the \module{shelve} module. - -Adding the mix-in as a superclass provides the full dictionary -interface whenever the class defines \method{__getitem__}, -\method{__setitem__}, \method{__delitem__}, and \method{keys}. -For example: - -\begin{verbatim} ->>> import UserDict ->>> class SeqDict(UserDict.DictMixin): -... """Dictionary lookalike implemented with lists.""" -... def __init__(self): -... self.keylist = [] -... self.valuelist = [] -... def __getitem__(self, key): -... try: -... i = self.keylist.index(key) -... except ValueError: -... raise KeyError -... return self.valuelist[i] -... def __setitem__(self, key, value): -... try: -... i = self.keylist.index(key) -... self.valuelist[i] = value -... except ValueError: -... self.keylist.append(key) -... self.valuelist.append(value) -... def __delitem__(self, key): -... try: -... i = self.keylist.index(key) -... except ValueError: -... raise KeyError -... self.keylist.pop(i) -... self.valuelist.pop(i) -... def keys(self): -... 
return list(self.keylist) -... ->>> s = SeqDict() ->>> dir(s) # See that other dictionary methods are implemented -['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__', - '__init__', '__iter__', '__len__', '__module__', '__repr__', - '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems', - 'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem', - 'setdefault', 'update', 'valuelist', 'values'] -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item The DOM implementation -in \module{xml.dom.minidom} can now generate XML output in a -particular encoding by providing an optional encoding argument to -the \method{toxml()} and \method{toprettyxml()} methods of DOM nodes. - -\item The \module{xmlrpclib} module now supports an XML-RPC extension -for handling nil data values such as Python's \code{None}. Nil values -are always supported on unmarshalling an XML-RPC response. To -generate requests containing \code{None}, you must supply a true value -for the \var{allow_none} parameter when creating a \class{Marshaller} -instance. - -\item The new \module{DocXMLRPCServer} module allows writing -self-documenting XML-RPC servers. Run it in demo mode (as a program) -to see it in action. Pointing the Web browser to the RPC server -produces pydoc-style documentation; pointing xmlrpclib to the -server allows invoking the actual methods. -(Contributed by Brian Quinlan.) - -\item Support for internationalized domain names (RFCs 3454, 3490, -3491, and 3492) has been added. The ``idna'' encoding can be used -to convert between a Unicode domain name and the ASCII-compatible -encoding (ACE) of that name. - -\begin{alltt} ->{}>{}> u"www.Alliancefran\c caise.nu".encode("idna") -'www.xn--alliancefranaise-npb.nu' -\end{alltt} - -The \module{socket} module has also been extended to transparently -convert Unicode hostnames to the ACE version before passing them to -the C library. 
Modules that deal with hostnames (such as -\module{httplib} and \module{ftplib}) also support Unicode host names; -\module{httplib} also sends HTTP \samp{Host} headers using the ACE -version of the domain name. \module{urllib} supports Unicode URLs -with non-ASCII host names as long as the \code{path} part of the URL -is ASCII only. - -To implement this change, the \module{stringprep} module, the -\code{mkstringprep} tool and the \code{punycode} encoding have been added. - -\end{itemize} - - -%====================================================================== -\subsection{Date/Time Type} - -Date and time types suitable for expressing timestamps were added as -the \module{datetime} module. The types don't support different -calendars or many fancy features, and just stick to the basics of -representing time. - -The three primary types are: \class{date}, representing a day, month, -and year; \class{time}, consisting of hour, minute, and second; and -\class{datetime}, which contains all the attributes of both -\class{date} and \class{time}. There's also a -\class{timedelta} class representing differences between two points -in time, and time zone logic is implemented by classes inheriting from -the abstract \class{tzinfo} class. - -You can create instances of \class{date} and \class{time} by either -supplying keyword arguments to the appropriate constructor, -e.g. \code{datetime.date(year=1972, month=10, day=15)}, or by using -one of a number of class methods. For example, the \method{date.today()} -class method returns the current local date. - -Once created, instances of the date/time classes are all immutable. 
-There are a number of methods for producing formatted strings from -objects: - -\begin{verbatim} ->>> import datetime ->>> now = datetime.datetime.now() ->>> now.isoformat() -'2002-12-30T21:27:03.994956' ->>> now.ctime() # Only available on date, datetime -'Mon Dec 30 21:27:03 2002' ->>> now.strftime('%Y %d %b') -'2002 30 Dec' -\end{verbatim} - -The \method{replace()} method allows modifying one or more fields -of a \class{date} or \class{datetime} instance, returning a new instance: - -\begin{verbatim} ->>> d = datetime.datetime.now() ->>> d -datetime.datetime(2002, 12, 30, 22, 15, 38, 827738) ->>> d.replace(year=2001, hour = 12) -datetime.datetime(2001, 12, 30, 12, 15, 38, 827738) ->>> -\end{verbatim} - -Instances can be compared, hashed, and converted to strings (the -result is the same as that of \method{isoformat()}). \class{date} and -\class{datetime} instances can be subtracted from each other, and -added to \class{timedelta} instances. The largest missing feature is -that there's no standard library support for parsing strings and getting back a -\class{date} or \class{datetime}. - -For more information, refer to the \ulink{module's reference -documentation}{../lib/module-datetime.html}. -(Contributed by Tim Peters.) - - -%====================================================================== -\subsection{The optparse Module} - -The \module{getopt} module provides simple parsing of command-line -arguments. The new \module{optparse} module (originally named Optik) -provides more elaborate command-line parsing that follows the \UNIX{} -conventions, automatically creates the output for \longprogramopt{help}, -and can perform different actions for different options. - -You start by creating an instance of \class{OptionParser} and telling -it what your program's options are. 
- -\begin{verbatim} -import sys -from optparse import OptionParser - -op = OptionParser() -op.add_option('-i', '--input', - action='store', type='string', dest='input', - help='set input filename') -op.add_option('-l', '--length', - action='store', type='int', dest='length', - help='set maximum length of output') -\end{verbatim} - -Parsing a command line is then done by calling the \method{parse_args()} -method. - -\begin{verbatim} -options, args = op.parse_args(sys.argv[1:]) -print options -print args -\end{verbatim} - -This returns an object containing all of the option values, -and a list of strings containing the remaining arguments. - -Invoking the script with the various arguments now works as you'd -expect it to. Note that the length argument is automatically -converted to an integer. - -\begin{verbatim} -$ ./python opt.py -i data arg1 -<Values at 0x400cad4c: {'input': 'data', 'length': None}> -['arg1'] -$ ./python opt.py --input=data --length=4 -<Values at 0x400cad2c: {'input': 'data', 'length': 4}> -[] -$ -\end{verbatim} - -The help message is automatically generated for you: - -\begin{verbatim} -$ ./python opt.py --help -usage: opt.py [options] - -options: - -h, --help show this help message and exit - -iINPUT, --input=INPUT - set input filename - -lLENGTH, --length=LENGTH - set maximum length of output -$ -\end{verbatim} -% $ prevent Emacs tex-mode from getting confused - -See the \ulink{module's documentation}{../lib/module-optparse.html} -for more details. - -Optik was written by Greg Ward, with suggestions from the readers of -the Getopt SIG. - - -%====================================================================== -\section{Pymalloc: A Specialized Object Allocator\label{section-pymalloc}} - -Pymalloc, a specialized object allocator written by Vladimir -Marangozov, was a feature added to Python 2.1. 
Pymalloc is intended -to be faster than the system \cfunction{malloc()} and to have less -memory overhead for allocation patterns typical of Python programs. -The allocator uses C's \cfunction{malloc()} function to get large -pools of memory and then fulfills smaller memory requests from these -pools. - -In 2.1 and 2.2, pymalloc was an experimental feature and wasn't -enabled by default; you had to explicitly enable it when compiling -Python by providing the -\longprogramopt{with-pymalloc} option to the \program{configure} -script. In 2.3, pymalloc has had further enhancements and is now -enabled by default; you'll have to supply -\longprogramopt{without-pymalloc} to disable it. - -This change is transparent to code written in Python; however, -pymalloc may expose bugs in C extensions. Authors of C extension -modules should test their code with pymalloc enabled, -because some incorrect code may cause core dumps at runtime. - -There's one particularly common error that causes problems. There are -a number of memory allocation functions in Python's C API that have -previously just been aliases for the C library's \cfunction{malloc()} -and \cfunction{free()}, meaning that if you accidentally called -mismatched functions the error wouldn't be noticeable. When the -object allocator is enabled, these functions aren't aliases of -\cfunction{malloc()} and \cfunction{free()} any more, and calling the -wrong function to free memory may get you a core dump. For example, -if memory was allocated using \cfunction{PyObject_Malloc()}, it has to -be freed using \cfunction{PyObject_Free()}, not \cfunction{free()}. A -few modules included with Python fell afoul of this and had to be -fixed; doubtless there are more third-party modules that will have the -same problem. - -As part of this change, the confusing multiple interfaces for -allocating memory have been consolidated down into two API families. 
-Memory allocated with one family must not be manipulated with -functions from the other family. There is one family for allocating -chunks of memory and another family of functions specifically for -allocating Python objects. - -\begin{itemize} - \item To allocate and free an undistinguished chunk of memory use - the ``raw memory'' family: \cfunction{PyMem_Malloc()}, - \cfunction{PyMem_Realloc()}, and \cfunction{PyMem_Free()}. - - \item The ``object memory'' family is the interface to the pymalloc - facility described above and is biased towards a large number of - ``small'' allocations: \cfunction{PyObject_Malloc}, - \cfunction{PyObject_Realloc}, and \cfunction{PyObject_Free}. - - \item To allocate and free Python objects, use the ``object'' family - \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()}, and - \cfunction{PyObject_Del()}. -\end{itemize} - -Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides -debugging features to catch memory overwrites and doubled frees in -both extension modules and in the interpreter itself. To enable this -support, compile a debugging version of the Python interpreter by -running \program{configure} with \longprogramopt{with-pydebug}. - -To aid extension writers, a header file \file{Misc/pymemcompat.h} is -distributed with the source to Python 2.3 that allows Python -extensions to use the 2.3 interfaces to memory allocation while -compiling against any version of Python since 1.5.2. You would copy -the file from Python's source distribution and bundle it with the -source of your extension. - -\begin{seealso} - -\seeurl{http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/obmalloc.c} -{For the full details of the pymalloc implementation, see -the comments at the top of the file \file{Objects/obmalloc.c} in the -Python source code. 
The above link points to the file within the -SourceForge CVS browser.} - -\end{seealso} - - -% ====================================================================== -\section{Build and C API Changes} - -Changes to Python's build process and to the C API include: - -\begin{itemize} - -\item The cycle detection implementation used by the garbage collection -has proven to be stable, so it's now been made mandatory. You can no -longer compile Python without it, and the -\longprogramopt{with-cycle-gc} switch to \program{configure} has been removed. - -\item Python can now optionally be built as a shared library -(\file{libpython2.3.so}) by supplying \longprogramopt{enable-shared} -when running Python's \program{configure} script. (Contributed by Ondrej -Palkovsky.) - -\item The \csimplemacro{DL_EXPORT} and \csimplemacro{DL_IMPORT} macros -are now deprecated. Initialization functions for Python extension -modules should now be declared using the new macro -\csimplemacro{PyMODINIT_FUNC}, while the Python core will generally -use the \csimplemacro{PyAPI_FUNC} and \csimplemacro{PyAPI_DATA} -macros. - -\item The interpreter can be compiled without any docstrings for -the built-in functions and modules by supplying -\longprogramopt{without-doc-strings} to the \program{configure} script. -This makes the Python executable about 10\% smaller, but will also -mean that you can't get help for Python's built-ins. (Contributed by -Gustavo Niemeyer.) - -\item The \cfunction{PyArg_NoArgs()} macro is now deprecated, and code -that uses it should be changed. For Python 2.2 and later, the method -definition table can specify the -\constant{METH_NOARGS} flag, signalling that there are no arguments, and -the argument checking can then be removed. If compatibility with -pre-2.2 versions of Python is important, the code could use -\code{PyArg_ParseTuple(\var{args}, "")} instead, but this will be slower -than using \constant{METH_NOARGS}. 
- -\item \cfunction{PyArg_ParseTuple()} accepts new format characters for various sizes of unsigned integers: \samp{B} for \ctype{unsigned char}, -\samp{H} for \ctype{unsigned short int}, -\samp{I} for \ctype{unsigned int}, -and \samp{K} for \ctype{unsigned long long}. - -\item A new function, \cfunction{PyObject_DelItemString(\var{mapping}, -char *\var{key})}, was added as shorthand for -\code{PyObject_DelItem(\var{mapping}, PyString_FromString(\var{key}))}. - -\item File objects now manage their internal string buffer -differently, increasing it exponentially when needed. This results in -the benchmark tests in \file{Lib/test/test_bufio.py} speeding up -considerably (from 57 seconds to 1.7 seconds, according to one -measurement). - -\item It's now possible to define class and static methods for a C -extension type by setting either the \constant{METH_CLASS} or -\constant{METH_STATIC} flags in a method's \ctype{PyMethodDef} -structure. - -\item Python now includes a copy of the Expat XML parser's source code, -removing any dependence on a system version or local installation of -Expat. - -\item If you dynamically allocate type objects in your extension, you -should be aware of a change in the rules relating to the -\member{__module__} and \member{__name__} attributes. In summary, -you will want to ensure the type's dictionary contains a -\code{'__module__'} key; making the module name the part of the type -name leading up to the final period will no longer have the desired -effect. For more detail, read the API reference documentation or the -source. - -\end{itemize} - - -%====================================================================== -\subsection{Port-Specific Changes} - -Support for a port to IBM's OS/2 using the EMX runtime environment was -merged into the main Python source tree. EMX is a POSIX emulation -layer over the OS/2 system APIs. 
The Python port for EMX tries to -support all the POSIX-like capability exposed by the EMX runtime, and -mostly succeeds; \function{fork()} and \function{fcntl()} are -restricted by the limitations of the underlying emulation layer. The -standard OS/2 port, which uses IBM's Visual Age compiler, also gained -support for case-sensitive import semantics as part of the integration -of the EMX port into CVS. (Contributed by Andrew MacIntyre.) - -On MacOS, most toolbox modules have been weaklinked to improve -backward compatibility. This means that modules will no longer fail -to load if a single routine is missing on the current OS version. -Instead calling the missing routine will raise an exception. -(Contributed by Jack Jansen.) - -The RPM spec files, found in the \file{Misc/RPM/} directory in the -Python source distribution, were updated for 2.3. (Contributed by -Sean Reifschneider.) - -Other new platforms now supported by Python include AtheOS -(\url{http://www.atheos.cx/}), GNU/Hurd, and OpenVMS. - - -%====================================================================== -\section{Other Changes and Fixes \label{section-other}} - -As usual, there were a bunch of other improvements and bugfixes -scattered throughout the source tree. A search through the CVS change -logs finds there were 523 patches applied and 514 bugs fixed between -Python 2.2 and 2.3. Both figures are likely to be underestimates. - -Some of the more notable changes are: - -\begin{itemize} - -\item If the \envvar{PYTHONINSPECT} environment variable is set, the -Python interpreter will enter the interactive prompt after running a -Python program, as if Python had been invoked with the \programopt{-i} -option. The environment variable can be set before running the Python -interpreter, or it can be set by the Python program as part of its -execution. 
- -\item The \file{regrtest.py} script now provides a way to allow ``all -resources except \var{foo}.'' A resource name passed to the -\programopt{-u} option can now be prefixed with a hyphen -(\character{-}) to mean ``remove this resource.'' For example, the -option `\code{\programopt{-u}all,-bsddb}' could be used to enable the -use of all resources except \code{bsddb}. - -\item The tools used to build the documentation now work under Cygwin -as well as \UNIX. - -\item The \code{SET_LINENO} opcode has been removed. Back in the -mists of time, this opcode was needed to produce line numbers in -tracebacks and support trace functions (for, e.g., \module{pdb}). -Since Python 1.5, the line numbers in tracebacks have been computed -using a different mechanism that works with ``python -O''. For Python -2.3 Michael Hudson implemented a similar scheme to determine when to -call the trace function, removing the need for \code{SET_LINENO} -entirely. - -It would be difficult to detect any resulting difference from Python -code, apart from a slight speed up when Python is run without -\programopt{-O}. - -C extensions that access the \member{f_lineno} field of frame objects -should instead call \code{PyCode_Addr2Line(f->f_code, f->f_lasti)}. -This will have the added effect of making the code work as desired -under ``python -O'' in earlier versions of Python. - -A nifty new feature is that trace functions can now assign to the -\member{f_lineno} attribute of frame objects, changing the line that -will be executed next. A \samp{jump} command has been added to the -\module{pdb} debugger taking advantage of this new feature. -(Implemented by Richie Hindle.) 
- -\end{itemize} - - -%====================================================================== -\section{Porting to Python 2.3} - -This section lists previously described changes that may require -changes to your code: - -\begin{itemize} - -\item \keyword{yield} is now always a keyword; if it's used as a -variable name in your code, a different name must be chosen. - -\item For strings \var{X} and \var{Y}, \code{\var{X} in \var{Y}} now works -if \var{X} is more than one character long. - -\item The \function{int()} type constructor will now return a long -integer instead of raising an \exception{OverflowError} when a string -or floating-point number is too large to fit into an integer. - -\item If you have Unicode strings that contain 8-bit characters, you -must declare the file's encoding (UTF-8, Latin-1, or whatever) by -adding a comment to the top of the file. See -section~\ref{section-encodings} for more information. - -\item Calling Tcl methods through \module{_tkinter} no longer -returns only strings. Instead, if Tcl returns other objects those -objects are converted to their Python equivalent, if one exists, or -wrapped with a \class{_tkinter.Tcl_Obj} object if no Python equivalent -exists. - -\item Large octal and hex literals such as -\code{0xffffffff} now trigger a \exception{FutureWarning}. Currently -they're stored as 32-bit numbers and result in a negative value, but -in Python 2.4 they'll become positive long integers. - -% The empty groups below prevent conversion to guillemets. -There are a few ways to fix this warning. If you really need a -positive number, just add an \samp{L} to the end of the literal. If -you're trying to get a 32-bit integer with low bits set and have -previously used an expression such as \code{\textasciitilde(1 <{}< 31)}, -it's probably -clearest to start with all bits set and clear the desired upper bits. -For example, to clear just the top bit (bit 31), you could write -\code{0xffffffffL {\&}{\textasciitilde}(1L<{}<31)}. 
- -\item You can no longer disable assertions by assigning to \code{__debug__}. - -\item The Distutils \function{setup()} function has gained various new -keyword arguments such as \var{depends}. Old versions of the -Distutils will abort if passed unknown keywords. A solution is to check -for the presence of the new \function{get_distutil_options()} function -in your \file{setup.py} and only use the new keywords -with a version of the Distutils that supports them: - -\begin{verbatim} -from distutils import core - -kw = {'sources': ['foo.c'], ...} -if hasattr(core, 'get_distutil_options'): - kw['depends'] = ['foo.h'] -ext = core.Extension(**kw) -\end{verbatim} - -\item Using \code{None} as a variable name will now result in a -\exception{SyntaxWarning} warning. - -\item Names of extension types defined by the modules included with -Python now contain the module and a \character{.} in front of the type -name. - -\end{itemize} - - -%====================================================================== -\section{Acknowledgements \label{acks}} - -The author would like to thank the following people for offering -suggestions, corrections and assistance with various drafts of this -article: Jeff Bauer, Simon Brunning, Brett Cannon, Michael Chermside, -Andrew Dalke, Scott David Daniels, Fred~L. Drake, Jr., David Fraser, -Kelly Gerber, -Raymond Hettinger, Michael Hudson, Chris Lambert, Detlef Lannert, -Martin von~L\"owis, Andrew MacIntyre, Lalo Martins, Chad Netzer, -Gustavo Niemeyer, Neal Norwitz, Hans Nowak, Chris Reedy, Francesco -Ricciardi, Vinay Sajip, Neil Schemenauer, Roman Suzi, Jason Tishler, -Just van~Rossum. - -\end{document} diff --git a/Doc/whatsnew/whatsnew24.tex b/Doc/whatsnew/whatsnew24.tex deleted file mode 100644 index 399bc0e..0000000 --- a/Doc/whatsnew/whatsnew24.tex +++ /dev/null @@ -1,1757 +0,0 @@ -\documentclass{howto} -\usepackage{distutils} -% $Id$ - -% Don't write extensive text for new sections; I'll do that. 
-% Feel free to add commented-out reminders of things that need -% to be covered. --amk - -\title{What's New in Python 2.4} -\release{1.02} -\author{A.M.\ Kuchling} -\authoraddress{ - \strong{Python Software Foundation}\\ - Email: \email{amk@amk.ca} -} - -\begin{document} -\maketitle -\tableofcontents - -This article explains the new features in Python 2.4.1, released on -March~30, 2005. - -Python 2.4 is a medium-sized release. It doesn't introduce as many -changes as the radical Python 2.2, but introduces more features than -the conservative 2.3 release. The most significant new language -features are function decorators and generator expressions; most other -changes are to the standard library. - -According to the CVS change logs, there were 481 patches applied and -502 bugs fixed between Python 2.3 and 2.4. Both figures are likely to -be underestimates. - -This article doesn't attempt to provide a complete specification of -every single new feature, but instead provides a brief introduction to -each feature. For full details, you should refer to the documentation -for Python 2.4, such as the \citetitle[../lib/lib.html]{Python Library -Reference} and the \citetitle[../ref/ref.html]{Python Reference -Manual}. Often you will be referred to the PEP for a particular new -feature for explanations of the implementation and design rationale. - - -%====================================================================== -\section{PEP 218: Built-In Set Objects} - -Python 2.3 introduced the \module{sets} module. C implementations of -set data types have now been added to the Python core as two new -built-in types, \function{set(\var{iterable})} and -\function{frozenset(\var{iterable})}. They provide high speed -operations for membership testing, for eliminating duplicates from -sequences, and for mathematical operations like unions, intersections, -differences, and symmetric differences. 
- -\begin{verbatim} ->>> a = set('abracadabra') # form a set from a string ->>> 'z' in a # fast membership testing -False ->>> a # unique letters in a -set(['a', 'r', 'b', 'c', 'd']) ->>> ''.join(a) # convert back into a string -'arbcd' - ->>> b = set('alacazam') # form a second set ->>> a - b # letters in a but not in b -set(['r', 'd', 'b']) ->>> a | b # letters in either a or b -set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l']) ->>> a & b # letters in both a and b -set(['a', 'c']) ->>> a ^ b # letters in a or b but not both -set(['r', 'd', 'b', 'm', 'z', 'l']) - ->>> a.add('z') # add a new element ->>> a.update('wxy') # add multiple new elements ->>> a -set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z']) ->>> a.remove('x') # take one element out ->>> a -set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z']) -\end{verbatim} - -The \function{frozenset} type is an immutable version of \function{set}. -Since it is immutable and hashable, it may be used as a dictionary key or -as a member of another set. - -The \module{sets} module remains in the standard library, and may be -useful if you wish to subclass the \class{Set} or \class{ImmutableSet} -classes. There are currently no plans to deprecate the module. - -\begin{seealso} -\seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by -Greg Wilson and ultimately implemented by Raymond Hettinger.} -\end{seealso} - - -%====================================================================== -\section{PEP 237: Unifying Long Integers and Integers} - -The lengthy transition process for this PEP, begun in Python 2.2, -takes another step forward in Python 2.4. In 2.3, certain integer -operations that would behave differently after int/long unification -triggered \exception{FutureWarning} warnings and returned values -limited to 32 or 64 bits (depending on your platform). In 2.4, these -expressions no longer produce a warning and instead produce a -different result that's usually a long integer. 
- -The problematic expressions are primarily left shifts and lengthy -hexadecimal and octal constants. For example, -\code{2 \textless{}\textless{} 32} results -in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python -2.4, this expression now returns the correct answer, 8589934592. - -\begin{seealso} -\seepep{237}{Unifying Long Integers and Integers}{Original PEP -written by Moshe Zadka and GvR. The changes for 2.4 were implemented by -Kalle Svensson.} -\end{seealso} - - -%====================================================================== -\section{PEP 289: Generator Expressions} - -The iterator feature introduced in Python 2.2 and the -\module{itertools} module make it easier to write programs that loop -through large data sets without having the entire data set in memory -at one time. List comprehensions don't fit into this picture very -well because they produce a Python list object containing all of the -items. This unavoidably pulls all of the objects into memory, which -can be a problem if your data set is very large. When trying to write -a functionally-styled program, it would be natural to write something -like: - -\begin{verbatim} -links = [link for link in get_all_links() if not link.followed] -for link in links: - ... -\end{verbatim} - -instead of - -\begin{verbatim} -for link in get_all_links(): - if link.followed: - continue - ... -\end{verbatim} - -The first form is more concise and perhaps more readable, but if -you're dealing with a large number of link objects you'd have to write -the second form to avoid having all link objects in memory at the same -time. - -Generator expressions work similarly to list comprehensions but don't -materialize the entire list; instead they create a generator that will -return elements one by one. The above example could be written as: - -\begin{verbatim} -links = (link for link in get_all_links() if not link.followed) -for link in links: - ... 
-\end{verbatim} - -Generator expressions always have to be written inside parentheses, as -in the above example. The parentheses signalling a function call also -count, so if you want to create an iterator that will be immediately -passed to a function you could write: - -\begin{verbatim} -print sum(obj.count for obj in list_all_objects()) -\end{verbatim} - -Generator expressions differ from list comprehensions in various small -ways. Most notably, the loop variable (\var{obj} in the above -example) is not accessible outside of the generator expression. List -comprehensions leave the variable assigned to its last value; future -versions of Python will change this, making list comprehensions match -generator expressions in this respect. - -\begin{seealso} -\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and -implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.} -\end{seealso} - - -%====================================================================== -\section{PEP 292: Simpler String Substitutions} - -Some new classes in the standard library provide an alternative -mechanism for substituting variables into strings; this style of -substitution may be better for applications where untrained -users need to edit templates. - -The usual way of substituting variables by name is the \code{\%} -operator: - -\begin{verbatim} ->>> '%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'} -'2: The Best of Times' -\end{verbatim} - -When writing the template string, it can be easy to forget the -\samp{i} or \samp{s} after the closing parenthesis. This isn't a big -problem if the template is in a Python module, because you run the -code, get an ``Unsupported format character'' \exception{ValueError}, -and fix the problem. However, consider an application such as Mailman -where template strings or translations are being edited by users who -aren't aware of the Python language. 
The format string's syntax is -complicated to explain to such users, and if they make a mistake, it's -difficult to provide helpful feedback to them. - -PEP 292 adds a \class{Template} class to the \module{string} module -that uses \samp{\$} to indicate a substitution: - -\begin{verbatim} ->>> import string ->>> t = string.Template('$page: $title') ->>> t.substitute({'page':2, 'title': 'The Best of Times'}) -'2: The Best of Times' -\end{verbatim} - -% $ Terminate $-mode for Emacs - -If a key is missing from the dictionary, the \method{substitute} method -will raise a \exception{KeyError}. There's also a \method{safe_substitute} -method that ignores missing keys: - -\begin{verbatim} ->>> t = string.Template('$page: $title') ->>> t.safe_substitute({'page':3}) -'3: $title' -\end{verbatim} - -% $ Terminate math-mode for Emacs - - -\begin{seealso} -\seepep{292}{Simpler String Substitutions}{Written and implemented -by Barry Warsaw.} -\end{seealso} - - -%====================================================================== -\section{PEP 318: Decorators for Functions and Methods} - -Python 2.2 extended Python's object model by adding static methods and -class methods, but it didn't extend Python's syntax to provide any new -way of defining static or class methods. Instead, you had to write a -\keyword{def} statement in the usual way, and pass the resulting -method to a \function{staticmethod()} or \function{classmethod()} -function that would wrap up the function as a method of the new type. -Your code would look like this: - -\begin{verbatim} -class C: - def meth (cls): - ... - - meth = classmethod(meth) # Rebind name to wrapped-up class method -\end{verbatim} - -If the method was very long, it would be easy to miss or forget the -\function{classmethod()} invocation after the function body. - -The intention was always to add some syntax to make such definitions -more readable, but at the time of 2.2's release a good syntax was not -obvious. 
Today a good syntax \emph{still} isn't obvious but users are -asking for easier access to the feature; a new syntactic feature has -been added to meet this need. - -The new feature is called ``function decorators''. The name comes -from the idea that \function{classmethod}, \function{staticmethod}, -and friends are storing additional information on a function object; -they're \emph{decorating} functions with more details. - -The notation borrows from Java and uses the \character{@} character as an -indicator. Using the new syntax, the example above would be written: - -\begin{verbatim} -class C: - - @classmethod - def meth (cls): - ... - -\end{verbatim} - -The \code{@classmethod} is shorthand for the -\code{meth=classmethod(meth)} assignment. More generally, if you have -the following: - -\begin{verbatim} -@A -@B -@C -def f (): - ... -\end{verbatim} - -It's equivalent to the following pre-decorator code: - -\begin{verbatim} -def f(): ... -f = A(B(C(f))) -\end{verbatim} - -Decorators must come on the line before a function definition, one decorator -per line, and can't be on the same line as the def statement, meaning that -\code{@A def f(): ...} is illegal. You can only decorate function -definitions, either at the module level or inside a class; you can't -decorate class definitions. - -A decorator is just a function that takes the function to be decorated as an -argument and returns either the same function or some new object. The -return value of the decorator need not be callable (though it typically is), -unless further decorators will be applied to the result. It's easy to write -your own decorators. The following simple example just sets an attribute on -the function object: - -\begin{verbatim} ->>> def deco(func): -... func.attr = 'decorated' -... return func -... ->>> @deco -... def f(): pass -... 
->>> f -<function f at 0x402ef0d4> ->>> f.attr -'decorated' ->>> -\end{verbatim} - -As a slightly more realistic example, the following decorator checks -that the supplied argument is an integer: - -\begin{verbatim} -def require_int (func): - def wrapper (arg): - assert isinstance(arg, int) - return func(arg) - - return wrapper - -@require_int -def p1 (arg): - print arg - -@require_int -def p2(arg): - print arg*2 -\end{verbatim} - -An example in \pep{318} contains a fancier version of this idea that -lets you both specify the required type and check the returned type. - -Decorator functions can take arguments. If arguments are supplied, -your decorator function is called with only those arguments and must -return a new decorator function; this function must take a single -function and return a function, as previously described. In other -words, \code{@A @B @C(args)} becomes: - -\begin{verbatim} -def f(): ... -_deco = C(args) -f = A(B(_deco(f))) -\end{verbatim} - -Getting this right can be slightly brain-bending, but it's not too -difficult. - -A small related change makes the \member{func_name} attribute of -functions writable. This attribute is used to display function names -in tracebacks, so decorators should change the name of any new -function that's constructed and returned. - -\begin{seealso} -\seepep{318}{Decorators for Functions, Methods and Classes}{Written -by Kevin D. Smith, Jim Jewett, and Skip Montanaro. 
Several people -wrote patches implementing function decorators, but the one that was -actually checked in was patch \#979728, written by Mark Russell.} - -\seeurl{http://www.python.org/moin/PythonDecoratorLibrary} -{This Wiki page contains several examples of decorators.} - -\end{seealso} - - -%====================================================================== -\section{PEP 322: Reverse Iteration} - -A new built-in function, \function{reversed(\var{seq})}, takes a sequence -and returns an iterator that loops over the elements of the sequence -in reverse order. - -\begin{verbatim} ->>> for i in reversed(xrange(1,4)): -... print i -... -3 -2 -1 -\end{verbatim} - -Compared to extended slicing, such as \code{range(1,4)[::-1]}, -\function{reversed()} is easier to read, runs faster, and uses -substantially less memory. - -Note that \function{reversed()} only accepts sequences, not arbitrary -iterators. If you want to reverse an iterator, first convert it to -a list with \function{list()}. - -\begin{verbatim} ->>> input = open('/etc/passwd', 'r') ->>> for line in reversed(list(input)): -... print line -... -root:*:0:0:System Administrator:/var/root:/bin/tcsh - ... -\end{verbatim} - -\begin{seealso} -\seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.} - -\end{seealso} - - -%====================================================================== -\section{PEP 324: New subprocess Module} - -The standard library provides a number of ways to execute a -subprocess, offering different features and different levels of -complexity. \function{os.system(\var{command})} is easy to use, but -slow (it runs a shell process which executes the command) and -dangerous (you have to be careful about escaping the shell's -metacharacters). The \module{popen2} module offers classes that can -capture standard output and standard error from the subprocess, but -the naming is confusing. 
The \module{subprocess} module cleans -this up, providing a unified interface that offers all the features -you might need. - -Instead of \module{popen2}'s collection of classes, -\module{subprocess} contains a single class called \class{Popen} -whose constructor supports a number of different keyword arguments. - -\begin{verbatim} -class Popen(args, bufsize=0, executable=None, - stdin=None, stdout=None, stderr=None, - preexec_fn=None, close_fds=False, shell=False, - cwd=None, env=None, universal_newlines=False, - startupinfo=None, creationflags=0): -\end{verbatim} - -\var{args} is commonly a sequence of strings that will be the -arguments to the program executed as the subprocess. (If the -\var{shell} argument is true, \var{args} can be a string which will -then be passed on to the shell for interpretation, just as -\function{os.system()} does.) - -\var{stdin}, \var{stdout}, and \var{stderr} specify what the -subprocess's input, output, and error streams will be. You can -provide a file object or a file descriptor, or you can use the -constant \code{subprocess.PIPE} to create a pipe between the -subprocess and the parent. - -The constructor has a number of handy options: - -\begin{itemize} - \item \var{close_fds} requests that all file descriptors be closed - before running the subprocess. - - \item \var{cwd} specifies the working directory in which the - subprocess will be executed (defaulting to whatever the parent's - working directory is). - - \item \var{env} is a dictionary specifying environment variables. - - \item \var{preexec_fn} is a function that gets called before the - child is started. - - \item \var{universal_newlines} opens the child's input and output - using Python's universal newline feature. 
- -\end{itemize} - -Once you've created the \class{Popen} instance, -you can call its \method{wait()} method to pause until the subprocess -has exited, \method{poll()} to check if it's exited without pausing, -or \method{communicate(\var{data})} to send the string \var{data} to -the subprocess's standard input. \method{communicate(\var{data})} -then reads any data that the subprocess has sent to its standard output -or standard error, returning a tuple \code{(\var{stdout_data}, -\var{stderr_data})}. - -\function{call()} is a shortcut that passes its arguments along to the -\class{Popen} constructor, waits for the command to complete, and -returns the status code of the subprocess. It can serve as a safer -analog to \function{os.system()}: - -\begin{verbatim} -sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb']) -if sts == 0: - # Success - ... -else: - # dpkg returned an error - ... -\end{verbatim} - -The command is invoked without use of the shell. If you really do want to -use the shell, you can add \code{shell=True} as a keyword argument and provide -a string instead of a sequence: - -\begin{verbatim} -sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True) -\end{verbatim} - -The PEP takes various examples of shell and Python code and shows how -they'd be translated into Python code that uses \module{subprocess}. -Reading this section of the PEP is highly recommended. - -\begin{seealso} -\seepep{324}{subprocess - New process module}{Written and implemented by Peter {\AA}strand, with assistance from Fredrik Lundh and others.} -\end{seealso} - - -%====================================================================== -\section{PEP 327: Decimal Data Type} - -Python has always supported floating-point (FP) numbers, based on the -underlying C \ctype{double} type, as a data type. 
However, while most -programming languages provide a floating-point type, many people (even -programmers) are unaware that floating-point numbers don't represent -certain decimal fractions accurately. The new \class{Decimal} type -can represent these fractions accurately, up to a user-specified -precision limit. - - -\subsection{Why is Decimal needed?} - -The limitations arise from the representation used for floating-point numbers. -FP numbers are made up of three components: - -\begin{itemize} -\item The sign, which is positive or negative. -\item The mantissa, which is a single-digit binary number -followed by a fractional part. For example, \code{1.01} in base-2 notation -is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation. -\item The exponent, which tells where the decimal point is located in the number represented. -\end{itemize} - -For example, the number 1.25 has positive sign, a mantissa value of -1.01 (in binary), and an exponent of 0 (the decimal point doesn't need -to be shifted). The number 5 has the same sign and mantissa, but the -exponent is 2 because the mantissa is multiplied by 4 (2 to the power -of the exponent 2); 1.25 * 4 equals 5. - -Modern systems usually provide floating-point support that conforms to -a standard called IEEE 754. C's \ctype{double} type is usually -implemented as a 64-bit IEEE 754 number, which uses 52 bits of space -for the mantissa. This means that numbers can only be specified to 52 -bits of precision. If you're trying to represent numbers whose -expansion repeats endlessly, the expansion is cut off after 52 bits. -Unfortunately, most software needs to produce output in base 10, and -common fractions in base 10 are often repeating decimals in binary. -For example, 1.1 decimal is binary \code{1.0001100110011 ...}; .1 = -1/16 + 1/32 + 1/256 plus an infinite number of additional terms. IEEE -754 has to chop off that infinitely repeated decimal after 52 digits, -so the representation is slightly inaccurate. 
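The decomposition described above can be explored from Python itself. The following is a minimal sketch, assuming a platform whose floats are IEEE 754 doubles and written for a current Python interpreter rather than 2.4. Note that \function{math.frexp()} normalizes the mantissa to lie between 0.5 and 1, so its exponent is one larger than in the 1.xxx form used in the text.

```python
import math

# math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= abs(m) < 1.
# This is the same sign/mantissa/exponent split as in the text, shifted
# by one bit position.
for x in (1.25, 5.0, 1.1):
    m, e = math.frexp(x)
    print("%r = %r * 2**%d" % (x, m, e))

# Asking for more digits than the default display exposes the value
# that is actually stored for 1.1.
stored = "%.20f" % 1.1
print(stored)

# The error is tiny, but repeated operations let it surface:
# adding 0.1 ten times does not give exactly 1.0.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)
```

1.25 and 5.0 share a mantissa because both are exactly representable in binary; 1.1 is not, which is why its stored digits drift away from zero.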
- -Sometimes you can see this inaccuracy when the number is printed: -\begin{verbatim} ->>> 1.1 -1.1000000000000001 -\end{verbatim} - -The inaccuracy isn't always visible when you print the number because -the FP-to-decimal-string conversion is provided by the C library, and -most C libraries try to produce sensible output. Even if it's not -displayed, however, the inaccuracy is still there and subsequent -operations can magnify the error. - -For many applications this doesn't matter. If I'm plotting points and -displaying them on my monitor, the difference between 1.1 and -1.1000000000000001 is too small to be visible. Reports often limit -output to a certain number of decimal places, and if you round the -number to two or three or even eight decimal places, the error is -never apparent. However, for applications where it does matter, -it's a lot of work to implement your own custom arithmetic routines. - -Hence, the \class{Decimal} type was created. - -\subsection{The \class{Decimal} type} - -A new module, \module{decimal}, was added to Python's standard -library. It contains two classes, \class{Decimal} and -\class{Context}. \class{Decimal} instances represent numbers, and -\class{Context} instances are used to wrap up various settings such as -the precision and default rounding mode. - -\class{Decimal} instances are immutable, like regular Python integers -and FP numbers; once it's been created, you can't change the value an -instance represents. 
\class{Decimal} instances can be created from -integers or strings: - -\begin{verbatim} ->>> import decimal ->>> decimal.Decimal(1972) -Decimal("1972") ->>> decimal.Decimal("1.1") -Decimal("1.1") -\end{verbatim} - -You can also provide tuples containing the sign, the mantissa represented -as a tuple of decimal digits, and the exponent: - -\begin{verbatim} ->>> decimal.Decimal((1, (1, 4, 7, 5), -2)) -Decimal("-14.75") -\end{verbatim} - -Cautionary note: the sign bit is a Boolean value, so 0 is positive and -1 is negative. - -Converting from floating-point numbers poses a bit of a problem: -should the FP number representing 1.1 turn into the decimal number for -exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced? -The decision was to dodge the issue and leave such a conversion out of -the API. Instead, you should convert the floating-point number into a -string using the desired precision and pass the string to the -\class{Decimal} constructor: - -\begin{verbatim} ->>> f = 1.1 ->>> decimal.Decimal(str(f)) -Decimal("1.1") ->>> decimal.Decimal('%.12f' % f) -Decimal("1.100000000000") -\end{verbatim} - -Once you have \class{Decimal} instances, you can perform the usual -mathematical operations on them. One limitation: exponentiation -requires an integer exponent: - -\begin{verbatim} ->>> a = decimal.Decimal('35.72') ->>> b = decimal.Decimal('1.73') ->>> a+b -Decimal("37.45") ->>> a-b -Decimal("33.99") ->>> a*b -Decimal("61.7956") ->>> a/b -Decimal("20.64739884393063583815028902") ->>> a ** 2 -Decimal("1275.9184") ->>> a**b -Traceback (most recent call last): - ... -decimal.InvalidOperation: x ** (non-integer) -\end{verbatim} - -You can combine \class{Decimal} instances with integers, but not with -floating-point numbers: - -\begin{verbatim} ->>> a + 4 -Decimal("39.72") ->>> a + 4.5 -Traceback (most recent call last): - ... -TypeError: You can interact Decimal only with int, long or Decimal data types. 
->>> -\end{verbatim} - -\class{Decimal} numbers can be used with the \module{math} and -\module{cmath} modules, but note that they'll be immediately converted to -floating-point numbers before the operation is performed, resulting in -a possible loss of precision and accuracy. You'll also get back a -regular floating-point number and not a \class{Decimal}. - -\begin{verbatim} ->>> import math, cmath ->>> d = decimal.Decimal('123456789012.345') ->>> math.sqrt(d) -351364.18288201344 ->>> cmath.sqrt(-d) -351364.18288201344j -\end{verbatim} - -\class{Decimal} instances have a \method{sqrt()} method that -returns a \class{Decimal}, but if you need other things such as -trigonometric functions you'll have to implement them. - -\begin{verbatim} ->>> d.sqrt() -Decimal("351364.1828820134592177245001") -\end{verbatim} - - -\subsection{The \class{Context} type} - -Instances of the \class{Context} class encapsulate several settings for -decimal operations: - -\begin{itemize} - \item \member{prec} is the precision, the number of significant digits - kept in results. - \item \member{rounding} specifies the rounding mode. The \module{decimal} - module has constants for the various possibilities: - \constant{ROUND_DOWN}, \constant{ROUND_CEILING}, - \constant{ROUND_HALF_EVEN}, and various others. - \item \member{traps} is a dictionary specifying what happens on -encountering certain error conditions: either an exception is raised or -a value is returned. Some examples of error conditions are -division by zero, loss of precision, and overflow. -\end{itemize} - -There's a thread-local default context available by calling -\function{getcontext()}; you can change the properties of this context -to alter the default precision, rounding, or trap handling. 
The -following example shows the effect of changing the precision of the default -context: - -\begin{verbatim} ->>> decimal.getcontext().prec -28 ->>> decimal.Decimal(1) / decimal.Decimal(7) -Decimal("0.1428571428571428571428571429") ->>> decimal.getcontext().prec = 9 ->>> decimal.Decimal(1) / decimal.Decimal(7) -Decimal("0.142857143") -\end{verbatim} - -The default action for error conditions is selectable; the module can -either return a special value such as infinity or not-a-number, or -exceptions can be raised: - -\begin{verbatim} ->>> decimal.Decimal(1) / decimal.Decimal(0) -Traceback (most recent call last): - ... -decimal.DivisionByZero: x / 0 ->>> decimal.getcontext().traps[decimal.DivisionByZero] = False ->>> decimal.Decimal(1) / decimal.Decimal(0) -Decimal("Infinity") ->>> -\end{verbatim} - -The \class{Context} instance also has various methods for formatting -numbers such as \method{to_eng_string()} and \method{to_sci_string()}. - -For more information, see the documentation for the \module{decimal} -module, which includes a quick-start tutorial and a reference. - -\begin{seealso} -\seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented - by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.} - -\seeurl{http://research.microsoft.com/\textasciitilde hollasch/cgindex/coding/ieeefloat.html} -{A more detailed overview of the IEEE-754 representation.} - -\seeurl{http://www.lahey.com/float.htm} -{The article uses Fortran code to illustrate many of the problems -that floating-point inaccuracy can cause.} - -\seeurl{http://www2.hursley.ibm.com/decimal/} -{A description of a decimal-based representation. This representation -is being proposed as a standard, and underlies the new Python decimal -type. 
Much of this material was written by Mike Cowlishaw, designer of the -Rexx language.} - -\end{seealso} - - -%====================================================================== -\section{PEP 328: Multi-line Imports} - -One language change is a small syntactic tweak aimed at making it -easier to import many names from a module. In a -\code{from \var{module} import \var{names}} statement, -\var{names} is a sequence of names separated by commas. If the sequence is -very long, you can either write multiple imports from the same module, -or you can use backslashes to escape the line endings like this: - -\begin{verbatim} -from SimpleXMLRPCServer import SimpleXMLRPCServer,\ - SimpleXMLRPCRequestHandler,\ - CGIXMLRPCRequestHandler,\ - resolve_dotted_attribute -\end{verbatim} - -The syntactic change in Python 2.4 simply allows putting the names -within parentheses. Python ignores newlines within a parenthesized -expression, so the backslashes are no longer needed: - -\begin{verbatim} -from SimpleXMLRPCServer import (SimpleXMLRPCServer, - SimpleXMLRPCRequestHandler, - CGIXMLRPCRequestHandler, - resolve_dotted_attribute) -\end{verbatim} - -The PEP also proposes that all \keyword{import} statements be absolute -imports, with a leading \samp{.} character to indicate a relative -import. This part of the PEP was not implemented for Python 2.4, -but was completed for Python 2.5. - -\begin{seealso} -\seepep{328}{Imports: Multi-Line and Absolute/Relative} - {Written by Aahz. Multi-line imports were implemented by - Dima Dorfman.} -\end{seealso} - - -%====================================================================== -\section{PEP 331: Locale-Independent Float/String Conversions} - -The \module{locale} module lets Python software select various -conversions and display conventions that are localized to a particular -country or language. 
However, the module was careful not to change -the numeric locale because various functions in Python's -implementation required that the numeric locale remain set to the -\code{'C'} locale. Often this was because the code was using the C library's -\cfunction{atof()} function. - -Not setting the numeric locale caused trouble for extensions that used -third-party C libraries, however, because they wouldn't have the -correct locale set. The motivating example was GTK+, whose user -interface widgets weren't displaying numbers in the current locale. - -The solution described in the PEP is to add three new functions to the -Python API that perform ASCII-only conversions, ignoring the locale -setting: - -\begin{itemize} - \item \cfunction{PyOS_ascii_strtod(\var{str}, \var{ptr})} -and \cfunction{PyOS_ascii_atof(\var{str})} -both convert a string to a C \ctype{double}. - \item \cfunction{PyOS_ascii_formatd(\var{buffer}, \var{buf_len}, \var{format}, \var{d})} converts a \ctype{double} to an ASCII string. -\end{itemize} - -The code for these functions came from the GLib library -(\url{http://developer.gnome.org/arch/gtk/glib.html}), whose -developers kindly relicensed the relevant functions and donated them -to the Python Software Foundation. The \module{locale} module -can now change the numeric locale, letting extensions such as GTK+ -produce the correct results. - -\begin{seealso} -\seepep{331}{Locale-Independent Float/String Conversions} -{Written by Christian R. Reis, and implemented by Gustavo Carneiro.} -\end{seealso} - -%====================================================================== -\section{Other Language Changes} - -Here are all of the changes that Python 2.4 makes to the core Python -language. - -\begin{itemize} - -\item Decorators for functions and methods were added (\pep{318}). - -\item Built-in \function{set} and \function{frozenset} types were -added (\pep{218}). 
Other new built-ins include the \function{reversed(\var{seq})} function (\pep{322}). - -\item Generator expressions were added (\pep{289}). - -\item Certain numeric expressions no longer return values restricted to 32 or 64 bits (\pep{237}). - -\item You can now put parentheses around the list of names in a -\code{from \var{module} import \var{names}} statement (\pep{328}). - -\item The \method{dict.update()} method now accepts the same -argument forms as the \class{dict} constructor. This includes any -mapping, any iterable of key/value pairs, and keyword arguments. -(Contributed by Raymond Hettinger.) - -\item The string methods \method{ljust()}, \method{rjust()}, and -\method{center()} now take an optional argument for specifying a -fill character other than a space. -(Contributed by Raymond Hettinger.) - -\item Strings also gained an \method{rsplit()} method that -works like the \method{split()} method but splits from the end of -the string. -(Contributed by Sean Reifschneider.) - -\begin{verbatim} ->>> 'www.python.org'.split('.', 1) -['www', 'python.org'] ->>> 'www.python.org'.rsplit('.', 1) -['www.python', 'org'] -\end{verbatim} - -\item Three keyword parameters, \var{cmp}, \var{key}, and -\var{reverse}, were added to the \method{sort()} method of lists. -These parameters make some common usages of \method{sort()} simpler. -All of these parameters are optional. - -For the \var{cmp} parameter, the value should be a comparison function -that takes two parameters and returns -1, 0, or +1 depending on how -the parameters compare. This function will then be used to sort the -list. Previously this was the only parameter that could be provided -to \method{sort()}. - -\var{key} should be a single-parameter function that takes a list -element and returns a comparison key for the element. The list is -then sorted using the comparison keys. 
The following example sorts a -list case-insensitively: - -\begin{verbatim} ->>> L = ['A', 'b', 'c', 'D'] ->>> L.sort() # Case-sensitive sort ->>> L -['A', 'D', 'b', 'c'] ->>> # Using 'key' parameter to sort list ->>> L.sort(key=lambda x: x.lower()) ->>> L -['A', 'b', 'c', 'D'] ->>> # Old-fashioned way ->>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower())) ->>> L -['A', 'b', 'c', 'D'] -\end{verbatim} - -The last example, which uses the \var{cmp} parameter, is the old way -to perform a case-insensitive sort. It works but is slower than using -a \var{key} parameter. Using \var{key} calls the \method{lower()} method -once for each element in the list, while using \var{cmp} calls it -twice for each comparison, so using \var{key} saves on invocations of -the \method{lower()} method. - -For simple key functions and comparison functions, it is often -possible to avoid a \keyword{lambda} expression by using an unbound -method instead. For example, the above case-insensitive sort is best -written as: - -\begin{verbatim} ->>> L.sort(key=str.lower) ->>> L -['A', 'b', 'c', 'D'] -\end{verbatim} - -Finally, the \var{reverse} parameter takes a Boolean value. If the -value is true, the list will be sorted into reverse order. -Instead of \code{L.sort() ; L.reverse()}, you can now write -\code{L.sort(reverse=True)}. - -The results of sorting are now guaranteed to be stable. This means -that two entries with equal keys will be returned in the same order as -they were input. For example, you can sort a list of people by name, -and then sort the list by age, resulting in a list sorted by age where -people with the same age are in name-sorted order. - -(All changes to \method{sort()} contributed by Raymond Hettinger.) - -\item There is a new built-in function -\function{sorted(\var{iterable})} that works like the in-place -\method{list.sort()} method but can be used in -expressions. 
The differences are: - \begin{itemize} - \item the input may be any iterable; - \item a newly formed copy is sorted, leaving the original intact; and - \item the expression returns the new sorted copy. - \end{itemize} - -\begin{verbatim} ->>> L = [9,7,8,3,2,4,1,6,5] ->>> [10+i for i in sorted(L)] # usable in a list comprehension -[11, 12, 13, 14, 15, 16, 17, 18, 19] ->>> L # original is left unchanged -[9, 7, 8, 3, 2, 4, 1, 6, 5] ->>> sorted('Monty Python') # any iterable may be an input -[' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y'] - ->>> # List the contents of a dict sorted by key ->>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5) ->>> for k, v in sorted(colormap.iteritems()): -... print k, v -... -black 4 -blue 2 -green 3 -red 1 -yellow 5 -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item Integer operations will no longer trigger an \exception{OverflowWarning}. -The \exception{OverflowWarning} warning will disappear in Python 2.5. - -\item The interpreter gained a new switch, \programopt{-m}, that -takes a name, searches for the corresponding module on \code{sys.path}, -and runs the module as a script. For example, -you can now run the Python profiler with \code{python -m profile}. -(Contributed by Nick Coghlan.) - -\item The \function{eval(\var{expr}, \var{globals}, \var{locals})} -and \function{execfile(\var{filename}, \var{globals}, \var{locals})} -functions and the \keyword{exec} statement now accept any mapping type -for the \var{locals} parameter. Previously this had to be a regular -Python dictionary. (Contributed by Raymond Hettinger.) - -\item The \function{zip()} built-in function and \function{itertools.izip()} - now return an empty list if called with no arguments. - Previously they raised a \exception{TypeError} - exception. This makes them more - suitable for use with variable length argument lists: - -\begin{verbatim} ->>> def transpose(array): -... return zip(*array) -... 
->>> transpose([(1,2,3), (4,5,6)]) -[(1, 4), (2, 5), (3, 6)] ->>> transpose([]) -[] -\end{verbatim} -(Contributed by Raymond Hettinger.) - -\item Encountering a failure while importing a module no longer leaves -a partially-initialized module object in \code{sys.modules}. The -incomplete module object left behind would fool further imports of the -same module into succeeding, leading to confusing errors. -(Fixed by Tim Peters.) - -\item \constant{None} is now a constant; code that binds a new value to -the name \samp{None} is now a syntax error. -(Contributed by Raymond Hettinger.) - -\end{itemize} - - -%====================================================================== -\subsection{Optimizations} - -\begin{itemize} - -\item The inner loops for list and tuple slicing - were optimized and now run about one-third faster. The inner loops - for dictionaries were also optimized, resulting in performance boosts for - \method{keys()}, \method{values()}, \method{items()}, - \method{iterkeys()}, \method{itervalues()}, and \method{iteritems()}. - (Contributed by Raymond Hettinger.) - -\item The machinery for growing and shrinking lists was optimized for - speed and for space efficiency. Appending and popping from lists now - runs faster due to more efficient code paths and less frequent use of - the underlying system \cfunction{realloc()}. List comprehensions - also benefit. \method{list.extend()} was also optimized and no - longer converts its argument into a temporary list before extending - the base list. (Contributed by Raymond Hettinger.) - -\item \function{list()}, \function{tuple()}, \function{map()}, - \function{filter()}, and \function{zip()} now run several times - faster with non-sequence arguments that supply a \method{__len__()} - method. (Contributed by Raymond Hettinger.) 
- -\item The methods \method{list.__getitem__()}, - \method{dict.__getitem__()}, and \method{dict.__contains__()} are - now implemented as \class{method_descriptor} objects rather - than \class{wrapper_descriptor} objects. This form of - access doubles their performance and makes them more suitable for - use as arguments to functionals: - \samp{map(mydict.__getitem__, keylist)}. - (Contributed by Raymond Hettinger.) - -\item Added a new opcode, \code{LIST_APPEND}, that simplifies - the generated bytecode for list comprehensions and speeds them up - by about a third. (Contributed by Raymond Hettinger.) - -\item The peephole bytecode optimizer has been improved to -produce shorter, faster bytecode; remarkably, the resulting bytecode is -more readable. (Enhanced by Raymond Hettinger.) - -\item String concatenations in statements of the form \code{s = s + -"abc"} and \code{s += "abc"} are now performed more efficiently in -certain circumstances. This optimization won't be present in other -Python implementations such as Jython, so you shouldn't rely on it; -using the \method{join()} method of strings is still recommended when -you want to efficiently glue a large number of strings together. -(Contributed by Armin Rigo.) - -\end{itemize} - -% pystone is almost useless for comparing different versions of Python; -% instead, it excels at predicting relative Python performance on -% different machines. -% So, this section would be more informative if it used other tools -% such as pybench and parrotbench. For a more application oriented -% benchmark, try comparing the timings of test_decimal.py under 2.3 -% and 2.4. - -The net result of the 2.4 optimizations is that Python 2.4 runs the -pystone benchmark around 5\% faster than Python 2.3 and 35\% faster -than Python 2.2. (pystone is not a particularly good benchmark, but -it's the most commonly used measurement of Python's performance. Your -own applications may show greater or smaller benefits from Python~2.4.)
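The string-concatenation item in the list above still recommends \method{join()} for gluing many strings together. A minimal sketch of the two approaches follows. The function names are illustrative and not taken from the article, and the code is written so that it also runs on current Python versions.

```python
def build_with_concat(parts):
    """Repeated +=; only fast where the interpreter optimizes it."""
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    """A single join() call; the portable way to glue many strings."""
    return "".join(parts)


words = ["spam"] * 1000
# Both approaches build the same string; only their cost may differ.
assert build_with_concat(words) == build_with_join(words)
```

Both functions return the same result. The difference is that the cost of the first depends on an implementation detail of the interpreter, while the second does not.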
- - -%====================================================================== -\section{New, Improved, and Deprecated Modules} - -As usual, Python's standard library received a number of enhancements and -bug fixes. Here's a partial list of the most notable changes, sorted -alphabetically by module name. Consult the -\file{Misc/NEWS} file in the source tree for a more -complete list of changes, or look through the CVS logs for all the -details. - -\begin{itemize} - -\item The \module{asyncore} module's \function{loop()} function now - has a \var{count} parameter that lets you perform a limited number - of passes through the polling loop. The default is still to loop - forever. - -\item The \module{base64} module now has more complete RFC 3548 support - for Base64, Base32, and Base16 encoding and decoding, including - optional case folding and optional alternative alphabets. - (Contributed by Barry Warsaw.) - -\item The \module{bisect} module now has an underlying C implementation - for improved performance. - (Contributed by Dmitry Vasiliev.) - -\item The CJKCodecs collection of East Asian codecs, maintained -by Hye-Shik Chang, was integrated into 2.4. -The new encodings are: - -\begin{itemize} - \item Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz - \item Chinese (ROC): big5, cp950 - \item Japanese: cp932, euc-jis-2004, euc-jp, -euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2, - iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004, - shift-jis, shift-jisx0213, shift-jis-2004 - \item Korean: cp949, euc-kr, johab, iso-2022-kr -\end{itemize} - -\item Some other new encodings were added: HP Roman8, -ISO_8859-11, ISO_8859-16, PTCP-154, and TIS-620. - -\item The UTF-8 and UTF-16 codecs now cope better with receiving partial input. -Previously the \class{StreamReader} class would try to read more data, -making it impossible to resume decoding from the stream.
The -\method{read()} method will now return as much data as it can and future -calls will resume decoding where previous ones left off. -(Implemented by Walter D\"orwald.) - -\item There is a new \module{collections} module for - various specialized collection datatypes. - Currently it contains just one type, \class{deque}, - a double-ended queue that supports efficiently adding and removing - elements from either end: - -\begin{verbatim} ->>> from collections import deque ->>> d = deque('ghi') # make a new deque with three items ->>> d.append('j') # add a new entry to the right side ->>> d.appendleft('f') # add a new entry to the left side ->>> d # show the representation of the deque -deque(['f', 'g', 'h', 'i', 'j']) ->>> d.pop() # return and remove the rightmost item -'j' ->>> d.popleft() # return and remove the leftmost item -'f' ->>> list(d) # list the contents of the deque -['g', 'h', 'i'] ->>> 'h' in d # search the deque -True -\end{verbatim} - -Several modules, such as the \module{Queue} and \module{threading} -modules, now take advantage of \class{collections.deque} for improved -performance. (Contributed by Raymond Hettinger.) - -\item The \module{ConfigParser} classes have been enhanced slightly. - The \method{read()} method now returns a list of the files that - were successfully parsed, and the \method{set()} method raises - \exception{TypeError} if passed a \var{value} argument that isn't a - string. (Contributed by John Belmonte and David Goodger.) - -\item The \module{curses} module now supports the ncurses extension - \function{use_default_colors()}. On platforms where the terminal - supports transparency, this makes it possible to use a transparent - background. (Contributed by J\"org Lehmann.) - -\item The \module{difflib} module now includes an \class{HtmlDiff} class -that creates an HTML table showing a side by side comparison -of two versions of a text. (Contributed by Dan Gass.) 
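As a rough illustration of the \class{HtmlDiff} class mentioned in the item above, the sketch below builds a side-by-side table for two short lists of lines. The sample data and variable names are made up for this example, and the code is written so that it also runs on current Python versions.

```python
import difflib

# Two versions of a small text, one string per line (sample data).
old = ["spam", "eggs", "ham"]
new = ["spam", "bacon", "ham"]

# make_table() returns the comparison as an HTML <table> fragment;
# make_file() would wrap the same table in a complete HTML page.
table = difflib.HtmlDiff().make_table(old, new,
                                      fromdesc="old", todesc="new")
```

The resulting string can be embedded in a larger HTML page or written to a file.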
- -\item The \module{email} package was updated to version 3.0, -which dropped various deprecated APIs and removed support for Python -versions earlier than 2.3. The 3.0 version of the package uses a new -incremental parser for MIME messages, available in the -\module{email.FeedParser} module. The new parser doesn't require -reading the entire message into memory, and doesn't throw exceptions -if a message is malformed; instead it records any problems in the -\member{defects} attribute of the message. (Developed by Anthony -Baxter, Barry Warsaw, Thomas Wouters, and others.) - -\item The \module{heapq} module has been converted to C. The resulting - tenfold improvement in speed makes the module suitable for handling - high volumes of data. In addition, the module has two new functions - \function{nlargest()} and \function{nsmallest()} that use heaps to - find the N largest or smallest values in a dataset without the - expense of a full sort. (Contributed by Raymond Hettinger.) - -\item The \module{httplib} module now contains constants for HTTP -status codes defined in various HTTP-related RFC documents. Constants -have names such as \constant{OK}, \constant{CREATED}, -\constant{CONTINUE}, and \constant{MOVED_PERMANENTLY}; use pydoc to -get a full list. (Contributed by Andrew Eland.) - -\item The \module{imaplib} module now supports IMAP's THREAD command -(contributed by Yves Dionne) and new \method{deleteacl()} and -\method{myrights()} methods (contributed by Arnaud Mazin). - -\item The \module{itertools} module gained a - \function{groupby(\var{iterable}\optional{, \var{func}})} function. - \var{iterable} is something that can be iterated over to return a - stream of elements, and the optional \var{func} parameter is a - function that takes an element and returns a key value; if omitted, - the key is simply the element itself.
\function{groupby()} then - groups the elements into subsequences which have matching values of - the key, and returns a series of 2-tuples containing the key value - and an iterator over the subsequence. - -Here's an example to make this clearer. The \var{key} function simply -returns whether a number is even or odd, so the result of -\function{groupby()} is to return consecutive runs of odd or even -numbers. - -\begin{verbatim} ->>> import itertools ->>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14] ->>> for key_val, it in itertools.groupby(L, lambda x: x % 2): -... print key_val, list(it) -... -0 [2, 4, 6] -1 [7] -0 [8] -1 [9, 11] -0 [12, 14] ->>> -\end{verbatim} - -\function{groupby()} is typically used with sorted input. The logic -for \function{groupby()} is similar to the \UNIX{} \code{uniq} filter -which makes it handy for eliminating, counting, or identifying -duplicate elements: - -\begin{verbatim} ->>> word = 'abracadabra' ->>> letters = sorted(word) # Turn string into a sorted list of letters ->>> letters -['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r'] ->>> for k, g in itertools.groupby(letters): -... print k, list(g) -... -a ['a', 'a', 'a', 'a', 'a'] -b ['b', 'b'] -c ['c'] -d ['d'] -r ['r', 'r'] ->>> # List unique letters ->>> [k for k, g in itertools.groupby(letters)] -['a', 'b', 'c', 'd', 'r'] ->>> # Count letter occurrences ->>> [(k, len(list(g))) for k, g in itertools.groupby(letters)] -[('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)] -\end{verbatim} - -(Contributed by Hye-Shik Chang.) - -\item \module{itertools} also gained a function named -\function{tee(\var{iterator}, \var{N})} that returns \var{N} independent -iterators that replicate \var{iterator}. If \var{N} is omitted, the -default is 2.
- -\begin{verbatim} ->>> L = [1,2,3] ->>> i1, i2 = itertools.tee(L) ->>> i1,i2 -(<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>) ->>> list(i1) # Run the first iterator to exhaustion -[1, 2, 3] ->>> list(i2) # Run the second iterator to exhaustion -[1, 2, 3] -\end{verbatim} - -Note that \function{tee()} has to keep copies of the values returned -by the iterator; in the worst case, it may need to keep all of them. -This should therefore be used carefully if the leading iterator -can run far ahead of the trailing iterator in a long stream of inputs. -If the separation is large, then you might as well use -\function{list()} instead. When the iterators track closely with one -another, \function{tee()} is ideal. Possible applications include -bookmarking, windowing, or lookahead iterators. -(Contributed by Raymond Hettinger.) - -\item A number of functions were added to the \module{locale} -module, such as \function{bind_textdomain_codeset()} to specify a -particular encoding and a family of \function{l*gettext()} functions -that return messages in the chosen encoding. -(Contributed by Gustavo Niemeyer.) - -\item Some keyword arguments were added to the \module{logging} -package's \function{basicConfig()} function to simplify log -configuration. The default behavior is to log messages to standard -error, but various keyword arguments can be specified to log to a -particular file, change the logging format, or set the logging level. -For example: - -\begin{verbatim} -import logging -logging.basicConfig(filename='/var/log/application.log', - level=0, # Log all messages - format='%(levelname)s:%(process)d:%(thread)d:%(message)s') -\end{verbatim} - -Other additions to the \module{logging} package include a -\method{log(\var{level}, \var{msg})} convenience method, as well as a -\class{TimedRotatingFileHandler} class that rotates its log files at a -timed interval.
The module already had \class{RotatingFileHandler}, -which rotated logs once the file exceeded a certain size. Both -classes derive from a new \class{BaseRotatingHandler} class that can -be used to implement other rotating handlers. - -(Changes implemented by Vinay Sajip.) - -\item The \module{marshal} module now shares interned strings on unpacking a -data structure. This may shrink the size of certain pickle strings, -but the primary effect is to make \file{.pyc} files significantly smaller. -(Contributed by Martin von~L\"owis.) - -\item The \module{nntplib} module's \class{NNTP} class gained -\method{description()} and \method{descriptions()} methods to retrieve -newsgroup descriptions for a single group or for a range of groups. -(Contributed by J\"urgen A. Erhard.) - -\item Two new functions were added to the \module{operator} module, -\function{attrgetter(\var{attr})} and \function{itemgetter(\var{index})}. -Both functions return callables that take a single argument and return -the corresponding attribute or item; these callables make excellent -data extractors when used with \function{map()} or -\function{sorted()}. For example: - -\begin{verbatim} ->>> import operator ->>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)] ->>> map(operator.itemgetter(0), L) -['c', 'd', 'a', 'b'] ->>> map(operator.itemgetter(1), L) -[2, 1, 4, 3] ->>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item -[('d', 1), ('c', 2), ('b', 3), ('a', 4)] -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item The \module{optparse} module was updated in various ways. The -module now passes its messages through \function{gettext.gettext()}, -making it possible to internationalize Optik's help and error -messages. Help messages for options can now include the string -\code{'\%default'}, which will be replaced by the option's default -value. (Contributed by Greg Ward.)
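The \code{'\%default'} substitution described in the \module{optparse} item above can be sketched as follows. The program name, option name, and default value are invented for this example, and the code is written so that it also runs on current Python versions.

```python
from optparse import OptionParser

# Hypothetical command-line tool with a single integer option.
parser = OptionParser(prog="demo")
parser.add_option("-n", "--count", type="int", default=3,
                  help="number of passes to make [default: %default]")

# format_help() returns the help text with %default already expanded
# to the option's default value.
help_text = parser.format_help()
```

Keeping the default in one place means the help text cannot drift out of step with the value actually used.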
- -\item The long-term plan is to deprecate the \module{rfc822} module -in some future Python release in favor of the \module{email} package. -To this end, the \function{email.Utils.formatdate()} function has been -changed to make it usable as a replacement for -\function{rfc822.formatdate()}. You may want to write new e-mail -processing code with this in mind. (Change implemented by Anthony -Baxter.) - -\item A new \function{urandom(\var{n})} function was added to the -\module{os} module, returning a string containing \var{n} bytes of -random data. This function provides access to platform-specific -sources of randomness such as \file{/dev/urandom} on Linux or the -Windows CryptoAPI. (Contributed by Trevor Perrin.) - -\item Another new function: \function{os.path.lexists(\var{path})} -returns true if the file specified by \var{path} exists, whether or -not it's a symbolic link. This differs from the existing -\function{os.path.exists(\var{path})} function, which returns false if -\var{path} is a symlink that points to a destination that doesn't exist. -(Contributed by Beni Cherniavsky.) - -\item A new \function{getsid()} function was added to the -\module{posix} module that underlies the \module{os} module. -(Contributed by J. Raynor.) - -\item The \module{poplib} module now supports POP over SSL. (Contributed by -Hector Urtubia.) - -\item The \module{profile} module can now profile C extension functions. -(Contributed by Nick Bastin.) - -\item The \module{random} module has a new method called - \method{getrandbits(\var{N})} that returns a long integer \var{N} - bits in length. The existing \method{randrange()} method now uses - \method{getrandbits()} where appropriate, making generation of - arbitrarily large random numbers more efficient. (Contributed by - Raymond Hettinger.) - -\item The regular expression language accepted by the \module{re} module - was extended with simple conditional expressions, written as - \regexp{(?(\var{group})\var{A}|\var{B})}. 
\var{group} is either a - numeric group ID or a group name defined with \regexp{(?P<group>...)} - earlier in the expression. If the specified group matched, the - regular expression pattern \var{A} will be tested against the string; if - the group didn't match, the pattern \var{B} will be used instead. - (Contributed by Gustavo Niemeyer.) - -\item The \module{re} module is also no longer recursive, thanks to a -massive amount of work by Gustavo Niemeyer. In a recursive regular -expression engine, certain patterns result in a large amount of C -stack space being consumed, and it was possible to overflow the stack. -For example, if you matched a 30000-byte string of \samp{a} characters -against the expression \regexp{(a|b)+}, one stack frame was consumed -per character. Python 2.3 tried to check for stack overflow and raise -a \exception{RuntimeError} exception, but certain patterns could -sidestep the checking and if you were unlucky Python could segfault. -Python 2.4's regular expression engine can match this pattern without -problems. - -\item The \module{signal} module now performs tighter error-checking -on the parameters to the \function{signal.signal()} function. For -example, you can't set a handler on the \constant{SIGKILL} signal; -previous versions of Python would quietly accept this, but 2.4 will -raise a \exception{RuntimeError} exception. - -\item Two new functions were added to the \module{socket} module. -\function{socketpair()} returns a pair of connected sockets and -\function{getservbyport(\var{port})} looks up the service name for a -given port number. (Contributed by Dave Cole and Barry Warsaw.) - -\item The \function{sys.exitfunc()} function has been deprecated. Code -should be using the existing \module{atexit} module, which correctly -handles calling multiple exit functions. Eventually -\function{sys.exitfunc()} will become a purely internal interface, -accessed only by \module{atexit}. 
- -\item The \module{tarfile} module now generates GNU-format tar files -by default. (Contributed by Lars Gustaebel.) - -\item The \module{threading} module now has an elegantly simple way to support -thread-local data. The module contains a \class{local} class whose -attribute values are local to different threads. - -\begin{verbatim} -import threading - -data = threading.local() -data.number = 42 -data.url = ('www.python.org', 80) -\end{verbatim} - -Other threads can assign and retrieve their own values for the -\member{number} and \member{url} attributes. You can subclass -\class{local} to initialize attributes or to add methods. -(Contributed by Jim Fulton.) - -\item The \module{timeit} module now automatically disables periodic - garbage collection during the timing loop. This change makes - consecutive timings more comparable. (Contributed by Raymond Hettinger.) - -\item The \module{weakref} module now supports a wider variety of objects - including Python functions, class instances, sets, frozensets, deques, - arrays, files, sockets, and regular expression pattern objects. - (Contributed by Raymond Hettinger.) - -\item The \module{xmlrpclib} module now supports a multi-call extension for -transmitting multiple XML-RPC calls in a single HTTP operation. -(Contributed by Brian Quinlan.) - -\item The \module{mpz}, \module{rotor}, and \module{xreadlines} modules have -been removed. - -\end{itemize} - - -%====================================================================== -% whole new modules get described in subsections here - -%===================== -\subsection{cookielib} - -The \module{cookielib} library supports client-side handling for HTTP -cookies, mirroring the \module{Cookie} module's server-side cookie -support. Cookies are stored in cookie jars; the library transparently -stores cookies offered by the web server in the cookie jar, and -fetches the cookie from the jar when connecting to the server. 
As in -web browsers, policy objects control whether cookies are accepted or -not. - -In order to store cookies across sessions, two implementations of -cookie jars are provided: one that stores cookies in the Netscape -format so applications can use the Mozilla or Lynx cookie files, and -one that stores cookies in the same format as the Perl libwww library. - -\module{urllib2} has been changed to interact with \module{cookielib}: -\class{HTTPCookieProcessor} manages a cookie jar that is used when -accessing URLs. - -This module was contributed by John J. Lee. - - -% ================== -\subsection{doctest} - -The \module{doctest} module underwent considerable refactoring thanks -to Edward Loper and Tim Peters. Testing can still be as simple as -running \function{doctest.testmod()}, but the refactorings allow -customizing the module's operation in various ways. - -The new \class{DocTestFinder} class extracts the tests from a given -object's docstrings: - -\begin{verbatim} -import doctest - -def f (x, y): - """>>> f(2,2) -4 ->>> f(3,2) -6 - """ - return x*y - -finder = doctest.DocTestFinder() - -# Get list of DocTest instances -tests = finder.find(f) -\end{verbatim} - -The new \class{DocTestRunner} class then runs individual tests and can -produce a summary of the results: - -\begin{verbatim} -runner = doctest.DocTestRunner() -for t in tests: - tried, failed = runner.run(t) - -runner.summarize(verbose=1) -\end{verbatim} - -The above example produces the following output: - -\begin{verbatim} -1 items passed all tests: - 2 tests in f -2 tests in 1 items. -2 passed and 0 failed. -Test passed. -\end{verbatim} - -\class{DocTestRunner} uses an instance of the \class{OutputChecker} -class to compare the expected output with the actual output. This -class takes a number of different flags that customize its behaviour; -ambitious users can also write a completely new subclass of -\class{OutputChecker}. - -The default output checker provides a number of handy features.
-For example, with the \constant{doctest.ELLIPSIS} option flag, -an ellipsis (\samp{...}) in the expected output matches any substring, -making it easier to accommodate outputs that vary in minor ways: - -\begin{verbatim} -def o (n): - """>>> o(1) -<__main__.C instance at 0x...> ->>> -""" -\end{verbatim} - -Another special string, \samp{<BLANKLINE>}, matches a blank line: - -\begin{verbatim} -def p (n): - """>>> p(1) -<BLANKLINE> ->>> -""" -\end{verbatim} - -Another new capability is producing a diff-style display of the output -by specifying the \constant{doctest.REPORT_UDIFF} (unified diffs), -\constant{doctest.REPORT_CDIFF} (context diffs), or -\constant{doctest.REPORT_NDIFF} (delta-style) option flags. For example: - -\begin{verbatim} -def g (n): - """>>> g(4) -here -is -a -lengthy ->>>""" - L = 'here is a rather lengthy list of words'.split() - for word in L[:n]: - print word -\end{verbatim} - -Running the above function's tests with -\constant{doctest.REPORT_UDIFF} specified, you get the following output: - -\begin{verbatim} -********************************************************************** -File ``t.py'', line 15, in g -Failed example: - g(4) -Differences (unified diff with -expected +actual): - @@ -2,3 +2,3 @@ - is - a - -lengthy - +rather -********************************************************************** -\end{verbatim} - - -% ====================================================================== -\section{Build and C API Changes} - -Some of the changes to Python's build process and to the C API are: - -\begin{itemize} - - \item Three new convenience macros were added for common return - values from extension functions: \csimplemacro{Py_RETURN_NONE}, - \csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}. - (Contributed by Brett Cannon.) - - \item Another new macro, \csimplemacro{Py_CLEAR(\var{obj})}, - decreases the reference count of \var{obj} and sets \var{obj} to the - null pointer. (Contributed by Jim Fulton.) 
- - \item A new function, \cfunction{PyTuple_Pack(\var{N}, \var{obj1}, - \var{obj2}, ..., \var{objN})}, constructs tuples from a variable - length argument list of Python objects. (Contributed by Raymond Hettinger.) - - \item A new function, \cfunction{PyDict_Contains(\var{d}, \var{k})}, - implements fast dictionary lookups without masking exceptions raised - during the look-up process. (Contributed by Raymond Hettinger.) - - \item The \csimplemacro{Py_IS_NAN(\var{X})} macro returns 1 if - its float or double argument \var{X} is a NaN. - (Contributed by Tim Peters.) - - \item C code can avoid unnecessary locking by using the new - \cfunction{PyEval_ThreadsInitialized()} function to tell - if any thread operations have been performed. If this function - returns false, no lock operations are needed. - (Contributed by Nick Coghlan.) - - \item A new function, \cfunction{PyArg_VaParseTupleAndKeywords()}, - is the same as \cfunction{PyArg_ParseTupleAndKeywords()} but takes a - \ctype{va_list} instead of a number of arguments. - (Contributed by Greg Chapman.) - - \item A new method flag, \constant{METH_COEXISTS}, allows a function - defined in slots to co-exist with a \ctype{PyCFunction} having the - same name. This can halve the access time for a method such as - \method{set.__contains__()}. (Contributed by Raymond Hettinger.) - - \item Python can now be built with additional profiling for the - interpreter itself, intended as an aid to people developing the - Python core. Providing \longprogramopt{--enable-profiling} to the - \program{configure} script will let you profile the interpreter with - \program{gprof}, and providing the \longprogramopt{--with-tsc} - switch enables profiling using the Pentium's Time-Stamp-Counter - register. Note that the \longprogramopt{--with-tsc} switch is slightly - misnamed, because the profiling feature also works on the PowerPC - platform, though that processor architecture doesn't call that - register ``the TSC register''. 
(Contributed by Jeremy Hylton.) - - \item The \ctype{tracebackobject} type has been renamed to \ctype{PyTracebackObject}. - -\end{itemize} - - -%====================================================================== -\subsection{Port-Specific Changes} - -\begin{itemize} - -\item The Windows port now builds under MSVC++ 7.1 as well as version 6. - (Contributed by Martin von~L\"owis.) - -\end{itemize} - - - -%====================================================================== -\section{Porting to Python 2.4} - -This section lists previously described changes that may require -changes to your code: - -\begin{itemize} - -\item Left shifts and hexadecimal/octal constants that are too - large no longer trigger a \exception{FutureWarning} and return - a value limited to 32 or 64 bits; instead they return a long integer. - -\item Integer operations will no longer trigger an \exception{OverflowWarning}. -The \exception{OverflowWarning} warning will disappear in Python 2.5. - -\item The \function{zip()} built-in function and \function{itertools.izip()} - now return an empty list instead of raising a \exception{TypeError} - exception if called with no arguments. - -\item You can no longer compare the \class{date} and \class{datetime} - instances provided by the \module{datetime} module. Two - instances of different classes will now always be unequal, and - relative comparisons (\code{<}, \code{>}) will raise a \exception{TypeError}. - -\item \function{dircache.listdir()} now passes exceptions to the caller - instead of returning empty lists. - -\item \function{LexicalHandler.startDTD()} used to receive the public and - system IDs in the wrong order. This has been corrected; applications - relying on the wrong order need to be fixed. - -\item \function{fcntl.ioctl} now warns if the \var{mutate} - argument is omitted and relevant. - -\item The \module{tarfile} module now generates GNU-format tar files -by default. 
- -\item Encountering a failure while importing a module no longer leaves -a partially-initialized module object in \code{sys.modules}. - -\item \constant{None} is now a constant; code that binds a new value to -the name \samp{None} is now a syntax error. - -\item The \function{signal.signal()} function now raises a -\exception{RuntimeError} exception for certain illegal values; -previously these errors would pass silently. For example, you can no -longer set a handler on the \constant{SIGKILL} signal. - -\end{itemize} - - -%====================================================================== -\section{Acknowledgements \label{acks}} - -The author would like to thank the following people for offering -suggestions, corrections and assistance with various drafts of this -article: Koray Can, Hye-Shik Chang, Michael Dyck, Raymond Hettinger, -Brian Hurt, Hamish Lawson, Fredrik Lundh, Sean Reifschneider, -Sadruddin Rejeb. - -\end{document} diff --git a/Doc/whatsnew/whatsnew25.tex b/Doc/whatsnew/whatsnew25.tex deleted file mode 100644 index b6bac49..0000000 --- a/Doc/whatsnew/whatsnew25.tex +++ /dev/null @@ -1,2539 +0,0 @@ -\documentclass{howto} -\usepackage{distutils} -% $Id$ - -% Fix XXX comments - -\title{What's New in Python 2.5} -\release{1.01} -\author{A.M. Kuchling} -\authoraddress{\email{amk@amk.ca}} - -\begin{document} -\maketitle -\tableofcontents - -This article explains the new features in Python 2.5. The final -release of Python 2.5 is scheduled for August 2006; -\pep{356} describes the planned release schedule. - -The changes in Python 2.5 are an interesting mix of language and -library improvements. The library enhancements will be more important -to Python's user community, I think, because several widely-useful -packages were added.
New modules include ElementTree for XML -processing (section~\ref{module-etree}), the SQLite database module -(section~\ref{module-sqlite}), and the \module{ctypes} module for -calling C functions (section~\ref{module-ctypes}). - -The language changes are of middling significance. Some pleasant new -features were added, but most of them aren't features that you'll use -every day. Conditional expressions were finally added to the language -using a novel syntax; see section~\ref{pep-308}. The new -'\keyword{with}' statement will make writing cleanup code easier -(section~\ref{pep-343}). Values can now be passed into generators -(section~\ref{pep-342}). Imports are now visible as either absolute -or relative (section~\ref{pep-328}). Some corner cases of exception -handling are handled better (section~\ref{pep-341}). All these -improvements are worthwhile, but they're improvements to one specific -language feature or another; none of them are broad modifications to -Python's semantics. - -As well as the language and library additions, other improvements and -bugfixes were made throughout the source tree. A search through the -SVN change logs finds there were 353 patches applied and 458 bugs -fixed between Python 2.4 and 2.5. (Both figures are likely to be -underestimates.) - -This article doesn't try to be a complete specification of the new -features; instead changes are briefly introduced using helpful -examples. For full details, you should always refer to the -documentation for Python 2.5 at \url{http://docs.python.org}. -If you want to understand the complete implementation and design -rationale, refer to the PEP for a particular new feature. - -Comments, suggestions, and error reports for this document are -welcome; please e-mail them to the author or open a bug in the Python -bug tracker. 
- -%====================================================================== -\section{PEP 308: Conditional Expressions\label{pep-308}} - -For a long time, people have been requesting a way to write -conditional expressions, which are expressions that return value A or -value B depending on whether a Boolean value is true or false. A -conditional expression lets you write a single assignment statement -that has the same effect as the following: - -\begin{verbatim} -if condition: - x = true_value -else: - x = false_value -\end{verbatim} - -There have been endless tedious discussions of syntax on both -python-dev and comp.lang.python. A vote was even held that found the -majority of voters wanted conditional expressions in some form, -but there was no syntax that was preferred by a clear majority. -Candidates included C's \code{cond ? true_v : false_v}, -\code{if cond then true_v else false_v}, and 16 other variations. - -Guido van~Rossum eventually chose a surprising syntax: - -\begin{verbatim} -x = true_value if condition else false_value -\end{verbatim} - -Evaluation is still lazy as in existing Boolean expressions, so the -order of evaluation jumps around a bit. The \var{condition} -expression in the middle is evaluated first, and the \var{true_value} -expression is evaluated only if the condition was true. Similarly, -the \var{false_value} expression is only evaluated when the condition -is false. - -This syntax may seem strange and backwards; why does the condition go -in the \emph{middle} of the expression, and not in the front as in C's -\code{c ? x : y}? The decision was checked by applying the new syntax -to the modules in the standard library and seeing how the resulting -code read. In many cases where a conditional expression is used, one -value seems to be the 'common case' and one value is an 'exceptional -case', used only on rarer occasions when the condition isn't met. 
The -conditional syntax makes this pattern a bit more obvious: - -\begin{verbatim} -contents = ((doc + '\n') if doc else '') -\end{verbatim} - -I read the above statement as meaning ``here \var{contents} is -usually assigned a value of \code{doc+'\e n'}; sometimes -\var{doc} is empty, in which special case an empty string is returned.'' -I doubt I will use conditional expressions very often where there -isn't a clear common and uncommon case. - -There was some discussion of whether the language should require -surrounding conditional expressions with parentheses. The decision -was made to \emph{not} require parentheses in the Python language's -grammar, but as a matter of style I think you should always use them. -Consider these two statements: - -\begin{verbatim} -# First version -- no parens -level = 1 if logging else 0 - -# Second version -- with parens -level = (1 if logging else 0) -\end{verbatim} - -In the first version, I think a reader's eye might group the statement -into 'level = 1', 'if logging', 'else 0', and think that the condition -decides whether the assignment to \var{level} is performed. The -second version reads better, in my opinion, because it makes it clear -that the assignment is always performed and the choice is being made -between two values. - -Another reason for including the brackets: a few odd combinations of -list comprehensions and lambdas could look like incorrect conditional -expressions. See \pep{308} for some examples. If you put parentheses -around your conditional expressions, you won't run into this case. - - -\begin{seealso} - -\seepep{308}{Conditional Expressions}{PEP written by -Guido van~Rossum and Raymond D. Hettinger; implemented by Thomas -Wouters.} - -\end{seealso} - - -%====================================================================== -\section{PEP 309: Partial Function Application\label{pep-309}} - -The \module{functools} module is intended to contain tools for -functional-style programming. 
- -One useful tool in this module is the \function{partial()} function. -For programs written in a functional style, you'll sometimes want to -construct variants of existing functions that have some of the -parameters filled in. Consider a Python function \code{f(a, b, c)}; -you could create a new function \code{g(b, c)} that was equivalent to -\code{f(1, b, c)}. This is called ``partial function application''. - -\function{partial} takes the arguments -\code{(\var{function}, \var{arg1}, \var{arg2}, ... -\var{kwarg1}=\var{value1}, \var{kwarg2}=\var{value2})}. The resulting -object is callable, so you can just call it to invoke \var{function} -with the filled-in arguments. - -Here's a small but realistic example: - -\begin{verbatim} -import functools - -def log (message, subsystem): - "Write the contents of 'message' to the specified subsystem." - print '%s: %s' % (subsystem, message) - ... - -server_log = functools.partial(log, subsystem='server') -server_log('Unable to open socket') -\end{verbatim} - -Here's another example, from a program that uses PyGTK. Here a -context-sensitive pop-up menu is being constructed dynamically. The -callback provided for the menu option is a partially applied version -of the \method{open_item()} method, where the first argument has been -provided. - -\begin{verbatim} -... -class Application: - def open_item(self, path): - ... - def init (self): - open_func = functools.partial(self.open_item, item_path) - popup_menu.append( ("Open", open_func, 1) ) -\end{verbatim} - - -Another function in the \module{functools} module is the -\function{update_wrapper(\var{wrapper}, \var{wrapped})} function that -helps you write well-behaved decorators. \function{update_wrapper()} -copies the name, module, and docstring attribute to a wrapper function -so that tracebacks inside the wrapped function are easier to -understand. 
For example, you might write: - -\begin{verbatim} -def my_decorator(f): - def wrapper(*args, **kwds): - print 'Calling decorated function' - return f(*args, **kwds) - functools.update_wrapper(wrapper, f) - return wrapper -\end{verbatim} - -\function{wraps()} is a decorator that can be used inside your own -decorators to copy the wrapped function's information. An alternate -version of the previous example would be: - -\begin{verbatim} -def my_decorator(f): - @functools.wraps(f) - def wrapper(*args, **kwds): - print 'Calling decorated function' - return f(*args, **kwds) - return wrapper -\end{verbatim} - -\begin{seealso} - -\seepep{309}{Partial Function Application}{PEP proposed and written by -Peter Harris; implemented by Hye-Shik Chang and Nick Coghlan, with -adaptations by Raymond Hettinger.} - -\end{seealso} - - -%====================================================================== -\section{PEP 314: Metadata for Python Software Packages v1.1\label{pep-314}} - -Some simple dependency support was added to Distutils. The -\function{setup()} function now has \code{requires}, \code{provides}, -and \code{obsoletes} keyword parameters. When you build a source -distribution using the \code{sdist} command, the dependency -information will be recorded in the \file{PKG-INFO} file. - -Another new keyword parameter is \code{download_url}, which should be -set to a URL for the package's source code. This means it's now -possible to look up an entry in the package index, determine the -dependencies for a package, and download the required packages. - -\begin{verbatim} -VERSION = '1.0' -setup(name='PyPackage', - version=VERSION, - requires=['numarray', 'zlib (>=1.1.4)'], - obsoletes=['OldPackage'], - download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz' - % VERSION), - ) -\end{verbatim} - -Another new enhancement to the Python package index at -\url{http://cheeseshop.python.org} is storing source and binary -archives for a package.
The new \command{upload} Distutils command -will upload a package to the repository. - -Before a package can be uploaded, you must be able to build a -distribution using the \command{sdist} Distutils command. Once that -works, you can run \code{python setup.py upload} to add your package -to the PyPI archive. Optionally you can GPG-sign the package by -supplying the \longprogramopt{sign} and -\longprogramopt{identity} options. - -Package uploading was implemented by Martin von~L\"owis and Richard Jones. - -\begin{seealso} - -\seepep{314}{Metadata for Python Software Packages v1.1}{PEP proposed -and written by A.M. Kuchling, Richard Jones, and Fred Drake; -implemented by Richard Jones and Fred Drake.} - -\end{seealso} - - -%====================================================================== -\section{PEP 328: Absolute and Relative Imports\label{pep-328}} - -The simpler part of PEP 328 was implemented in Python 2.4: parentheses -could now be used to enclose the names imported from a module using -the \code{from ... import ...} statement, making it easier to import -many different names. - -The more complicated part has been implemented in Python 2.5: -importing a module can be specified to use absolute or -package-relative imports. The plan is to move toward making absolute -imports the default in future versions of Python. - -Let's say you have a package directory like this: -\begin{verbatim} -pkg/ -pkg/__init__.py -pkg/main.py -pkg/string.py -\end{verbatim} - -This defines a package named \module{pkg} containing the -\module{pkg.main} and \module{pkg.string} submodules. - -Consider the code in the \file{main.py} module. What happens if it -executes the statement \code{import string}? 
In Python 2.4 and -earlier, it will first look in the package's directory to perform a -relative import, finds \file{pkg/string.py}, imports the contents of -that file as the \module{pkg.string} module, and that module is bound -to the name \samp{string} in the \module{pkg.main} module's namespace. - -That's fine if \module{pkg.string} was what you wanted. But what if -you wanted Python's standard \module{string} module? There's no clean -way to ignore \module{pkg.string} and look for the standard module; -generally you had to look at the contents of \code{sys.modules}, which -is slightly unclean. -Holger Krekel's \module{py.std} package provides a tidier way to perform -imports from the standard library, \code{import py ; py.std.string.join()}, -but that package isn't available on all Python installations. - -Reading code which relies on relative imports is also less clear, -because a reader may be confused about which module, \module{string} -or \module{pkg.string}, is intended to be used. Python users soon -learned not to duplicate the names of standard library modules in the -names of their packages' submodules, but you can't protect against -having your submodule's name being used for a new module added in a -future version of Python. - -In Python 2.5, you can switch \keyword{import}'s behaviour to -absolute imports using a \code{from __future__ import absolute_import} -directive. This absolute-import behaviour will become the default in -a future version (probably Python 2.7). Once absolute imports -are the default, \code{import string} will -always find the standard library's version. -It's suggested that users should begin using absolute imports as much -as possible, so it's preferable to begin writing \code{from pkg import -string} in your code. - -Relative imports are still possible by adding a leading period -to the module name when using the \code{from ... 
import} form: - -\begin{verbatim} -# Import names from pkg.string -from .string import name1, name2 -# Import pkg.string -from . import string -\end{verbatim} - -This imports the \module{string} module relative to the current -package, so in \module{pkg.main} this will import \var{name1} and -\var{name2} from \module{pkg.string}. Additional leading periods -perform the relative import starting from the parent of the current -package. For example, code in the \module{A.B.C} module can do: - -\begin{verbatim} -from . import D # Imports A.B.D -from .. import E # Imports A.E -from ..F import G # Imports A.F.G -\end{verbatim} - -Leading periods cannot be used with the \code{import \var{modname}} -form of the import statement, only the \code{from ... import} form. - -\begin{seealso} - -\seepep{328}{Imports: Multi-Line and Absolute/Relative} -{PEP written by Aahz; implemented by Thomas Wouters.} - -\seeurl{http://codespeak.net/py/current/doc/index.html} -{The py library by Holger Krekel, which contains the \module{py.std} package.} - -\end{seealso} - - -%====================================================================== -\section{PEP 338: Executing Modules as Scripts\label{pep-338}} - -The \programopt{-m} switch added in Python 2.4 to execute a module as -a script gained a few more abilities. Instead of being implemented in -C code inside the Python interpreter, the switch now uses an -implementation in a new module, \module{runpy}. - -The \module{runpy} module implements a more sophisticated import -mechanism so that it's now possible to run modules in a package such -as \module{pychecker.checker}. The module also supports alternative -import mechanisms such as the \module{zipimport} module. This means -you can add a .zip archive's path to \code{sys.path} and then use the -\programopt{-m} switch to execute code from the archive. 
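The effect of running a module inside a package can be tried out from Python code with \function{runpy.run_module()}. The following is a minimal sketch, not the interpreter's actual \programopt{-m} code path: it builds a throwaway package in a temporary directory (the names \code{demo_pkg} and \code{tool} are hypothetical, chosen only for illustration) and runs its submodule under the name \code{__main__}, which is roughly what the switch arranges.

```python
import os
import runpy
import sys
import tempfile

# Build a throwaway package on disk.  'demo_pkg' and 'tool' are
# hypothetical names used only for this illustration.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, '__init__.py'), 'w').close()

f = open(os.path.join(pkg_dir, 'tool.py'), 'w')
f.write("result = 'ran as ' + __name__\n")
f.close()

# Make the package importable, then execute the submodule as a script.
sys.path.insert(0, root)
namespace = runpy.run_module('demo_pkg.tool', run_name='__main__')

# run_module() returns the globals of the executed module.
outcome = namespace['result']
```

Because \code{run_name} is set to \code{'__main__'}, any \code{if __name__ == '__main__':} block in the module would run just as it does when the file is executed directly.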
- - -\begin{seealso} - -\seepep{338}{Executing modules as scripts}{PEP written and -implemented by Nick Coghlan.} - -\end{seealso} - - -%====================================================================== -\section{PEP 341: Unified try/except/finally\label{pep-341}} - -Until Python 2.5, the \keyword{try} statement came in two -flavours. You could use a \keyword{finally} block to ensure that code -is always executed, or one or more \keyword{except} blocks to catch -specific exceptions. You couldn't combine both \keyword{except} blocks and a -\keyword{finally} block, because generating the right bytecode for the -combined version was complicated and it wasn't clear what the -semantics of the combined statement should be. - -Guido van~Rossum spent some time working with Java, which does support the -equivalent of combining \keyword{except} blocks and a -\keyword{finally} block, and this clarified what the statement should -mean. In Python 2.5, you can now write: - -\begin{verbatim} -try: - block-1 ... -except Exception1: - handler-1 ... -except Exception2: - handler-2 ... -else: - else-block -finally: - final-block -\end{verbatim} - -The code in \var{block-1} is executed. If the code raises an -exception, the various \keyword{except} blocks are tested: if the -exception is of class \class{Exception1}, \var{handler-1} is executed; -otherwise if it's of class \class{Exception2}, \var{handler-2} is -executed, and so forth. If no exception is raised, the -\var{else-block} is executed. - -No matter what happened previously, the \var{final-block} is executed -once the code block is complete and any raised exceptions handled. -Even if there's an error in an exception handler or the -\var{else-block} and a new exception is raised, the -code in the \var{final-block} is still run. 
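The order of execution described above can be traced with a small sketch. The helper \code{run()} and the choice of \exception{ZeroDivisionError} and \exception{KeyError} as the two handled exceptions are assumptions made for illustration; the code records which clauses of the combined statement execute for a given action.

```python
def run(action):
    """Call action() and record which clauses of the try statement run."""
    trace = []
    try:
        trace.append('block-1')
        action()
    except ZeroDivisionError:
        trace.append('handler-1')
    except KeyError:
        trace.append('handler-2')
    else:
        trace.append('else-block')
    finally:
        # Runs in every case, after the handlers or the else-block.
        trace.append('final-block')
    return trace

no_error = run(lambda: None)       # ['block-1', 'else-block', 'final-block']
first = run(lambda: 1 / 0)         # ['block-1', 'handler-1', 'final-block']
second = run(lambda: {}['x'])      # ['block-1', 'handler-2', 'final-block']
```

In each case \code{'final-block'} is the last entry, whether an exception was raised or not.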
- -\begin{seealso} - -\seepep{341}{Unifying try-except and try-finally}{PEP written by Georg Brandl; -implementation by Thomas Lee.} - -\end{seealso} - - -%====================================================================== -\section{PEP 342: New Generator Features\label{pep-342}} - -Python 2.5 adds a simple way to pass values \emph{into} a generator. -As introduced in Python 2.3, generators only produce output; once a -generator's code was invoked to create an iterator, there was no way to -pass any new information into the function when its execution is -resumed. Sometimes the ability to pass in some information would be -useful. Hackish solutions to this include making the generator's code -look at a global variable and then changing the global variable's -value, or passing in some mutable object that callers then modify. - -To refresh your memory of basic generators, here's a simple example: - -\begin{verbatim} -def counter (maximum): - i = 0 - while i < maximum: - yield i - i += 1 -\end{verbatim} - -When you call \code{counter(10)}, the result is an iterator that -returns the values from 0 up to 9. On encountering the -\keyword{yield} statement, the iterator returns the provided value and -suspends the function's execution, preserving the local variables. -Execution resumes on the following call to the iterator's -\method{next()} method, picking up after the \keyword{yield} statement. - -In Python 2.3, \keyword{yield} was a statement; it didn't return any -value. In 2.5, \keyword{yield} is now an expression, returning a -value that can be assigned to a variable or otherwise operated on: - -\begin{verbatim} -val = (yield i) -\end{verbatim} - -I recommend that you always put parentheses around a \keyword{yield} -expression when you're doing something with the returned value, as in -the above example. The parentheses aren't always necessary, but it's -easier to always add them instead of having to remember when they're -needed. 
- -(\pep{342} explains the exact rules, which are that a -\keyword{yield}-expression must always be parenthesized except when it -occurs at the top-level expression on the right-hand side of an -assignment. This means you can write \code{val = yield i} but have to -use parentheses when there's an operation, as in \code{val = (yield i) -+ 12}.) - -Values are sent into a generator by calling its -\method{send(\var{value})} method. The generator's code is then -resumed and the \keyword{yield} expression returns the specified -\var{value}. If the regular \method{next()} method is called, the -\keyword{yield} returns \constant{None}. - -Here's the previous example, modified to allow changing the value of -the internal counter. - -\begin{verbatim} -def counter (maximum): - i = 0 - while i < maximum: - val = (yield i) - # If value provided, change counter - if val is not None: - i = val - else: - i += 1 -\end{verbatim} - -And here's an example of changing the counter: - -\begin{verbatim} ->>> it = counter(10) ->>> print it.next() -0 ->>> print it.next() -1 ->>> print it.send(8) -8 ->>> print it.next() -9 ->>> print it.next() -Traceback (most recent call last): - File ``t.py'', line 15, in ? - print it.next() -StopIteration -\end{verbatim} - -\keyword{yield} will usually return \constant{None}, so you -should always check for this case. Don't just use its value in -expressions unless you're sure that the \method{send()} method -will be the only method used to resume your generator function. - -In addition to \method{send()}, there are two other new methods on -generators: - -\begin{itemize} - - \item \method{throw(\var{type}, \var{value}=None, - \var{traceback}=None)} is used to raise an exception inside the - generator; the exception is raised by the \keyword{yield} expression - where the generator's execution is paused. - - \item \method{close()} raises a new \exception{GeneratorExit} - exception inside the generator to terminate the iteration. 
On - receiving this exception, the generator's code must either raise - \exception{GeneratorExit} or \exception{StopIteration}. Catching - the \exception{GeneratorExit} exception and yielding another value is - illegal and will trigger a \exception{RuntimeError}; if the function - raises some other exception, that exception is propagated to the - caller. \method{close()} will also be called by Python's garbage - collector when the generator is garbage-collected. - - If you need to run cleanup code when a \exception{GeneratorExit} occurs, - I suggest using a \code{try: ... finally:} suite instead of - catching \exception{GeneratorExit}. - -\end{itemize} - -The cumulative effect of these changes is to turn generators from -one-way producers of information into both producers and consumers. - -Generators also become \emph{coroutines}, a more generalized form of -subroutines. Subroutines are entered at one point and exited at -another point (the top of the function, and a \keyword{return} -statement), but coroutines can be entered, exited, and resumed at -many different points (the \keyword{yield} statements). We'll have to -figure out patterns for using coroutines effectively in Python. - -The addition of the \method{close()} method has one side effect that -isn't obvious. \method{close()} is called when a generator is -garbage-collected, so this means the generator's code gets one last -chance to run before the generator is destroyed. This last chance -means that \code{try...finally} statements in generators can now be -guaranteed to work; the \keyword{finally} clause will now always get a -chance to run. The syntactic restriction that you couldn't mix -\keyword{yield} statements with a \code{try...finally} suite has -therefore been removed. This seems like a minor bit of language -trivia, but using generators and \code{try...finally} is actually -necessary in order to implement the \keyword{with} statement -described by PEP 343.
I'll look at this new statement in the following -section. - -Another even more esoteric effect of this change: previously, the -\member{gi_frame} attribute of a generator was always a frame object. -It's now possible for \member{gi_frame} to be \code{None} -once the generator has been exhausted. - -\begin{seealso} - -\seepep{342}{Coroutines via Enhanced Generators}{PEP written by -Guido van~Rossum and Phillip J. Eby; -implemented by Phillip J. Eby. Includes examples of -some fancier uses of generators as coroutines. - -Earlier versions of these features were proposed in -\pep{288} by Raymond Hettinger and \pep{325} by Samuele Pedroni. -} - -\seeurl{http://en.wikipedia.org/wiki/Coroutine}{The Wikipedia entry for -coroutines.} - -\seeurl{http://www.sidhe.org/\~{}dan/blog/archives/000178.html}{An -explanation of coroutines from a Perl point of view, written by Dan -Sugalski.} - -\end{seealso} - - -%====================================================================== -\section{PEP 343: The 'with' statement\label{pep-343}} - -The '\keyword{with}' statement clarifies code that previously would -use \code{try...finally} blocks to ensure that clean-up code is -executed. In this section, I'll discuss the statement as it will -commonly be used. In the next section, I'll examine the -implementation details and show how to write objects for use with this -statement. - -The '\keyword{with}' statement is a new control-flow structure whose -basic structure is: - -\begin{verbatim} -with expression [as variable]: - with-block -\end{verbatim} - -The expression is evaluated, and it should result in an object that -supports the context management protocol (that is, has \method{__enter__()} -and \method{__exit__()} methods). - -The object's \method{__enter__()} is called before \var{with-block} is -executed and therefore can run set-up code. It also may return a value -that is bound to the name \var{variable}, if given.
(Note carefully -that \var{variable} is \emph{not} assigned the result of \var{expression}.) - -After execution of the \var{with-block} is finished, the object's -\method{__exit__()} method is called, even if the block raised an exception, -and can therefore run clean-up code. - -To enable the statement in Python 2.5, you need to add the following -directive to your module: - -\begin{verbatim} -from __future__ import with_statement -\end{verbatim} - -The statement will always be enabled in Python 2.6. - -Some standard Python objects now support the context management -protocol and can be used with the '\keyword{with}' statement. File -objects are one example: - -\begin{verbatim} -with open('/etc/passwd', 'r') as f: - for line in f: - print line - ... more processing code ... -\end{verbatim} - -After this statement has executed, the file object in \var{f} will -have been automatically closed, even if the \keyword{for} loop -raised an exception part-way through the block. - -\note{In this case, \var{f} is the same object created by - \function{open()}, because \method{file.__enter__()} returns - \var{self}.} - -The \module{threading} module's locks and condition variables -also support the '\keyword{with}' statement: - -\begin{verbatim} -lock = threading.Lock() -with lock: - # Critical section of code - ... -\end{verbatim} - -The lock is acquired before the block is executed and always released once -the block is complete. - -The new \function{localcontext()} function in the \module{decimal} module -makes it easy to save and restore the current decimal context, which -encapsulates the desired precision and rounding characteristics for -computations: - -\begin{verbatim} -from decimal import Decimal, Context, localcontext - -# Displays with default precision of 28 digits -v = Decimal('578') -print v.sqrt() - -with localcontext(Context(prec=16)): - # All code in this block uses a precision of 16 digits. - # The original context is restored on exiting the block. 
- print v.sqrt() -\end{verbatim} - -\subsection{Writing Context Managers\label{context-managers}} - -Under the hood, the '\keyword{with}' statement is fairly complicated. -Most people will only use '\keyword{with}' in company with existing -objects and don't need to know these details, so you can skip the rest -of this section if you like. Authors of new objects will need to -understand the details of the underlying implementation and should -keep reading. - -A high-level explanation of the context management protocol is: - -\begin{itemize} - -\item The expression is evaluated and should result in an object -called a ``context manager''. The context manager must have -\method{__enter__()} and \method{__exit__()} methods. - -\item The context manager's \method{__enter__()} method is called. The value -returned is assigned to \var{VAR}. If no \code{'as \var{VAR}'} clause -is present, the value is simply discarded. - -\item The code in \var{BLOCK} is executed. - -\item If \var{BLOCK} raises an exception, the -\method{__exit__(\var{type}, \var{value}, \var{traceback})} is called -with the exception details, the same values returned by -\function{sys.exc_info()}. The method's return value controls whether -the exception is re-raised: any false value re-raises the exception, -and \code{True} will result in suppressing it. You'll only rarely -want to suppress the exception, because if you do -the author of the code containing the -'\keyword{with}' statement will never realize anything went wrong. - -\item If \var{BLOCK} didn't raise an exception, -the \method{__exit__()} method is still called, -but \var{type}, \var{value}, and \var{traceback} are all \code{None}. - -\end{itemize} - -Let's think through an example. I won't present detailed code but -will only sketch the methods necessary for a database that supports -transactions. - -(For people unfamiliar with database terminology: a set of changes to -the database are grouped into a transaction. 
Transactions can be -either committed, meaning that all the changes are written into the -database, or rolled back, meaning that the changes are all discarded -and the database is unchanged. See any database textbook for more -information.) - -Let's assume there's an object representing a database connection. -Our goal will be to let the user write code like this: - -\begin{verbatim} -db_connection = DatabaseConnection() -with db_connection as cursor: - cursor.execute('insert into ...') - cursor.execute('delete from ...') - # ... more operations ... -\end{verbatim} - -The transaction should be committed if the code in the block -runs flawlessly or rolled back if there's an exception. -Here's the basic interface -for \class{DatabaseConnection} that I'll assume: - -\begin{verbatim} -class DatabaseConnection: - # Database interface - def cursor (self): - "Returns a cursor object and starts a new transaction" - def commit (self): - "Commits current transaction" - def rollback (self): - "Rolls back current transaction" -\end{verbatim} - -The \method {__enter__()} method is pretty easy, having only to start -a new transaction. For this application the resulting cursor object -would be a useful result, so the method will return it. The user can -then add \code{as cursor} to their '\keyword{with}' statement to bind -the cursor to a variable name. - -\begin{verbatim} -class DatabaseConnection: - ... - def __enter__ (self): - # Code to start a new transaction - cursor = self.cursor() - return cursor -\end{verbatim} - -The \method{__exit__()} method is the most complicated because it's -where most of the work has to be done. The method has to check if an -exception occurred. If there was no exception, the transaction is -committed. The transaction is rolled back if there was an exception. - -In the code below, execution will just fall off the end of the -function, returning the default value of \code{None}. 
\code{None} is -false, so the exception will be re-raised automatically. If you -wished, you could be more explicit and add a \keyword{return} -statement at the marked location. - -\begin{verbatim} -class DatabaseConnection: - ... - def __exit__ (self, type, value, tb): - if tb is None: - # No exception, so commit - self.commit() - else: - # Exception occurred, so rollback. - self.rollback() - # return False -\end{verbatim} - - -\subsection{The contextlib module\label{module-contextlib}} - -The new \module{contextlib} module provides some functions and a -decorator that are useful for writing objects for use with the -'\keyword{with}' statement. - -The decorator is called \function{contextmanager}, and lets you write -a single generator function instead of defining a new class. The generator -should yield exactly one value. The code up to the \keyword{yield} -will be executed as the \method{__enter__()} method, and the value -yielded will be the method's return value that will get bound to the -variable in the '\keyword{with}' statement's \keyword{as} clause, if -any. The code after the \keyword{yield} will be executed in the -\method{__exit__()} method. Any exception raised in the block will be -raised by the \keyword{yield} statement. - -Our database example from the previous section could be written -using this decorator as: - -\begin{verbatim} -from contextlib import contextmanager - -@contextmanager -def db_transaction (connection): - cursor = connection.cursor() - try: - yield cursor - except: - connection.rollback() - raise - else: - connection.commit() - -db = DatabaseConnection() -with db_transaction(db) as cursor: - ... -\end{verbatim} - -The \module{contextlib} module also has a \function{nested(\var{mgr1}, -\var{mgr2}, ...)} function that combines a number of context managers so you -don't need to write nested '\keyword{with}' statements. 
In this -example, the single '\keyword{with}' statement both starts a database -transaction and acquires a thread lock: - -\begin{verbatim} -lock = threading.Lock() -with nested (db_transaction(db), lock) as (cursor, locked): - ... -\end{verbatim} - -Finally, the \function{closing(\var{object})} function -returns \var{object} so that it can be bound to a variable, -and calls \code{\var{object}.close()} at the end of the block. - -\begin{verbatim} -import urllib, sys -from contextlib import closing - -with closing(urllib.urlopen('http://www.yahoo.com')) as f: - for line in f: - sys.stdout.write(line) -\end{verbatim} - -\begin{seealso} - -\seepep{343}{The ``with'' statement}{PEP written by Guido van~Rossum -and Nick Coghlan; implemented by Mike Bland, Guido van~Rossum, and -Neal Norwitz. The PEP shows the code generated for a '\keyword{with}' -statement, which can be helpful in learning how the statement works.} - -\seeurl{../lib/module-contextlib.html}{The documentation -for the \module{contextlib} module.} - -\end{seealso} - - -%====================================================================== -\section{PEP 352: Exceptions as New-Style Classes\label{pep-352}} - -Exception classes can now be new-style classes, not just classic -classes, and the built-in \exception{Exception} class and all the -standard built-in exceptions (\exception{NameError}, -\exception{ValueError}, etc.) are now new-style classes. - -The inheritance hierarchy for exceptions has been rearranged a bit. -In 2.5, the inheritance relationships are: - -\begin{verbatim} -BaseException # New in Python 2.5 -|- KeyboardInterrupt -|- SystemExit -|- Exception - |- (all other current built-in exceptions) -\end{verbatim} - -This rearrangement was done because people often want to catch all -exceptions that indicate program errors. 
\exception{KeyboardInterrupt} and -\exception{SystemExit} aren't errors, though, and usually represent an explicit -action such as the user hitting Control-C or code calling -\function{sys.exit()}. A bare \code{except:} will catch all exceptions, -so you commonly need to list \exception{KeyboardInterrupt} and -\exception{SystemExit} in order to re-raise them. The usual pattern is: - -\begin{verbatim} -try: - ... -except (KeyboardInterrupt, SystemExit): - raise -except: - # Log error... - # Continue running program... -\end{verbatim} - -In Python 2.5, you can now write \code{except Exception} to achieve -the same result, catching all the exceptions that usually indicate errors -but leaving \exception{KeyboardInterrupt} and -\exception{SystemExit} alone. As in previous versions, -a bare \code{except:} still catches all exceptions. - -The goal for Python 3.0 is to require any class raised as an exception -to derive from \exception{BaseException} or some descendant of -\exception{BaseException}, and future releases in the -Python 2.x series may begin to enforce this constraint. Therefore, I -suggest you begin making all your exception classes derive from -\exception{Exception} now. It's been suggested that the bare -\code{except:} form should be removed in Python 3.0, but Guido van~Rossum -hasn't decided whether to do this or not. - -Raising of strings as exceptions, as in the statement \code{raise -"Error occurred"}, is deprecated in Python 2.5 and will trigger a -warning. The aim is to be able to remove the string-exception feature -in a few releases. 
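The effect of the rearranged hierarchy can be sketched in a few lines. This is only an illustration: \code{run_task}, \code{failing} and \code{exiting} are hypothetical names, not part of any library, and the \code{except ... as} spelling is the one accepted by later interpreters (in 2.5 itself the clause is written \code{except Exception, exc}).

```python
# Sketch of the PEP 352 hierarchy; run_task, failing and exiting are
# hypothetical helpers invented for this illustration.
def run_task(task):
    """Run task(), reporting ordinary errors but letting exit requests through."""
    try:
        return task()
    except Exception as exc:
        # Matches ValueError, NameError, etc., but not KeyboardInterrupt
        # or SystemExit, which derive directly from BaseException.
        return "logged: %s" % exc


def failing():
    raise ValueError("bad input")


def exiting():
    raise SystemExit(1)


print(run_task(failing))        # the ValueError is caught and reported

try:
    run_task(exiting)
except SystemExit:
    print("SystemExit propagated past 'except Exception'")

print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True
```

A bare \code{except:} in \code{run_task} would have swallowed the \exception{SystemExit} as well, which is exactly the case the older two-clause pattern had to work around.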
- - -\begin{seealso} - -\seepep{352}{Required Superclass for Exceptions}{PEP written by -Brett Cannon and Guido van~Rossum; implemented by Brett Cannon.} - -\end{seealso} - - -%====================================================================== -\section{PEP 353: Using ssize_t as the index type\label{pep-353}} - -A wide-ranging change to Python's C API, using a new -\ctype{Py_ssize_t} type definition instead of \ctype{int}, -will permit the interpreter to handle more data on 64-bit platforms. -This change doesn't affect Python's capacity on 32-bit platforms. - -Various pieces of the Python interpreter used C's \ctype{int} type to -store sizes or counts; for example, the number of items in a list or -tuple was stored in an \ctype{int}. The C compilers for most 64-bit -platforms still define \ctype{int} as a 32-bit type, so that meant -that lists could only hold up to \code{2**31 - 1} = 2147483647 items. -(There are actually a few different programming models that 64-bit C -compilers can use -- see -\url{http://www.unix.org/version2/whatsnew/lp64_wp.html} for a -discussion -- but the most commonly available model leaves \ctype{int} -as 32 bits.) - -A limit of 2147483647 items doesn't really matter on a 32-bit platform -because you'll run out of memory before hitting the length limit. -Each list item requires space for a pointer, which is 4 bytes, plus -space for a \ctype{PyObject} representing the item. 2147483647*4 is -already more bytes than a 32-bit address space can contain. - -It's possible to address that much memory on a 64-bit platform, -however. The pointers for a list that size would only require 16~GiB -of space, so it's not unreasonable that Python programmers might -construct lists that large. Therefore, the Python interpreter had to -be changed to use some type other than \ctype{int}, and this will be a -64-bit type on 64-bit platforms. 
The change will cause -incompatibilities on 64-bit machines, so it was deemed worth making -the transition now, while the number of 64-bit users is still -relatively small. (In 5 or 10 years, we may \emph{all} be on 64-bit -machines, and the transition would be more painful then.) - -This change most strongly affects authors of C extension modules. -Python strings and container types such as lists and tuples -now use \ctype{Py_ssize_t} to store their size. -Functions such as \cfunction{PyList_Size()} -now return \ctype{Py_ssize_t}. Code in extension modules -may therefore need to have some variables changed to -\ctype{Py_ssize_t}. - -The \cfunction{PyArg_ParseTuple()} and \cfunction{Py_BuildValue()} functions -have a new conversion code, \samp{n}, for \ctype{Py_ssize_t}. -\cfunction{PyArg_ParseTuple()}'s \samp{s\#} and \samp{t\#} still output -\ctype{int} by default, but you can define the macro -\csimplemacro{PY_SSIZE_T_CLEAN} before including \file{Python.h} -to make them return \ctype{Py_ssize_t}. - -\pep{353} has a section on conversion guidelines that -extension authors should read to learn about supporting 64-bit -platforms. - -\begin{seealso} - -\seepep{353}{Using ssize_t as the index type}{PEP written and implemented by Martin von~L\"owis.} - -\end{seealso} - - -%====================================================================== -\section{PEP 357: The '__index__' method\label{pep-357}} - -The NumPy developers had a problem that could only be solved by adding -a new special method, \method{__index__}. When using slice notation, -as in \code{[\var{start}:\var{stop}:\var{step}]}, the values of the -\var{start}, \var{stop}, and \var{step} indexes must all be either -integers or long integers. NumPy defines a variety of specialized -integer types corresponding to unsigned and signed integers of 8, 16, -32, and 64 bits, but there was no way to signal that these types could -be used as slice indexes. 
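To make the problem concrete, here is a minimal sketch. \code{Int8} is a hypothetical stand-in for a NumPy-style fixed-width integer, not NumPy's actual type; it defines only \method{__int__}, so slicing rejects it. The subclass \code{IndexableInt8} (also hypothetical) anticipates the \method{__index__} method that the rest of this section describes.

```python
class Int8(object):
    """Hypothetical stand-in for a specialized integer type."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Supports int(obj), but slicing does not consult this method.
        return self.value


class IndexableInt8(Int8):
    """Same type, but declaring that it may be used as an index."""

    def __index__(self):
        return self.value


data = ['a', 'b', 'c', 'd']

try:
    data[Int8(1):Int8(3)]
    accepted = True
except TypeError:
    # Without __index__, the object is not a legal slice index.
    accepted = False

print(accepted)                                    # False
print(data[IndexableInt8(1):IndexableInt8(3)])     # ['b', 'c']
```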
- -Slicing can't just use the existing \method{__int__} method because -that method is also used to implement coercion to integers. If -slicing used \method{__int__}, floating-point numbers would also -become legal slice indexes and that's clearly an undesirable -behaviour. - -Instead, a new special method called \method{__index__} was added. It -takes no arguments and returns an integer giving the slice index to -use. For example: - -\begin{verbatim} -class C: - def __index__ (self): - return self.value -\end{verbatim} - -The return value must be either a Python integer or long integer. -The interpreter will check that the type returned is correct, and -raises a \exception{TypeError} if this requirement isn't met. - -A corresponding \member{nb_index} slot was added to the C-level -\ctype{PyNumberMethods} structure to let C extensions implement this -protocol. \cfunction{PyNumber_Index(\var{obj})} can be used in -extension code to call the \method{__index__} function and retrieve -its result. - -\begin{seealso} - -\seepep{357}{Allowing Any Object to be Used for Slicing}{PEP written -and implemented by Travis Oliphant.} - -\end{seealso} - - -%====================================================================== -\section{Other Language Changes\label{other-lang}} - -Here are all of the changes that Python 2.5 makes to the core Python -language. - -\begin{itemize} - -\item The \class{dict} type has a new hook for letting subclasses -provide a default value when a key isn't contained in the dictionary. -When a key isn't found, the dictionary's -\method{__missing__(\var{key})} -method will be called. This hook is used to implement -the new \class{defaultdict} class in the \module{collections} -module. 
The following example defines a dictionary -that returns zero for any missing key: - -\begin{verbatim} -class zerodict (dict): - def __missing__ (self, key): - return 0 - -d = zerodict({1:1, 2:2}) -print d[1], d[2] # Prints 1, 2 -print d[3], d[4] # Prints 0, 0 -\end{verbatim} - -\item Both 8-bit and Unicode strings have new \method{partition(sep)} -and \method{rpartition(sep)} methods that simplify a common use case. - -The \method{find(S)} method is often used to get an index which is -then used to slice the string and obtain the pieces that are before -and after the separator. -\method{partition(sep)} condenses this -pattern into a single method call that returns a 3-tuple containing -the substring before the separator, the separator itself, and the -substring after the separator. If the separator isn't found, the -first element of the tuple is the entire string and the other two -elements are empty. \method{rpartition(sep)} also returns a 3-tuple -but starts searching from the end of the string; the \samp{r} stands -for 'reverse'. - -Some examples: - -\begin{verbatim} ->>> ('http://www.python.org').partition('://') -('http', '://', 'www.python.org') ->>> ('file:/usr/share/doc/index.html').partition('://') -('file:/usr/share/doc/index.html', '', '') ->>> (u'Subject: a quick question').partition(':') -(u'Subject', u':', u' a quick question') ->>> 'www.python.org'.rpartition('.') -('www.python', '.', 'org') ->>> 'www.python.org'.rpartition(':') -('', '', 'www.python.org') -\end{verbatim} - -(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.) - -\item The \method{startswith()} and \method{endswith()} methods -of string types now accept tuples of strings to check for. - -\begin{verbatim} -def is_image_file (filename): - return filename.endswith(('.gif', '.jpg', '.tiff')) -\end{verbatim} - -(Implemented by Georg Brandl following a suggestion by Tom Lynn.) 
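For comparison, the \method{find()}-and-slice pattern that \method{partition()} replaces can be written out by hand. This is a sketch only; \code{split_with_find} is a hypothetical helper written for this illustration, and it is meant to return the same 3-tuple as the built-in method.

```python
def split_with_find(s, sep):
    """Emulate s.partition(sep) using find() and slicing (illustration only)."""
    i = s.find(sep)
    if i == -1:
        # Separator not found: whole string first, then two empty strings.
        return (s, '', '')
    return (s[:i], sep, s[i + len(sep):])


url = 'http://www.python.org'
print(split_with_find(url, '://'))
print(url.partition('://'))

# Both approaches agree, including when the separator is missing.
assert split_with_find(url, '://') == url.partition('://')
assert split_with_find('no separator here', '://') == \
    'no separator here'.partition('://')
```

The hand-written version needs an explicit test for \code{-1} and an offset calculation; \method{partition()} folds both into one call.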
-% RFE #1491485 - -\item The \function{min()} and \function{max()} built-in functions -gained a \code{key} keyword parameter analogous to the \code{key} -argument for \method{sort()}. This parameter supplies a function that -takes a single argument and is called for every value in the list; -\function{min()}/\function{max()} will return the element with the -smallest/largest return value from this function. -For example, to find the longest string in a list, you can do: - -\begin{verbatim} -L = ['medium', 'longest', 'short'] -# Prints 'longest' -print max(L, key=len) -# Prints 'short', because lexicographically 'short' has the largest value -print max(L) -\end{verbatim} - -(Contributed by Steven Bethard and Raymond Hettinger.) - -\item Two new built-in functions, \function{any()} and -\function{all()}, evaluate whether an iterator contains any true or -false values. \function{any()} returns \constant{True} if any value -returned by the iterator is true; otherwise it will return -\constant{False}. \function{all()} returns \constant{True} only if -all of the values returned by the iterator evaluate as true. -(Suggested by Guido van~Rossum, and implemented by Raymond Hettinger.) - -\item The result of a class's \method{__hash__()} method can now -be either a long integer or a regular integer. If a long integer is -returned, the hash of that value is taken. In earlier versions the -hash value was required to be a regular integer, but in 2.5 the -\function{id()} built-in was changed to always return non-negative -numbers, and users often seem to use \code{id(self)} in -\method{__hash__()} methods (though this is discouraged). -% Bug #1536021 - -\item ASCII is now the default encoding for modules. It's now -a syntax error if a module contains string literals with 8-bit -characters but doesn't have an encoding declaration. In Python 2.4 -this triggered a warning, not a syntax error. 
See \pep{263} -for how to declare a module's encoding; for example, you might add -a line like this near the top of the source file: - -\begin{verbatim} -# -*- coding: latin1 -*- -\end{verbatim} - -\item A new warning, \class{UnicodeWarning}, is triggered when -you attempt to compare a Unicode string and an 8-bit string -that can't be converted to Unicode using the default ASCII encoding. -The result of the comparison is false: - -\begin{verbatim} ->>> chr(128) == unichr(128) # Can't convert chr(128) to Unicode -__main__:1: UnicodeWarning: Unicode equal comparison failed - to convert both arguments to Unicode - interpreting them - as being unequal -False ->>> chr(127) == unichr(127) # chr(127) can be converted -True -\end{verbatim} - -Previously this would raise a \class{UnicodeDecodeError} exception, -but in 2.5 this could result in puzzling problems when accessing a -dictionary. If you looked up \code{unichr(128)} and \code{chr(128)} -was being used as a key, you'd get a \class{UnicodeDecodeError} -exception. Other changes in 2.5 resulted in this exception being -raised instead of suppressed by the code in \file{dictobject.c} that -implements dictionaries. - -Raising an exception for such a comparison is strictly correct, but -the change might have broken code, so instead -\class{UnicodeWarning} was introduced. - -(Implemented by Marc-Andr\'e Lemburg.) - -\item One error that Python programmers sometimes make is forgetting -to include an \file{__init__.py} module in a package directory. -Debugging this mistake can be confusing, and usually requires running -Python with the \programopt{-v} switch to log all the paths searched. -In Python 2.5, a new \exception{ImportWarning} warning is triggered when -an import would have picked up a directory as a package but no -\file{__init__.py} was found. This warning is silently ignored by default; -provide the \programopt{-Wd} option when running the Python executable -to display the warning message. 
-(Implemented by Thomas Wouters.) - -\item The list of base classes in a class definition can now be empty. -As an example, this is now legal: - -\begin{verbatim} -class C(): - pass -\end{verbatim} -(Implemented by Brett Cannon.) - -\end{itemize} - - -%====================================================================== -\subsection{Interactive Interpreter Changes\label{interactive}} - -In the interactive interpreter, \code{quit} and \code{exit} -have long been strings so that new users get a somewhat helpful message -when they try to quit: - -\begin{verbatim} ->>> quit -'Use Ctrl-D (i.e. EOF) to exit.' -\end{verbatim} - -In Python 2.5, \code{quit} and \code{exit} are now objects that still -produce string representations of themselves, but are also callable. -Newbies who try \code{quit()} or \code{exit()} will now exit the -interpreter as they expect. (Implemented by Georg Brandl.) - -The Python executable now accepts the standard long options -\longprogramopt{help} and \longprogramopt{version}; on Windows, -it also accepts the \programopt{/?} option for displaying a help message. -(Implemented by Georg Brandl.) - - -%====================================================================== -\subsection{Optimizations\label{opts}} - -Several of the optimizations were developed at the NeedForSpeed -sprint, an event held in Reykjavik, Iceland, from May 21--28 2006. -The sprint focused on speed enhancements to the CPython implementation -and was funded by EWT LLC with local support from CCP Games. Those -optimizations added at this sprint are specially marked in the -following list. - -\begin{itemize} - -\item When they were introduced -in Python 2.4, the built-in \class{set} and \class{frozenset} types -were built on top of Python's dictionary type. -In 2.5 the internal data structure has been customized for implementing sets, -and as a result sets will use a third less memory and are somewhat faster. -(Implemented by Raymond Hettinger.) 
- -\item The speed of some Unicode operations, such as finding -substrings, string splitting, and character map encoding and decoding, -has been improved. (Substring search and splitting improvements were -added by Fredrik Lundh and Andrew Dalke at the NeedForSpeed -sprint. Character maps were improved by Walter D\"orwald and -Martin von~L\"owis.) -% Patch 1313939, 1359618 - -\item The \function{long(\var{str}, \var{base})} function is now -faster on long digit strings because fewer intermediate results are -calculated. The peak is for strings of around 800--1000 digits where -the function is 6 times faster. -(Contributed by Alan McIntyre and committed at the NeedForSpeed sprint.) -% Patch 1442927 - -\item It's now illegal to mix iterating over a file -with \code{for line in \var{file}} and calling -the file object's \method{read()}/\method{readline()}/\method{readlines()} -methods. Iteration uses an internal buffer and the -\method{read*()} methods don't use that buffer. -Instead they would return the data following the buffer, causing the -data to appear out of order. Mixing iteration and these methods will -now trigger a \exception{ValueError} from the \method{read*()} method. -(Implemented by Thomas Wouters.) -% Patch 1397960 - -\item The \module{struct} module now compiles structure format -strings into an internal representation and caches this -representation, yielding a 20\% speedup. (Contributed by Bob Ippolito -at the NeedForSpeed sprint.) - -\item The \module{re} module got a 1 or 2\% speedup by switching to -Python's allocator functions instead of the system's -\cfunction{malloc()} and \cfunction{free()}. -(Contributed by Jack Diederich at the NeedForSpeed sprint.) - -\item The code generator's peephole optimizer now performs -simple constant folding in expressions. If you write something like -\code{a = 2+3}, the code generator will do the arithmetic and produce -code corresponding to \code{a = 5}. (Proposed and implemented -by Raymond Hettinger.) 
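One way to observe the folding is to compile the statement and look for an addition instruction in the resulting bytecode. This is a sketch that relies on CPython implementation details: it uses \function{dis.get_instructions()}, which exists only in much later interpreters than 2.5, and it assumes that arithmetic opcodes have names beginning with \code{BINARY}.

```python
import dis

# Compile the statement from the text without executing it.
code = compile("a = 2+3", "<example>", "exec")

# If the constant was folded, no BINARY_* instruction remains in the
# bytecode; the compiler emits a load of the value 5 instead.
opnames = [instr.opname for instr in dis.get_instructions(code)]
folded = not any(name.startswith("BINARY") for name in opnames)
print(folded)

# Running the code object still binds a to 5, as expected.
namespace = {}
exec(code, namespace)
print(namespace["a"])
```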
- -\item Function calls are now faster because code objects now keep -the most recently finished frame (a ``zombie frame'') in an internal -field of the code object, reusing it the next time the code object is -invoked. (Original patch by Michael Hudson, modified by Armin Rigo -and Richard Jones; committed at the NeedForSpeed sprint.) -% Patch 876206 - -Frame objects are also slightly smaller, which may improve cache locality -and reduce memory usage a bit. (Contributed by Neal Norwitz.) -% Patch 1337051 - -\item Python's built-in exceptions are now new-style classes, a change -that speeds up instantiation considerably. Exception handling in -Python 2.5 is therefore about 30\% faster than in 2.4. -(Contributed by Richard Jones, Georg Brandl and Sean Reifschneider at -the NeedForSpeed sprint.) - -\item Importing now caches the paths tried, recording whether -they exist or not so that the interpreter makes fewer -\cfunction{open()} and \cfunction{stat()} calls on startup. -(Contributed by Martin von~L\"owis and Georg Brandl.) -% Patch 921466 - -\end{itemize} - - -%====================================================================== -\section{New, Improved, and Removed Modules\label{modules}} - -The standard library received many enhancements and bug fixes in -Python 2.5. Here's a partial list of the most notable changes, sorted -alphabetically by module name. Consult the \file{Misc/NEWS} file in -the source tree for a more complete list of changes, or look through -the SVN logs for all the details. - -\begin{itemize} - -\item The \module{audioop} module now supports the a-LAW encoding, -and the code for u-LAW encoding has been improved. (Contributed by -Lars Immisch.) - -\item The \module{codecs} module gained support for incremental -codecs. The \function{codecs.lookup()} function now -returns a \class{CodecInfo} instance instead of a tuple. 
-\class{CodecInfo} instances behave like a 4-tuple to preserve backward -compatibility but also have the attributes \member{encode}, -\member{decode}, \member{incrementalencoder}, \member{incrementaldecoder}, -\member{streamwriter}, and \member{streamreader}. Incremental codecs -can receive input and produce output in multiple chunks; the output is -the same as if the entire input was fed to the non-incremental codec. -See the \module{codecs} module documentation for details. -(Designed and implemented by Walter D\"orwald.) -% Patch 1436130 - -\item The \module{collections} module gained a new type, -\class{defaultdict}, that subclasses the standard \class{dict} -type. The new type mostly behaves like a dictionary but constructs a -default value when a key isn't present, automatically adding it to the -dictionary for the requested key value. - -The first argument to \class{defaultdict}'s constructor is a factory -function that gets called whenever a key is requested but not found. -This factory function receives no arguments, so you can use built-in -type constructors such as \function{list()} or \function{int()}. For -example, -you can make an index of words based on their initial letter like this: - -\begin{verbatim} -from collections import defaultdict - -words = """Nel mezzo del cammin di nostra vita -mi ritrovai per una selva oscura -che la diritta via era smarrita""".lower().split() - -index = defaultdict(list) - -for w in words: - init_letter = w[0] - index[init_letter].append(w) -\end{verbatim} - -Printing \code{index} results in the following output: - -\begin{verbatim} -defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'], - 'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'], - 'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'], - 'p': ['per'], 's': ['selva', 'smarrita'], - 'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']}) -\end{verbatim} - -(Contributed by Guido van~Rossum.) 
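The same text can be used to sketch the other factory mentioned above, \function{int()}, which turns a \class{defaultdict} into a counter. The variable names here are chosen for this illustration; the point to notice is that merely looking up a missing key inserts it with the factory's value.

```python
from collections import defaultdict

words = """nel mezzo del cammin di nostra vita
mi ritrovai per una selva oscura
che la diritta via era smarrita""".split()

# int() takes no arguments and returns 0, so every new key starts at zero.
counts = defaultdict(int)
for w in words:
    counts[w[0]] += 1

print(counts['d'])              # 3: del, di, diritta

# A lookup of a missing key calls the factory and stores the result.
missing_before = 'z' in counts
value = counts['z']
print(missing_before, value, 'z' in counts)
```

Use \code{key in d} or \code{d.get(key)} when a lookup should not add the key as a side effect.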
- -\item The \class{deque} double-ended queue type supplied by the -\module{collections} module now has a \method{remove(\var{value})} -method that removes the first occurrence of \var{value} in the queue, -raising \exception{ValueError} if the value isn't found. -(Contributed by Raymond Hettinger.) - -\item New module: The \module{contextlib} module contains helper functions for use -with the new '\keyword{with}' statement. See -section~\ref{module-contextlib} for more about this module. - -\item New module: The \module{cProfile} module is a C implementation of -the existing \module{profile} module that has much lower overhead. -The module's interface is the same as \module{profile}: you run -\code{cProfile.run('main()')} to profile a function, can save profile -data to a file, etc. It's not yet known if the Hotshot profiler, -which is also written in C but doesn't match the \module{profile} -module's interface, will continue to be maintained in future versions -of Python. (Contributed by Armin Rigo.) - -Also, the \module{pstats} module for analyzing the data measured by -the profiler now supports directing the output to any file object -by supplying a \var{stream} argument to the \class{Stats} constructor. -(Contributed by Skip Montanaro.) - -\item The \module{csv} module, which parses files in -comma-separated value format, received several enhancements and a -number of bugfixes. You can now set the maximum size in bytes of a -field by calling the \method{csv.field_size_limit(\var{new_limit})} -function; omitting the \var{new_limit} argument will return the -currently-set limit. The \class{reader} class now has a -\member{line_num} attribute that counts the number of physical lines -read from the source; records can span multiple physical lines, so -\member{line_num} is not the same as the number of records read. - -The CSV parser is now stricter about multi-line quoted -fields. 
Previously, if a line ended within a quoted field without a -terminating newline character, a newline would be inserted into the -returned field. This behavior caused problems when reading files that -contained carriage return characters within fields, so the code was -changed to return the field without inserting newlines. As a -consequence, if newlines embedded within fields are important, the -input should be split into lines in a manner that preserves the -newline characters. - -(Contributed by Skip Montanaro and Andrew McNamara.) - -\item The \class{datetime} class in the \module{datetime} -module now has a \method{strptime(\var{string}, \var{format})} -method for parsing date strings, contributed by Josh Spoerri. -It uses the same format characters as \function{time.strptime()} and -\function{time.strftime()}: - -\begin{verbatim} -from datetime import datetime - -ts = datetime.strptime('10:13:15 2006-03-07', - '%H:%M:%S %Y-%m-%d') -\end{verbatim} - -\item The \method{SequenceMatcher.get_matching_blocks()} method -in the \module{difflib} module now guarantees to return a minimal list -of blocks describing matching subsequences. Previously, the algorithm would -occasionally break a block of matching elements into two list entries. -(Enhancement by Tim Peters.) - -\item The \module{doctest} module gained a \code{SKIP} option that -keeps an example from being executed at all. This is intended for -code snippets that are usage examples intended for the reader and -aren't actually test cases. - -An \var{encoding} parameter was added to the \function{testfile()} -function and the \class{DocFileSuite} class to specify the file's -encoding. This makes it easier to use non-ASCII characters in -tests contained within a docstring. (Contributed by Bjorn Tillenius.) -% Patch 1080727 - -\item The \module{email} package has been updated to version 4.0. -% XXX need to provide some more detail here -(Contributed by Barry Warsaw.) 
- -\item The \module{fileinput} module was made more flexible. -Unicode filenames are now supported, and a \var{mode} parameter that -defaults to \code{"r"} was added to the -\function{input()} function to allow opening files in binary or -universal-newline mode. Another new parameter, \var{openhook}, -lets you use a function other than \function{open()} -to open the input files. Once you're iterating over -the set of files, the \class{FileInput} object's new -\method{fileno()} returns the file descriptor for the currently opened file. -(Contributed by Georg Brandl.) - -\item In the \module{gc} module, the new \function{get_count()} function -returns a 3-tuple containing the current collection counts for the -three GC generations. This is accounting information for the garbage -collector; when these counts reach a specified threshold, a garbage -collection sweep will be made. The existing \function{gc.collect()} -function now takes an optional \var{generation} argument of 0, 1, or 2 -to specify which generation to collect. -(Contributed by Barry Warsaw.) - -\item The \function{nsmallest()} and -\function{nlargest()} functions in the \module{heapq} module -now support a \code{key} keyword parameter similar to the one -provided by the \function{min()}/\function{max()} functions -and the \method{sort()} methods. For example: - -\begin{verbatim} ->>> import heapq ->>> L = ["short", 'medium', 'longest', 'longer still'] ->>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically -['longer still', 'longest'] ->>> heapq.nsmallest(2, L, key=len) # Return two shortest elements -['short', 'medium'] -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item The \function{itertools.islice()} function now accepts -\code{None} for the start and step arguments. 
This makes it more -compatible with the attributes of slice objects, so that you can now write -the following: - -\begin{verbatim} -s = slice(5) # Create slice object -itertools.islice(iterable, s.start, s.stop, s.step) -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item The \function{format()} function in the \module{locale} module -has been modified and two new functions were added, -\function{format_string()} and \function{currency()}. - -The \function{format()} function's \var{val} parameter could -previously be a string as long as no more than one \%char specifier -appeared; now the parameter must be exactly one \%char specifier with -no surrounding text. An optional \var{monetary} parameter was also -added which, if \code{True}, will use the locale's rules for -formatting currency in placing a separator between groups of three -digits. - -To format strings with multiple \%char specifiers, use the new -\function{format_string()} function that works like \function{format()} -but also supports mixing \%char specifiers with -arbitrary text. - -A new \function{currency()} function was also added that formats a -number according to the current locale's settings. - -(Contributed by Georg Brandl.) -% Patch 1180296 - -\item The \module{mailbox} module underwent a massive rewrite to add -the capability to modify mailboxes in addition to reading them. A new -set of classes that include \class{mbox}, \class{MH}, and -\class{Maildir} are used to read mailboxes, and have an -\method{add(\var{message})} method to add messages, -\method{remove(\var{key})} to remove messages, and -\method{lock()}/\method{unlock()} to lock/unlock the mailbox. The -following example converts a maildir-format mailbox into an mbox-format one: - -\begin{verbatim} -import mailbox - -# 'factory=None' uses email.Message.Message as the class representing -# individual messages. 
-src = mailbox.Maildir('maildir', factory=None) -dest = mailbox.mbox('/tmp/mbox') - -for msg in src: - dest.add(msg) -\end{verbatim} - -(Contributed by Gregory K. Johnson. Funding was provided by Google's -2005 Summer of Code.) - -\item New module: the \module{msilib} module allows creating -Microsoft Installer \file{.msi} files and CAB files. Some support -for reading the \file{.msi} database is also included. -(Contributed by Martin von~L\"owis.) - -\item The \module{nis} module now supports accessing domains other -than the system default domain by supplying a \var{domain} argument to -the \function{nis.match()} and \function{nis.maps()} functions. -(Contributed by Ben Bell.) - -\item The \module{operator} module's \function{itemgetter()} -and \function{attrgetter()} functions now support multiple fields. -A call such as \code{operator.attrgetter('a', 'b')} -will return a function -that retrieves the \member{a} and \member{b} attributes. Combining -this new feature with the \method{sort()} method's \code{key} parameter -lets you easily sort lists using multiple fields. -(Contributed by Raymond Hettinger.) - -\item The \module{optparse} module was updated to version 1.5.1 of the -Optik library. The \class{OptionParser} class gained an -\member{epilog} attribute, a string that will be printed after the -help message, and a \method{destroy()} method to break reference -cycles created by the object. (Contributed by Greg Ward.) - -\item The \module{os} module underwent several changes. The -\member{stat_float_times} variable now defaults to true, meaning that -\function{os.stat()} will now return time values as floats. (This -doesn't necessarily mean that \function{os.stat()} will return times -that are precise to fractions of a second; not all systems support -such precision.) - -Constants named \member{os.SEEK_SET}, \member{os.SEEK_CUR}, and -\member{os.SEEK_END} have been added; these are the parameters to the -\function{os.lseek()} function. 
Two new constants for locking are -\member{os.O_SHLOCK} and \member{os.O_EXLOCK}. - -Two new functions, \function{wait3()} and \function{wait4()}, were -added. They're similar to the \function{waitpid()} function which waits -for a child process to exit and returns a tuple of the process ID and -its exit status, but \function{wait3()} and \function{wait4()} return -additional information. \function{wait3()} doesn't take a process ID -as input, so it waits for any child process to exit and returns a -3-tuple of \var{process-id}, \var{exit-status}, \var{resource-usage} -as returned from the \function{resource.getrusage()} function. -\function{wait4(\var{pid})} does take a process ID. -(Contributed by Chad J. Schroeder.) - -On FreeBSD, the \function{os.stat()} function now returns -times with nanosecond resolution, and the returned object -now has \member{st_gen} and \member{st_birthtime}. -The \member{st_flags} member is also available, if the platform supports it. -(Contributed by Antti Louko and Diego Petten\`o.) -% (Patch 1180695, 1212117) - -\item The Python debugger provided by the \module{pdb} module -can now store lists of commands to execute when a breakpoint is -reached and execution stops. Once breakpoint \#1 has been created, -enter \samp{commands 1} and enter a series of commands to be executed, -finishing the list with \samp{end}. The command list can include -commands that resume execution, such as \samp{continue} or -\samp{next}. (Contributed by Gr\'egoire Dooms.) -% Patch 790710 - -\item The \module{pickle} and \module{cPickle} modules no -longer accept a return value of \code{None} from the -\method{__reduce__()} method; the method must return a tuple of -arguments instead. The ability to return \code{None} was deprecated -in Python 2.4, so this completes the removal of the feature. 
- -\item The \module{pkgutil} module, containing various utility -functions for finding packages, was enhanced to support PEP 302's -import hooks and now also works for packages stored in ZIP-format archives. -(Contributed by Phillip J. Eby.) - -\item The pybench benchmark suite by Marc-Andr\'e~Lemburg is now -included in the \file{Tools/pybench} directory. The pybench suite is -an improvement on the commonly used \file{pystone.py} program because -pybench provides a more detailed measurement of the interpreter's -speed. It times particular operations such as function calls, -tuple slicing, method lookups, and numeric operations, instead of -performing many different operations and reducing the result to a -single number as \file{pystone.py} does. - -\item The \module{pyexpat} module now uses version 2.0 of the Expat parser. -(Contributed by Trent Mick.) - -\item The \class{Queue} class provided by the \module{Queue} module -gained two new methods. \method{join()} blocks until all items in -the queue have been retrieved and all processing work on the items -has been completed. Worker threads call the other new method, -\method{task_done()}, to signal that processing for an item has been -completed. (Contributed by Raymond Hettinger.) - -\item The old \module{regex} and \module{regsub} modules, which have been -deprecated ever since Python 2.0, have finally been deleted. -Other deleted modules: \module{statcache}, \module{tzparse}, -\module{whrandom}. - -\item Also deleted: the \file{lib-old} directory, -which included ancient modules such as \module{dircmp} and -\module{ni}. \file{lib-old} wasn't on the default -\code{sys.path}, so unless your programs explicitly added the directory to -\code{sys.path}, this removal shouldn't affect your code. - -\item The \module{rlcompleter} module is no longer -dependent on importing the \module{readline} module and -therefore now works on non-{\UNIX} platforms. -(Patch from Robert Kiendl.) 
-% Patch #1472854 - -\item The \module{SimpleXMLRPCServer} and \module{DocXMLRPCServer} -classes now have a \member{rpc_paths} attribute that constrains -XML-RPC operations to a limited set of URL paths; the default is -to allow only \code{'/'} and \code{'/RPC2'}. Setting -\member{rpc_paths} to \code{None} or an empty tuple disables -this path checking. -% Bug #1473048 - -\item The \module{socket} module now supports \constant{AF_NETLINK} -sockets on Linux, thanks to a patch from Philippe Biondi. -Netlink sockets are a Linux-specific mechanism for communications -between a user-space process and kernel code; an introductory -article about them is at \url{http://www.linuxjournal.com/article/7356}. -In Python code, netlink addresses are represented as a tuple of 2 integers, -\code{(\var{pid}, \var{group_mask})}. - -Two new methods on socket objects, \method{recv_into(\var{buffer})} and -\method{recvfrom_into(\var{buffer})}, store the received data in an object -that supports the buffer protocol instead of returning the data as a -string. This means you can put the data directly into an array or a -memory-mapped file. - -Socket objects also gained \method{getfamily()}, \method{gettype()}, -and \method{getproto()} accessor methods to retrieve the family, type, -and protocol values for the socket. - -\item New module: the \module{spwd} module provides functions for -accessing the shadow password database on systems that support -shadow passwords. - -\item The \module{struct} module is now faster because it -compiles format strings into \class{Struct} objects -with \method{pack()} and \method{unpack()} methods. This is similar -to how the \module{re} module lets you create compiled regular -expression objects. You can still use the module-level -\function{pack()} and \function{unpack()} functions; they'll create -\class{Struct} objects and cache them. 
Or you can use -\class{Struct} instances directly: - -\begin{verbatim} -s = struct.Struct('ih3s') - -data = s.pack(1972, 187, 'abc') -year, number, name = s.unpack(data) -\end{verbatim} - -You can also pack and unpack data to and from buffer objects directly -using the \method{pack_into(\var{buffer}, \var{offset}, \var{v1}, -\var{v2}, ...)} and \method{unpack_from(\var{buffer}, \var{offset})} -methods. This lets you store data directly into an array or a -memory-mapped file. - -(\class{Struct} objects were implemented by Bob Ippolito at the -NeedForSpeed sprint. Support for buffer objects was added by Martin -Blais, also at the NeedForSpeed sprint.) - -\item The Python developers switched from CVS to Subversion during the 2.5 -development process. Information about the exact build version is -available as the \code{sys.subversion} variable, a 3-tuple of -\code{(\var{interpreter-name}, \var{branch-name}, -\var{revision-range})}. For example, at the time of writing my copy -of 2.5 was reporting \code{('CPython', 'trunk', '45313:45315')}. - -This information is also available to C extensions via the -\cfunction{Py_GetBuildInfo()} function that returns a -string of build information like this: -\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}. -(Contributed by Barry Warsaw.) - -\item Another new function, \function{sys._current_frames()}, returns -the current stack frames for all running threads as a dictionary -mapping thread identifiers to the topmost stack frame currently active -in that thread at the time the function is called. (Contributed by -Tim Peters.) - -\item The \class{TarFile} class in the \module{tarfile} module now has -an \method{extractall()} method that extracts all members from the -archive into the current working directory. It's also possible to set -a different directory as the extraction target, and to unpack only a -subset of the archive's members. 
- -The compression used for a tarfile opened in stream mode can now be -autodetected using the mode \code{'r|*'}. -% patch 918101 -(Contributed by Lars Gust\"abel.) - -\item The \module{threading} module now lets you set the stack size -used when new threads are created. The -\function{stack_size(\optional{\var{size}})} function returns the -currently configured stack size, and supplying the optional \var{size} -parameter sets a new value. Not all platforms support changing the -stack size, but Windows, POSIX threading, and OS/2 all do. -(Contributed by Andrew MacIntyre.) -% Patch 1454481 - -\item The \module{unicodedata} module has been updated to use version 4.1.0 -of the Unicode character database. Version 3.2.0 is required -by some specifications, so it's still available as -\member{unicodedata.ucd_3_2_0}. - -\item New module: the \module{uuid} module generates -universally unique identifiers (UUIDs) according to \rfc{4122}. The -RFC defines several different UUID versions that are generated from a -starting string, from system properties, or purely randomly. This -module contains a \class{UUID} class and -functions named \function{uuid1()}, -\function{uuid3()}, \function{uuid4()}, and -\function{uuid5()} to generate different versions of UUID. (Version 2 UUIDs -are not specified in \rfc{4122} and are not supported by this module.) - -\begin{verbatim} ->>> import uuid ->>> # make a UUID based on the host ID and current time ->>> uuid.uuid1() -UUID('a8098c1a-f86e-11da-bd1a-00112444be1e') - ->>> # make a UUID using an MD5 hash of a namespace UUID and a name ->>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org') -UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e') - ->>> # make a random UUID ->>> uuid.uuid4() -UUID('16fd2706-8baf-433b-82eb-8c7fada847da') - ->>> # make a UUID using a SHA-1 hash of a namespace UUID and a name ->>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org') -UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d') -\end{verbatim} - -(Contributed by Ka-Ping Yee.) 
- -\item The \module{weakref} module's \class{WeakKeyDictionary} and -\class{WeakValueDictionary} types gained new methods for iterating -over the weak references contained in the dictionary. -\method{iterkeyrefs()} and \method{keyrefs()} methods were -added to \class{WeakKeyDictionary}, and -\method{itervaluerefs()} and \method{valuerefs()} were added to -\class{WeakValueDictionary}. (Contributed by Fred L.~Drake, Jr.) - -\item The \module{webbrowser} module received a number of -enhancements. -It's now usable as a script with \code{python -m webbrowser}, taking a -URL as the argument; there are a number of switches -to control the behaviour (\programopt{-n} for a new browser window, -\programopt{-t} for a new tab). New module-level functions, -\function{open_new()} and \function{open_new_tab()}, were added -to support this. The module's \function{open()} function supports an -additional feature, an \var{autoraise} parameter that signals whether -to raise the open window when possible. A number of additional -browsers were added to the supported list such as Firefox, Opera, -Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg -Brandl.) -% Patch #754022 - -\item The \module{xmlrpclib} module now supports returning - \class{datetime} objects for the XML-RPC date type. Supply - \code{use_datetime=True} to the \function{loads()} function - or the \class{Unmarshaller} class to enable this feature. - (Contributed by Skip Montanaro.) -% Patch 1120353 - -\item The \module{zipfile} module now supports the ZIP64 version of the -format, meaning that a .zip archive can now be larger than 4~GiB and -can contain individual files larger than 4~GiB. (Contributed by -Ronald Oussoren.) -% Patch 1446489 - -\item The \module{zlib} module's \class{Compress} and \class{Decompress} -objects now support a \method{copy()} method that makes a copy of the -object's internal state and returns a new -\class{Compress} or \class{Decompress} object. -(Contributed by Chris AtLee.) 
-% Patch 1435422 - -\end{itemize} - - - -%====================================================================== -\subsection{The ctypes package\label{module-ctypes}} - -The \module{ctypes} package, written by Thomas Heller, has been added -to the standard library. \module{ctypes} lets you call arbitrary functions -in shared libraries or DLLs. Long-time users may remember the \module{dl} module, which -provides functions for loading shared libraries and calling functions in them. The \module{ctypes} package is much fancier. - -To load a shared library or DLL, you must create an instance of the -\class{CDLL} class and provide the name or path of the shared library -or DLL. Once that's done, you can call arbitrary functions -by accessing them as attributes of the \class{CDLL} object. - -\begin{verbatim} -import ctypes - -libc = ctypes.CDLL('libc.so.6') -result = libc.printf("Line of output\n") -\end{verbatim} - -Type constructors for the various C types are provided: \function{c_int}, -\function{c_float}, \function{c_double}, \function{c_char_p} (equivalent to \ctype{char *}), and so forth. Unlike Python's types, the C versions are all mutable; you can assign to their \member{value} attribute -to change the wrapped value. Python integers and strings will be automatically -converted to the corresponding C types, but for other types you -must call the correct type constructor. (And I mean \emph{must}; -getting it wrong will often result in the interpreter crashing -with a segmentation fault.) - -You shouldn't use \function{c_char_p} with a Python string when the C function will be modifying the memory area, because Python strings are -supposed to be immutable; breaking this rule will cause puzzling bugs. 
 When you need a modifiable memory area, -use \function{create_string_buffer()}: - -\begin{verbatim} -s = "this is a string" -buf = ctypes.create_string_buffer(s) -libc.strfry(buf) -\end{verbatim} - -C functions are assumed to return integers, but you can set -the \member{restype} attribute of the function object to -change this: - -\begin{verbatim} ->>> libc.atof('2.71828') --1783957616 ->>> libc.atof.restype = ctypes.c_double ->>> libc.atof('2.71828') -2.71828 -\end{verbatim} - -\module{ctypes} also provides a wrapper for Python's C API -as the \code{ctypes.pythonapi} object. This object does \emph{not} -release the global interpreter lock before calling a function, because the lock must be held when calling into the interpreter's code. -There's a \class{py_object()} type constructor that will create a -\ctype{PyObject *} pointer. A simple usage: - -\begin{verbatim} -import ctypes - -d = {} -ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d), - ctypes.py_object("abc"), ctypes.py_object(1)) -# d is now {'abc': 1}. -\end{verbatim} - -Don't forget to use \class{py_object()}; if it's omitted you end -up with a segmentation fault. - -\module{ctypes} has been around for a while, but people still write -and distribute hand-coded extension modules because you can't rely on \module{ctypes} being present. -Perhaps developers will begin to write -Python wrappers atop a library accessed through \module{ctypes} instead -of extension modules, now that \module{ctypes} is included with core Python. 
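The mutable C types and string buffers described in this section can be tried without loading any shared library, so the following sketch does not depend on a particular platform. It is only an illustration and assumes a current Python 3 interpreter, where C strings are \code{bytes} objects rather than the 8-bit strings used in the 2.5 examples above; the variable names are made up for the example.

```python
import ctypes

# C type wrappers are mutable: assigning to .value changes the wrapped number.
counter = ctypes.c_int(5)
counter.value += 2

# create_string_buffer() allocates a modifiable, NUL-terminated char array,
# which is what a C function that writes into its argument should receive.
buf = ctypes.create_string_buffer(b"hello")
buf[0] = b"J"                      # modify the memory in place

print(counter.value)               # 7
print(buf.value)                   # b'Jello'
print(ctypes.sizeof(buf))          # 6: five characters plus the trailing NUL
```

Passing \code{buf} rather than an immutable string is what keeps a C function that writes into its argument from corrupting Python's own string data.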
- -\begin{seealso} - -\seeurl{http://starship.python.net/crew/theller/ctypes/} -{The ctypes web page, with a tutorial, reference, and FAQ.} - -\seeurl{../lib/module-ctypes.html}{The documentation -for the \module{ctypes} module.} - -\end{seealso} - - -%====================================================================== -\subsection{The ElementTree package\label{module-etree}} - -A subset of Fredrik Lundh's ElementTree library for processing XML has -been added to the standard library as \module{xml.etree}. The -available modules are -\module{ElementTree}, \module{ElementPath}, and -\module{ElementInclude} from ElementTree 1.2.6. -The \module{cElementTree} accelerator module is also included. - -The rest of this section will provide a brief overview of using -ElementTree. Full documentation for ElementTree is available at -\url{http://effbot.org/zone/element-index.htm}. - -ElementTree represents an XML document as a tree of element nodes. -The text content of the document is stored as the \member{.text} -and \member{.tail} attributes of element nodes. -(This is one of the major differences between ElementTree and -the Document Object Model; in the DOM there are many different -types of node, including \class{TextNode}.) - -The most commonly used parsing function is \function{parse()}, which -takes either a string (assumed to contain a filename) or a file-like -object and returns an \class{ElementTree} instance: - -\begin{verbatim} -import urllib -from xml.etree import ElementTree as ET - -tree = ET.parse('ex-1.xml') - -feed = urllib.urlopen( - 'http://planet.python.org/rss10.xml') -tree = ET.parse(feed) -\end{verbatim} - -Once you have an \class{ElementTree} instance, you -can call its \method{getroot()} method to get the root \class{Element} node. - -There's also an \function{XML()} function that takes a string literal -and returns an \class{Element} node (not an \class{ElementTree}). 
-This function provides a tidy way to incorporate XML fragments, -approaching the convenience of an XML literal: - -\begin{verbatim} -svg = ET.XML("""<svg width="10px" version="1.0"> - </svg>""") -svg.set('height', '320px') -svg.append(elem1) -\end{verbatim} - -Each XML element supports some dictionary-like and some list-like -access methods. Dictionary-like operations are used to access attribute -values, and list-like operations are used to access child nodes. - -\begin{tableii}{c|l}{code}{Operation}{Result} - \lineii{elem[n]}{Returns n'th child element.} - \lineii{elem[m:n]}{Returns list of child elements from the m'th up to, but not including, the n'th.} - \lineii{len(elem)}{Returns number of child elements.} - \lineii{list(elem)}{Returns list of child elements.} - \lineii{elem.append(elem2)}{Adds \var{elem2} as a child.} - \lineii{elem.insert(index, elem2)}{Inserts \var{elem2} at the specified location.} - \lineii{del elem[n]}{Deletes n'th child element.} - \lineii{elem.keys()}{Returns list of attribute names.} - \lineii{elem.get(name)}{Returns value of attribute \var{name}.} - \lineii{elem.set(name, value)}{Sets new value for attribute \var{name}.} - \lineii{elem.attrib}{Retrieves the dictionary containing attributes.} - \lineii{del elem.attrib[name]}{Deletes attribute \var{name}.} -\end{tableii} - -Comments and processing instructions are also represented as -\class{Element} nodes. To check if a node is a comment or processing -instruction: - -\begin{verbatim} -if elem.tag is ET.Comment: - ... -elif elem.tag is ET.ProcessingInstruction: - ... -\end{verbatim} - -To generate XML output, you should call the -\method{ElementTree.write()} method. Like \function{parse()}, -it can take either a string or a file-like object: - -\begin{verbatim} -# Encoding is US-ASCII -tree.write('output.xml') - -# Encoding is UTF-8 -f = open('output.xml', 'w') -tree.write(f, encoding='utf-8') -\end{verbatim} - -(Caution: the default encoding used for output is ASCII. 
For general -XML work, where an element's name may contain arbitrary Unicode -characters, ASCII isn't a very useful encoding because it will raise -an exception if an element's name contains any characters with values -greater than 127. Therefore, it's best to specify a different -encoding such as UTF-8 that can handle any Unicode character.) - -This section is only a partial description of the ElementTree interfaces. -Please read the package's official documentation for more details. - -\begin{seealso} - -\seeurl{http://effbot.org/zone/element-index.htm} -{Official documentation for ElementTree.} - -\end{seealso} - - -%====================================================================== -\subsection{The hashlib package\label{module-hashlib}} - -A new \module{hashlib} module, written by Gregory P. Smith, -has been added to replace the -\module{md5} and \module{sha} modules. \module{hashlib} adds support -for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). -When available, the module uses OpenSSL for fast platform optimized -implementations of algorithms. - -The old \module{md5} and \module{sha} modules still exist as wrappers -around hashlib to preserve backwards compatibility. The new module's -interface is very close to that of the old modules, but not identical. -The most significant difference is that the constructor functions -for creating new hashing objects are named differently. 
- -\begin{verbatim} -# Old versions -h = md5.md5() -h = md5.new() - -# New version -h = hashlib.md5() - -# Old versions -h = sha.sha() -h = sha.new() - -# New version -h = hashlib.sha1() - -# Hashes that weren't previously available -h = hashlib.sha224() -h = hashlib.sha256() -h = hashlib.sha384() -h = hashlib.sha512() - -# Alternative form -h = hashlib.new('md5') # Provide algorithm as a string -\end{verbatim} - -Once a hash object has been created, its methods are the same as before: -\method{update(\var{string})} hashes the specified string into the -current digest state, \method{digest()} and \method{hexdigest()} -return the digest value as a binary string or a string of hex digits, -and \method{copy()} returns a new hashing object with the same digest state. - -\begin{seealso} - -\seeurl{../lib/module-hashlib.html}{The documentation -for the \module{hashlib} module.} - -\end{seealso} - - -%====================================================================== -\subsection{The sqlite3 package\label{module-sqlite}} - -The pysqlite module (\url{http://www.pysqlite.org}), a wrapper for the -SQLite embedded database, has been added to the standard library under -the package name \module{sqlite3}. - -SQLite is a C library that provides a lightweight disk-based database -that doesn't require a separate server process and allows accessing -the database using a nonstandard variant of the SQL query language. -Some applications can use SQLite for internal data storage. It's also -possible to prototype an application using SQLite and then port the -code to a larger database such as PostgreSQL or Oracle. - -pysqlite was written by Gerhard H\"aring and provides a SQL interface -compliant with the DB-API 2.0 specification described by -\pep{249}. - -If you're compiling the Python source yourself, note that the source -tree doesn't include the SQLite code, only the wrapper module. 
-You'll need to have the SQLite libraries and headers installed before -compiling Python, and the build process will compile the module when -the necessary headers are available. - -To use the module, you must first create a \class{Connection} object -that represents the database. Here the data will be stored in the -\file{/tmp/example} file: - -\begin{verbatim} -conn = sqlite3.connect('/tmp/example') -\end{verbatim} - -You can also supply the special name \samp{:memory:} to create -a database in RAM. - -Once you have a \class{Connection}, you can create a \class{Cursor} -object and call its \method{execute()} method to perform SQL commands: - -\begin{verbatim} -c = conn.cursor() - -# Create table -c.execute('''create table stocks -(date text, trans text, symbol text, - qty real, price real)''') - -# Insert a row of data -c.execute("""insert into stocks - values ('2006-01-05','BUY','RHAT',100,35.14)""") -\end{verbatim} - -Usually your SQL operations will need to use values from Python -variables. You shouldn't assemble your query using Python's string -operations because doing so is insecure; it makes your program -vulnerable to an SQL injection attack. - -Instead, use the DB-API's parameter substitution. Put \samp{?} as a -placeholder wherever you want to use a value, and then provide a tuple -of values as the second argument to the cursor's \method{execute()} -method. (Other database modules may use a different placeholder, -such as \samp{\%s} or \samp{:1}.) For example: - -\begin{verbatim} -# Never do this -- insecure! -symbol = 'IBM' -c.execute("... 
where symbol = '%s'" % symbol) - -# Do this instead -t = (symbol,) -c.execute('select * from stocks where symbol=?', t) - -# Larger example -for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00), - ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00), - ('2006-04-06', 'SELL', 'IBM', 500, 53.00), - ): - c.execute('insert into stocks values (?,?,?,?,?)', t) -\end{verbatim} - -To retrieve data after executing a SELECT statement, you can either -treat the cursor as an iterator, call the cursor's \method{fetchone()} -method to retrieve a single matching row, -or call \method{fetchall()} to get a list of the matching rows. - -This example uses the iterator form: - -\begin{verbatim} ->>> c = conn.cursor() ->>> c.execute('select * from stocks order by price') ->>> for row in c: -... print row -... -(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001) -(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0) -(u'2006-04-06', u'SELL', u'IBM', 500, 53.0) -(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0) ->>> -\end{verbatim} - -For more information about the SQL dialect supported by SQLite, see -\url{http://www.sqlite.org}. - -\begin{seealso} - -\seeurl{http://www.pysqlite.org} -{The pysqlite web page.} - -\seeurl{http://www.sqlite.org} -{The SQLite web page; the documentation describes the syntax and the -available data types for the supported SQL dialect.} - -\seeurl{../lib/module-sqlite3.html}{The documentation -for the \module{sqlite3} module.} - -\seepep{249}{Database API Specification 2.0}{PEP written by -Marc-Andr\'e Lemburg.} - -\end{seealso} - - -%====================================================================== -\subsection{The wsgiref package\label{module-wsgiref}} - -% XXX should this be in a PEP 333 section instead? - -The Web Server Gateway Interface (WSGI) v1.0 defines a standard -interface between web servers and Python web applications and is -described in \pep{333}. The \module{wsgiref} package is a reference -implementation of the WSGI specification. 
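A WSGI application, as \pep{333} defines it, is a callable that takes the request environment and a \code{start_response} function and returns an iterable of body chunks. The sketch below is an illustration for a current Python 3 interpreter, not part of the package: \code{hello_app} and \code{call_app} are hypothetical names, and \function{wsgiref.util.setup_testing_defaults()} is used only to fill in a plausible environment so that the application can be called without starting a server.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application: send a status, headers, and a body."""
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return [b'Hello, WSGI\n']


def call_app(app):
    """Call a WSGI application directly, the way a server would."""
    environ = {}
    setup_testing_defaults(environ)    # fill in REQUEST_METHOD, PATH_INFO, ...
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports instead of writing to a socket.
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(hello_app)
print(status)
print(body)
```

Because the application is an ordinary callable, it can be exercised this way in tests and later handed unchanged to any WSGI-compliant server.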
- -The package includes a basic HTTP server that will run a WSGI -application; this server is useful for debugging but isn't intended for -production use. Setting up a server takes only a few lines of code: - -\begin{verbatim} -from wsgiref import simple_server - -wsgi_app = ... - -host = '' -port = 8000 -httpd = simple_server.make_server(host, port, wsgi_app) -httpd.serve_forever() -\end{verbatim} - -% XXX discuss structure of WSGI applications? -% XXX provide an example using Django or some other framework? - -\begin{seealso} - -\seeurl{http://www.wsgi.org}{A central web site for WSGI-related resources.} - -\seepep{333}{Python Web Server Gateway Interface v1.0}{PEP written by -Phillip J. Eby.} - -\end{seealso} - - -% ====================================================================== -\section{Build and C API Changes\label{build-api}} - -Changes to Python's build process and to the C API include: - -\begin{itemize} - -\item The Python source tree was converted from CVS to Subversion, -in a complex migration procedure that was supervised and flawlessly -carried out by Martin von~L\"owis. The procedure was developed as -\pep{347}. - -\item Coverity, a company that markets a source code analysis tool -called Prevent, provided the results of their examination of the Python -source code. The analysis found about 60 bugs that -were quickly fixed. Many of the bugs were refcounting problems, often -occurring in error-handling code. See -\url{http://scan.coverity.com} for the statistics. - -\item The largest change to the C API came from \pep{353}, -which modifies the interpreter to use a \ctype{Py_ssize_t} type -definition instead of \ctype{int}. See the earlier -section~\ref{pep-353} for a discussion of this change. - -\item The design of the bytecode compiler has changed a great deal, -no longer generating bytecode by traversing the parse tree. 
Instead -the parse tree is converted to an abstract syntax tree (or AST), and it is -the abstract syntax tree that's traversed to produce the bytecode. - -It's possible for Python code to obtain AST objects by using the -\function{compile()} built-in and specifying \code{_ast.PyCF_ONLY_AST} -as the value of the -\var{flags} parameter: - -\begin{verbatim} -from _ast import PyCF_ONLY_AST -ast = compile("""a=0 -for i in range(10): - a += i -""", "<string>", 'exec', PyCF_ONLY_AST) - -assignment = ast.body[0] -for_loop = ast.body[1] -\end{verbatim} - -No official documentation has been written for the AST code yet, but -\pep{339} discusses the design. To start learning about the code, read the -definition of the various AST nodes in \file{Parser/Python.asdl}. A -Python script reads this file and generates a set of C structure -definitions in \file{Include/Python-ast.h}. The -\cfunction{PyParser_ASTFromString()} and -\cfunction{PyParser_ASTFromFile()}, defined in -\file{Include/pythonrun.h}, take Python source as input and return the -root of an AST representing the contents. This AST can then be turned -into a code object by \cfunction{PyAST_Compile()}. For more -information, read the source code, and then ask questions on -python-dev. - -% List of names taken from Jeremy's python-dev post at -% http://mail.python.org/pipermail/python-dev/2005-October/057500.html -The AST code was developed under Jeremy Hylton's management, and -implemented by (in alphabetical order) Brett Cannon, Nick Coghlan, -Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, -Armin Rigo, and Neil Schemenauer, plus the participants in a number of -AST sprints at conferences such as PyCon. - -\item Evan Jones's patch to obmalloc, first described in a talk -at PyCon DC 2005, was applied. Python 2.4 allocated small objects in -256K-sized arenas, but never freed arenas. With this patch, Python -will free arenas when they're empty. 
The net effect is that on some -platforms, when you allocate many objects, Python's memory usage may -actually drop when you delete them and the memory may be returned to -the operating system. (Implemented by Evan Jones, and reworked by Tim -Peters.) - -Note that this change means extension modules must be more careful -when allocating memory. Python's API has many different -functions for allocating memory that are grouped into families. For -example, \cfunction{PyMem_Malloc()}, \cfunction{PyMem_Realloc()}, and -\cfunction{PyMem_Free()} are one family that allocates raw memory, -while \cfunction{PyObject_Malloc()}, \cfunction{PyObject_Realloc()}, -and \cfunction{PyObject_Free()} are another family that's supposed to -be used for creating Python objects. - -Previously these different families all reduced to the platform's -\cfunction{malloc()} and \cfunction{free()} functions. This meant -it didn't matter if you got things wrong and allocated memory with the -\cfunction{PyMem} function but freed it with the \cfunction{PyObject} -function. With 2.5's changes to obmalloc, these families now do different -things and mismatches will probably result in a segfault. You should -carefully test your C extension modules with Python 2.5. - -\item The built-in set types now have an official C API. Call -\cfunction{PySet_New()} and \cfunction{PyFrozenSet_New()} to create a -new set, \cfunction{PySet_Add()} and \cfunction{PySet_Discard()} to -add and remove elements, and \cfunction{PySet_Contains} and -\cfunction{PySet_Size} to examine the set's state. -(Contributed by Raymond Hettinger.) - -\item C code can now obtain information about the exact revision -of the Python interpreter by calling the -\cfunction{Py_GetBuildInfo()} function that returns a -string of build information like this: -\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}. -(Contributed by Barry Warsaw.) 
- -\item Two new macros can be used to indicate C functions that are -local to the current file so that a faster calling convention can be -used. \cfunction{Py_LOCAL(\var{type})} declares the function as -returning a value of the specified \var{type} and uses a fast-calling -qualifier. \cfunction{Py_LOCAL_INLINE(\var{type})} does the same thing -and also requests the function be inlined. If -\cfunction{PY_LOCAL_AGGRESSIVE} is defined before \file{python.h} is -included, a set of more aggressive optimizations are enabled for the -module; you should benchmark the results to find out if these -optimizations actually make the code faster. (Contributed by Fredrik -Lundh at the NeedForSpeed sprint.) - -\item \cfunction{PyErr_NewException(\var{name}, \var{base}, -\var{dict})} can now accept a tuple of base classes as its \var{base} -argument. (Contributed by Georg Brandl.) - -\item The \cfunction{PyErr_Warn()} function for issuing warnings -is now deprecated in favour of \cfunction{PyErr_WarnEx(category, -message, stacklevel)} which lets you specify the number of stack -frames separating this function and the caller. A \var{stacklevel} of -1 is the function calling \cfunction{PyErr_WarnEx()}, 2 is the -function above that, and so forth. (Added by Neal Norwitz.) - -\item The CPython interpreter is still written in C, but -the code can now be compiled with a {\Cpp} compiler without errors. -(Implemented by Anthony Baxter, Martin von~L\"owis, Skip Montanaro.) - -\item The \cfunction{PyRange_New()} function was removed. It was -never documented, never used in the core code, and had dangerously lax -error checking. 
In the unlikely case that your extensions were using -it, you can replace it by something like the following: -\begin{verbatim} -range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll", - start, stop, step); -\end{verbatim} - -\end{itemize} - - -%====================================================================== -\subsection{Port-Specific Changes\label{ports}} - -\begin{itemize} - -\item MacOS X (10.3 and higher): dynamic loading of modules -now uses the \cfunction{dlopen()} function instead of MacOS-specific -functions. - -\item MacOS X: a \longprogramopt{enable-universalsdk} switch was added -to the \program{configure} script that compiles the interpreter as a -universal binary able to run on both PowerPC and Intel processors. -(Contributed by Ronald Oussoren.) - -\item Windows: \file{.dll} is no longer supported as a filename extension for -extension modules. \file{.pyd} is now the only filename extension that will -be searched for. - -\end{itemize} - - -%====================================================================== -\section{Porting to Python 2.5\label{porting}} - -This section lists previously described changes that may require -changes to your code: - -\begin{itemize} - -\item ASCII is now the default encoding for modules. It's now -a syntax error if a module contains string literals with 8-bit -characters but doesn't have an encoding declaration. In Python 2.4 -this triggered a warning, not a syntax error. - -\item Previously, the \member{gi_frame} attribute of a generator -was always a frame object. Because of the \pep{342} changes -described in section~\ref{pep-342}, it's now possible -for \member{gi_frame} to be \code{None}. - -\item A new warning, \class{UnicodeWarning}, is triggered when -you attempt to compare a Unicode string and an 8-bit string that can't -be converted to Unicode using the default ASCII encoding. Previously -such comparisons would raise a \class{UnicodeDecodeError} exception. 
- -\item Library: the \module{csv} module is now stricter about multi-line quoted -fields. If your files contain newlines embedded within fields, the -input should be split into lines in a manner which preserves the -newline characters. - -\item Library: the \module{locale} module's -\function{format()} function would previously -accept any string as long as no more than one \%char specifier -appeared. In Python 2.5, the argument must be exactly one \%char -specifier with no surrounding text. - -\item Library: The \module{pickle} and \module{cPickle} modules no -longer accept a return value of \code{None} from the -\method{__reduce__()} method; the method must return a tuple of -arguments instead. The modules also no longer accept the deprecated -\var{bin} keyword parameter. - -\item Library: The \module{SimpleXMLRPCServer} and \module{DocXMLRPCServer} -classes now have a \member{rpc_paths} attribute that constrains -XML-RPC operations to a limited set of URL paths; the default is -to allow only \code{'/'} and \code{'/RPC2'}. Setting -\member{rpc_paths} to \code{None} or an empty tuple disables -this path checking. - -\item C API: Many functions now use \ctype{Py_ssize_t} -instead of \ctype{int} to allow processing more data on 64-bit -machines. Extension code may need to make the same change to avoid -warnings and to support 64-bit machines. See the earlier -section~\ref{pep-353} for a discussion of this change. - -\item C API: -The obmalloc changes mean that -you must be careful not to mix usage -of the \cfunction{PyMem_*()} and \cfunction{PyObject_*()} -families of functions. Memory allocated with -one family's \cfunction{*_Malloc()} must be -freed with the corresponding family's \cfunction{*_Free()} function.
- -\end{itemize} - - -%====================================================================== -\section{Acknowledgements \label{acks}} - -The author would like to thank the following people for offering -suggestions, corrections and assistance with various drafts of this -article: Georg Brandl, Nick Coghlan, Phillip J. Eby, Lars Gust\"abel, -Raymond Hettinger, Ralf W. Grosse-Kunstleve, Kent Johnson, Iain Lowe, -Martin von~L\"owis, Fredrik Lundh, Andrew McNamara, Skip Montanaro, -Gustavo Niemeyer, Paul Prescod, James Pryor, Mike Rovner, Scott -Weikart, Barry Warsaw, Thomas Wouters. - -\end{document} diff --git a/Doc/whatsnew/whatsnew26.tex b/Doc/whatsnew/whatsnew26.tex deleted file mode 100644 index 5d2373f..0000000 --- a/Doc/whatsnew/whatsnew26.tex +++ /dev/null @@ -1,268 +0,0 @@ -\documentclass{howto} -\usepackage{distutils} -% $Id$ - -% Rules for maintenance: -% -% * Anyone can add text to this document. Do not spend very much time -% on the wording of your changes, because your text will probably -% get rewritten to some degree. -% -% * The maintainer will go through Misc/NEWS periodically and add -% changes; it's therefore more important to add your changes to -% Misc/NEWS than to this file. -% -% * This is not a complete list of every single change; completeness -% is the purpose of Misc/NEWS. Some changes I consider too small -% or esoteric to include. If such a change is added to the text, -% I'll just remove it. (This is another reason you shouldn't spend -% too much time on writing your addition.) -% -% * If you want to draw your new text to the attention of the -% maintainer, add 'XXX' to the beginning of the paragraph or -% section. -% -% * It's OK to just add a fragmentary note about a change. For -% example: "XXX Describe the transmogrify() function added to the -% socket module." The maintainer will research the change and -% write the necessary text. 
-% -% * You can comment out your additions if you like, but it's not -% necessary (especially when a final release is some months away). -% -% * Credit the author of a patch or bugfix. Just the name is -% sufficient; the e-mail address isn't necessary. -% -% * It's helpful to add the bug/patch number as a comment: -% -% % Patch 12345 -% XXX Describe the transmogrify() function added to the socket -% module. -% (Contributed by P.Y. Developer.) -% -% This saves the maintainer the effort of going through the SVN log -% when researching a change. - -\title{What's New in Python 2.6} -\release{0.0} -\author{A.M. Kuchling} -\authoraddress{\email{amk@amk.ca}} - -\begin{document} -\maketitle -\tableofcontents - -This article explains the new features in Python 2.6. No release date -for Python 2.6 has been set; it will probably be released in mid 2008. - -% Compare with previous release in 2 - 3 sentences here. - -This article doesn't attempt to provide a complete specification of -the new features, but instead provides a convenient overview. For -full details, you should refer to the documentation for Python 2.6. -% add hyperlink when the documentation becomes available online. -If you want to understand the complete implementation and design -rationale, refer to the PEP for a particular new feature. - - -%====================================================================== - -% Large, PEP-level features and changes should be described here. - -% Should there be a new section here for 3k migration? -% Or perhaps a more general section describing module changes/deprecation? -% sets module deprecated - -%====================================================================== -\section{Other Language Changes} - -Here are all of the changes that Python 2.6 makes to the core Python -language. 
- -\begin{itemize} - -% Bug 1569356 -\item An obscure change: when you use the \function{locals()} -function inside a \keyword{class} statement, the resulting dictionary -no longer includes free variables. (Free variables, in this case, are -variables referred to in the \keyword{class} statement -that aren't attributes of the class.) - -\end{itemize} - - -%====================================================================== -\subsection{Optimizations} - -\begin{itemize} - -% Patch 1624059 -\item Internally, a bit is now set in type objects to indicate some of -the standard built-in types. This speeds up checking if an object is -a subclass of one of these types. (Contributed by Neal Norwitz.) - -\end{itemize} - -The net result of the 2.6 optimizations is that Python 2.6 runs the -pystone benchmark around XX\% faster than Python 2.5. - - -%====================================================================== -\section{New, Improved, and Deprecated Modules} - -As usual, Python's standard library received a number of enhancements and -bug fixes. Here's a partial list of the most notable changes, sorted -alphabetically by module name. Consult the -\file{Misc/NEWS} file in the source tree for a more -complete list of changes, or look through the CVS logs for all the -details. - -\begin{itemize} - -\item New data type in the \module{collections} module: -\class{NamedTuple(\var{typename}, \var{fieldnames})} is a factory function that -creates subclasses of the standard tuple whose fields are accessible -by name as well as index. For example: - -\begin{verbatim} -var_type = collections.NamedTuple('variable', - 'id name type size') -var = var_type(1, 'frequency', 'int', 4) - -print var[0], var.id # Equivalent -print var[2], var.type # Equivalent -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item New method in the \module{curses} module: -for a window, \method{chgat()} changes the display attributes for a -certain number of characters on a single line.
- -\begin{verbatim} -# Boldface text starting at y=0,x=21 -# and affecting the rest of the line. -stdscr.chgat(0,21, curses.A_BOLD) -\end{verbatim} - -(Contributed by Fabian Kreutz.) - -\item The \module{gopherlib} module has been removed. - -\item New function in the \module{heapq} module: -\function{merge(iter1, iter2, ...)} -takes any number of iterables that return data -\emph{in sorted order}, -and -returns a new iterator that returns the contents of -all the iterators, also in sorted order. For example: - -\begin{verbatim} -heapq.merge([1, 3, 5, 9], [2, 8, 16]) -> - [1, 2, 3, 5, 8, 9, 16] -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item New function in the \module{itertools} module: -\function{izip_longest(iter1, iter2, ...\optional{, fillvalue})} -makes tuples from each of the elements; if some of the iterables -are shorter than others, the missing values -are set to \var{fillvalue}. For example: - -\begin{verbatim} -itertools.izip_longest([1,2,3], [1,2,3,4,5]) -> - [(1, 1), (2, 2), (3, 3), (None, 4), (None, 5)] -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item The \module{macfs} module has been removed. This in turn -required the \function{macostools.touched()} function to be removed -because it depended on the \module{macfs} module. - -% Patch #1490190 -\item New functions in the \module{posix} module: \function{chflags()} -and \function{lchflags()} are wrappers for the corresponding system -calls (where they're available). Constants for the flag values are -defined in the \module{stat} module; some possible values include -\constant{UF_IMMUTABLE} to signal the file may not be changed and -\constant{UF_APPEND} to indicate that data can only be appended to the -file. (Contributed by M. Levinson.) - -\item The \module{rgbimg} module has been removed. - -\item The \module{smtplib} module now supports SMTP over -SSL thanks to the addition of the \class{SMTP_SSL} class. 
-This class supports an interface identical to the existing \class{SMTP} -class. (Contributed by Monty Taylor.) - -\item The \module{test.test_support} module now contains a -\function{EnvironmentVarGuard} context manager that -supports temporarily changing environment variables and -automatically restores them to their old values. -(Contributed by Brett Cannon.) - -\end{itemize} - - -%====================================================================== -% whole new modules get described in \subsections here - - -% ====================================================================== -\section{Build and C API Changes} - -Changes to Python's build process and to the C API include: - -\begin{itemize} - -\item Detailed changes are listed here. - -\end{itemize} - - -%====================================================================== -\subsection{Port-Specific Changes} - -Platform-specific changes go here. - - -%====================================================================== -\section{Other Changes and Fixes \label{section-other}} - -As usual, there were a bunch of other improvements and bugfixes -scattered throughout the source tree. A search through the change -logs finds there were XXX patches applied and YYY bugs fixed between -Python 2.5 and 2.6. Both figures are likely to be underestimates. - -Some of the more notable changes are: - -\begin{itemize} - -\item Details go here. - -\end{itemize} - - -%====================================================================== -\section{Porting to Python 2.6} - -This section lists previously described changes that may require -changes to your code: - -\begin{itemize} - -\item Everything is all in the details! - -\end{itemize} - - -%====================================================================== -\section{Acknowledgements \label{acks}} - -The author would like to thank the following people for offering -suggestions, corrections and assistance with various drafts of this -article: . 
- -\end{document} diff --git a/Doc/whatsnew/whatsnew30.tex b/Doc/whatsnew/whatsnew30.tex deleted file mode 100644 index f52ca37..0000000 --- a/Doc/whatsnew/whatsnew30.tex +++ /dev/null @@ -1,178 +0,0 @@ -\documentclass{howto} -\usepackage{distutils} -% $Id: whatsnew26.tex 55506 2007-05-22 07:43:29Z neal.norwitz $ - -% Rules for maintenance: -% -% * Anyone can add text to this document. Do not spend very much time -% on the wording of your changes, because your text will probably -% get rewritten to some degree. -% -% * The maintainer will go through Misc/NEWS periodically and add -% changes; it's therefore more important to add your changes to -% Misc/NEWS than to this file. -% -% * This is not a complete list of every single change; completeness -% is the purpose of Misc/NEWS. Some changes I consider too small -% or esoteric to include. If such a change is added to the text, -% I'll just remove it. (This is another reason you shouldn't spend -% too much time on writing your addition.) -% -% * If you want to draw your new text to the attention of the -% maintainer, add 'XXX' to the beginning of the paragraph or -% section. -% -% * It's OK to just add a fragmentary note about a change. For -% example: "XXX Describe the transmogrify() function added to the -% socket module." The maintainer will research the change and -% write the necessary text. -% -% * You can comment out your additions if you like, but it's not -% necessary (especially when a final release is some months away). -% -% * Credit the author of a patch or bugfix. Just the name is -% sufficient; the e-mail address isn't necessary. -% -% * It's helpful to add the bug/patch number as a comment: -% -% % Patch 12345 -% XXX Describe the transmogrify() function added to the socket -% module. -% (Contributed by P.Y. Developer.) -% -% This saves the maintainer the effort of going through the SVN log -% when researching a change. - -\title{What's New in Python 3.0} -\release{0.0} -\author{A.M. 
Kuchling} -\authoraddress{\email{amk@amk.ca}} - -\begin{document} -\maketitle -\tableofcontents - -This article explains the new features in Python 3.0. No release date -for Python 3.0 has been set; it will probably be released in mid 2008. - -% Compare with previous release in 2 - 3 sentences here. - -This article doesn't attempt to provide a complete specification of -the new features, but instead provides a convenient overview. For -full details, you should refer to the documentation for Python 3.0. -% add hyperlink when the documentation becomes available online. -If you want to understand the complete implementation and design -rationale, refer to the PEP for a particular new feature. - - -%====================================================================== - -% Large, PEP-level features and changes should be described here. - -% Should there be a new section here for 3k migration? -% Or perhaps a more general section describing module changes/deprecation? -% sets module deprecated - -%====================================================================== -\section{Other Language Changes} - -Here are all of the changes that Python 3.0 makes to the core Python -language. - -\begin{itemize} - -\item Detailed changes are listed here. - -\end{itemize} - - -%====================================================================== -\subsection{Optimizations} - -\begin{itemize} - -\item Detailed changes are listed here. - -\end{itemize} - -The net result of the 3.0 optimizations is that Python 3.0 runs the -pystone benchmark around XX\% slower than Python 2.6. - - -%====================================================================== -\section{New, Improved, and Deprecated Modules} - -As usual, Python's standard library received a number of enhancements and -bug fixes. Here's a partial list of the most notable changes, sorted -alphabetically by module name.
Consult the -\file{Misc/NEWS} file in the source tree for a more -complete list of changes, or look through the CVS logs for all the -details. - -\begin{itemize} - -\item Detailed changes are listed here. - -\end{itemize} - - -%====================================================================== -% whole new modules get described in \subsections here - - -% ====================================================================== -\section{Build and C API Changes} - -Changes to Python's build process and to the C API include: - -\begin{itemize} - -\item Detailed changes are listed here. - -\end{itemize} - - -%====================================================================== -\subsection{Port-Specific Changes} - -Platform-specific changes go here. - - -%====================================================================== -\section{Other Changes and Fixes \label{section-other}} - -As usual, there were a bunch of other improvements and bugfixes -scattered throughout the source tree. A search through the change -logs finds there were XXX patches applied and YYY bugs fixed between -Python 2.6 and 3.0. Both figures are likely to be underestimates. - -Some of the more notable changes are: - -\begin{itemize} - -\item Details go here. - -\end{itemize} - - -%====================================================================== -\section{Porting to Python 3.0} - -This section lists previously described changes that may require -changes to your code: - -\begin{itemize} - -\item Everything is all in the details! - -\end{itemize} - - -%====================================================================== -\section{Acknowledgements \label{acks}} - -The author would like to thank the following people for offering -suggestions, corrections and assistance with various drafts of this -article: . - -\end{document} |