author | Georg Brandl <georg@python.org> | 2007-08-15 14:28:22 (GMT) |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2007-08-15 14:28:22 (GMT) |
commit | 116aa62bf54a39697e25f21d6cf6799f7faa1349 (patch) | |
tree | 8db5729518ed4ca88e26f1e26cc8695151ca3eb3 /Doc/whatsnew | |
parent | 739c01d47b9118d04e5722333f0e6b4d0c8bdd9e (diff) | |
Move the 3k reST doc tree in place.
Diffstat (limited to 'Doc/whatsnew')
-rw-r--r-- | Doc/whatsnew/2.0.rst | 1207 | ||||
-rw-r--r-- | Doc/whatsnew/2.1.rst | 794 | ||||
-rw-r--r-- | Doc/whatsnew/2.2.rst | 1269 | ||||
-rw-r--r-- | Doc/whatsnew/2.3.rst | 2084 | ||||
-rw-r--r-- | Doc/whatsnew/2.4.rst | 1571 | ||||
-rw-r--r-- | Doc/whatsnew/2.5.rst | 2286 | ||||
-rw-r--r-- | Doc/whatsnew/2.6.rst | 236 | ||||
-rw-r--r-- | Doc/whatsnew/3.0.rst | 161 |
8 files changed, 9608 insertions, 0 deletions
diff --git a/Doc/whatsnew/2.0.rst b/Doc/whatsnew/2.0.rst new file mode 100644 index 0000000..302986c --- /dev/null +++ b/Doc/whatsnew/2.0.rst @@ -0,0 +1,1207 @@ +**************************** + What's New in Python 2.0 +**************************** + +:Author: A.M. Kuchling and Moshe Zadka + +.. |release| replace:: 1.02 + +.. % $Id: whatsnew20.tex 51211 2006-08-11 14:57:12Z thomas.wouters $ + + +Introduction +============ + +A new release of Python, version 2.0, was released on October 16, 2000. This +article covers the exciting new features in 2.0, highlights some other useful +changes, and points out a few incompatible changes that may require rewriting +code. + +Python's development never completely stops between releases, and a steady flow +of bug fixes and improvements is always being submitted. A host of minor fixes, +a few optimizations, additional docstrings, and better error messages went into +2.0; to list them all would be impossible, but they're certainly significant. +Consult the publicly-available CVS logs if you want to see the full list. This +progress is due to the five developers working for PythonLabs now getting +paid to spend their days fixing bugs, and also due to the improved communication +resulting from moving to SourceForge. + +.. % ====================================================================== + + +What About Python 1.6? +====================== + +Python 1.6 can be thought of as the Contractual Obligations Python release. +After the core development team left CNRI in May 2000, CNRI requested that a 1.6 +release be created, containing all the work on Python that had been performed at +CNRI. Python 1.6 therefore represents the state of the CVS tree as of May 2000, +with the most significant new feature being Unicode support. Development +continued after May, of course, so the 1.6 tree received a few fixes to ensure +that it's forward-compatible with Python 2.0. 1.6 is therefore part of Python's +evolution, and not a side branch. 
+ +So, should you take much interest in Python 1.6? Probably not. The 1.6final +and 2.0beta1 releases were made on the same day (September 5, 2000), the plan +being to finalize Python 2.0 within a month or so. If you have applications to +maintain, there seems little point in breaking things by moving to 1.6, fixing +them, and then having another round of breakage within a month by moving to 2.0; +you're better off just going straight to 2.0. Most of the really interesting +features described in this document are only in 2.0, because a lot of work was +done between May and September. + +.. % ====================================================================== + + +New Development Process +======================= + +The most important change in Python 2.0 may not be to the code at all, but to +how Python is developed: in May 2000 the Python developers began using the tools +made available by SourceForge for storing source code, tracking bug reports, +and managing the queue of patch submissions. To report bugs or submit patches +for Python 2.0, use the bug tracking and patch manager tools available from +Python's project page, located at http://sourceforge.net/projects/python/. + +The most important of the services now hosted at SourceForge is the Python CVS +tree, the version-controlled repository containing the source code for Python. +Previously, there were roughly 7 or so people who had write access to the CVS +tree, and all patches had to be inspected and checked in by one of the people on +this short list. Obviously, this wasn't very scalable. By moving the CVS tree +to SourceForge, it became possible to grant write access to more people; as of +September 2000 there were 27 people able to check in changes, a fourfold +increase. This makes possible large-scale changes that wouldn't be attempted if +they'd have to be filtered through the small group of core developers. 
For +example, one day Peter Schneider-Kamp took it into his head to drop K&R C +compatibility and convert the C source for Python to ANSI C. After getting +approval on the python-dev mailing list, he launched into a flurry of checkins +that lasted about a week, other developers joined in to help, and the job was +done. If there were only 5 people with write access, probably that task would +have been viewed as "nice, but not worth the time and effort needed" and it +would never have gotten done. + +The shift to using SourceForge's services has resulted in a remarkable increase +in the speed of development. Patches now get submitted, commented on, revised +by people other than the original submitter, and bounced back and forth between +people until the patch is deemed worth checking in. Bugs are tracked in one +central location and can be assigned to a specific person for fixing, and we can +count the number of open bugs to measure progress. This didn't come without a +cost: developers now have more e-mail to deal with, more mailing lists to +follow, and special tools had to be written for the new environment. For +example, SourceForge sends default patch and bug notification e-mail messages +that are completely unhelpful, so Ka-Ping Yee wrote an HTML screen-scraper that +sends more useful messages. + +The ease of adding code caused a few initial growing pains, such as code being +checked in before it was ready or without getting clear agreement from the +developer group. The approval process that has emerged is somewhat similar to +that used by the Apache group. Developers can vote +1, +0, -0, or -1 on a patch; ++1 and -1 denote acceptance or rejection, while +0 and -0 mean the developer is +mostly indifferent to the change, though with a slight positive or negative +slant. 
The most significant change from the Apache model is that the voting is +essentially advisory, letting Guido van Rossum, who has Benevolent Dictator For +Life status, know what the general opinion is. He can still ignore the result of +a vote, and approve or reject a change even if the community disagrees with him. + +Producing an actual patch is the last step in adding a new feature, and is +usually easy compared to the earlier task of coming up with a good design. +Discussions of new features can often explode into lengthy mailing list threads, +making the discussion hard to follow, and no one can read every posting to +python-dev. Therefore, a relatively formal process has been set up to write +Python Enhancement Proposals (PEPs), modelled on the Internet RFC process. PEPs +are draft documents that describe a proposed new feature, and are continually +revised until the community reaches a consensus, either accepting or rejecting +the proposal. Quoting from the introduction to PEP 1, "PEP Purpose and +Guidelines": + + +.. epigraph:: + + PEP stands for Python Enhancement Proposal. A PEP is a design document + providing information to the Python community, or describing a new feature for + Python. The PEP should provide a concise technical specification of the feature + and a rationale for the feature. + + We intend PEPs to be the primary mechanisms for proposing new features, for + collecting community input on an issue, and for documenting the design decisions + that have gone into Python. The PEP author is responsible for building + consensus within the community and documenting dissenting opinions. + +Read the rest of PEP 1 for the details of the PEP editorial process, style, and +format. PEPs are kept in the Python CVS tree on SourceForge, though they're not +part of the Python 2.0 distribution, and are also available in HTML form from +http://www.python.org/peps/. 
As of September 2000, there are 25 PEPs, ranging +from PEP 201, "Lockstep Iteration", to PEP 225, "Elementwise/Objectwise +Operators". + +.. % ====================================================================== + + +Unicode +======= + +The largest new feature in Python 2.0 is a new fundamental data type: Unicode +strings. Unicode uses 16-bit numbers to represent characters instead of the +8-bit number used by ASCII, meaning that 65,536 distinct characters can be +supported. + +The final interface for Unicode support was arrived at through countless often- +stormy discussions on the python-dev mailing list, and mostly implemented by +Marc-André Lemburg, based on a Unicode string type implementation by Fredrik +Lundh. A detailed explanation of the interface was written up as :pep:`100`, +"Python Unicode Integration". This article will simply cover the most +significant points about the Unicode interfaces. + +In Python source code, Unicode strings are written as ``u"string"``. Arbitrary +Unicode characters can be written using a new escape sequence, ``\uHHHH``, where +*HHHH* is a 4-digit hexadecimal number from 0000 to FFFF. The existing +``\xHHHH`` escape sequence can also be used, and octal escapes can be used for +characters up to U+01FF, which is represented by ``\777``. + +Unicode strings, just like regular strings, are an immutable sequence type. +They can be indexed and sliced, but not modified in place. Unicode strings have +an ``encode( [encoding] )`` method that returns an 8-bit string in the desired +encoding. Encodings are named by strings, such as ``'ascii'``, ``'utf-8'``, +``'iso-8859-1'``, or whatever. A codec API is defined for implementing and +registering new encodings that are then available throughout a Python program. 
+If an encoding isn't specified, the default encoding is usually 7-bit ASCII, +though it can be changed for your Python installation by calling the +:func:`sys.setdefaultencoding(encoding)` function in a customised version of +:file:`site.py`. + +Combining 8-bit and Unicode strings always coerces to Unicode, using the default +ASCII encoding; the result of ``'a' + u'bc'`` is ``u'abc'``. + +New built-in functions have been added, and existing built-ins modified to +support Unicode: + +* ``unichr(ch)`` returns a Unicode string 1 character long, containing the + character *ch*. + +* ``ord(u)``, where *u* is a 1-character regular or Unicode string, returns the + number of the character as an integer. + +* ``unicode(string [, encoding] [, errors] )`` creates a Unicode string + from an 8-bit string. ``encoding`` is a string naming the encoding to use. The + ``errors`` parameter specifies the treatment of characters that are invalid for + the current encoding; passing ``'strict'`` as the value causes an exception to + be raised on any encoding error, while ``'ignore'`` causes errors to be silently + ignored and ``'replace'`` uses U+FFFD, the official replacement character, in + case of any problems. + +* The :keyword:`exec` statement, and various built-ins such as ``eval()``, + ``getattr()``, and ``setattr()`` will also accept Unicode strings as well as + regular strings. (It's possible that the process of fixing this missed some + built-ins; if you find a built-in function that accepts strings but doesn't + accept Unicode strings at all, please report it as a bug.) + +A new module, :mod:`unicodedata`, provides an interface to Unicode character +properties. For example, ``unicodedata.category(u'A')`` returns the 2-character +string 'Lu', the 'L' denoting it's a letter, and 'u' meaning that it's +uppercase. ``unicodedata.bidirectional(u'\u0660')`` returns 'AN', meaning that +U+0660 is an Arabic number. 
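The ``unicodedata`` lookups and the ``encode()`` method described above can be tried out as follows. This is a minimal sketch, not part of the original article: it is written for a current Python 3 interpreter, where every ``str`` is already a Unicode string and the ``u''`` prefix is optional, and the variable names are made up for illustration.

```python
import unicodedata

# Assumption: Python 3, where plain string literals are Unicode strings.
arabic_zero = '\u0660'  # ARABIC-INDIC DIGIT ZERO

# General category: 'L' for letter, 'u' for uppercase.
category_of_A = unicodedata.category('A')

# Bidirectional class: 'AN' marks an Arabic number.
bidi_of_zero = unicodedata.bidirectional(arabic_zero)

# encode() converts the text to 8-bit data in the named encoding.
utf8_bytes = arabic_zero.encode('utf-8')

# With errors='replace', characters that cannot be encoded become '?'.
ascii_bytes = arabic_zero.encode('ascii', 'replace')

print(category_of_A, bidi_of_zero, utf8_bytes, ascii_bytes)
```

When decoding rather than encoding, the ``'replace'`` handler substitutes U+FFFD instead of ``'?'``, which matches the description of ``unicode()`` above.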
+ +The :mod:`codecs` module contains functions to look up existing encodings and +register new ones. Unless you want to implement a new encoding, you'll most +often use the :func:`codecs.lookup(encoding)` function, which returns a +4-element tuple: ``(encode_func, decode_func, stream_reader, stream_writer)``. + +* *encode_func* is a function that takes a Unicode string, and returns a 2-tuple + ``(string, length)``. *string* is an 8-bit string containing a portion (perhaps + all) of the Unicode string converted into the given encoding, and *length* tells + you how much of the Unicode string was converted. + +* *decode_func* is the opposite of *encode_func*, taking an 8-bit string and + returning a 2-tuple ``(ustring, length)``, consisting of the resulting Unicode + string *ustring* and the integer *length* telling how much of the 8-bit string + was consumed. + +* *stream_reader* is a class that supports decoding input from a stream. + *stream_reader(file_obj)* returns an object that supports the :meth:`read`, + :meth:`readline`, and :meth:`readlines` methods. These methods will all + translate from the given encoding and return Unicode strings. + +* *stream_writer*, similarly, is a class that supports encoding output to a + stream. *stream_writer(file_obj)* returns an object that supports the + :meth:`write` and :meth:`writelines` methods. These methods expect Unicode + strings, translating them to the given encoding on output. + +For example, the following code writes a Unicode string into a file, encoding +it as UTF-8:: + + import codecs + + unistr = u'\u0660\u2000ab ...' 
+ + (UTF8_encode, UTF8_decode, + UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8') + + output = UTF8_streamwriter( open( '/tmp/output', 'wb') ) + output.write( unistr ) + output.close() + +The following code would then read UTF-8 input from the file:: + + input = UTF8_streamreader( open( '/tmp/output', 'rb') ) + print repr(input.read()) + input.close() + +Unicode-aware regular expressions are available through the :mod:`re` module, +which has a new underlying implementation called SRE written by Fredrik Lundh of +Secret Labs AB. + +A ``-U`` command line option was added which causes the Python compiler to +interpret all string literals as Unicode string literals. This is intended to be +used in testing and future-proofing your Python code, since some future version +of Python may drop support for 8-bit strings and provide only Unicode strings. + +.. % ====================================================================== + + +List Comprehensions +=================== + +Lists are a workhorse data type in Python, and many programs manipulate a list +at some point. Two common operations on lists are to loop over them, and either +pick out the elements that meet a certain criterion, or apply some function to +each element. For example, given a list of strings, you might want to pull out +all the strings containing a given substring, or strip off trailing whitespace +from each line. + +The existing :func:`map` and :func:`filter` functions can be used for this +purpose, but they require a function as one of their arguments. This is fine if +there's an existing built-in function that can be passed directly, but if there +isn't, you have to create a little function to do the required work, and +Python's scoping rules make the result ugly if the little function needs +additional information. Take the first example in the previous paragraph, +finding all the strings in the list containing a given substring. 
You could +write the following to do it:: + + # Given the list L, make a list of all strings + # containing the substring S. + sublist = filter( lambda s, substring=S: + string.find(s, substring) != -1, + L) + +Because of Python's scoping rules, a default argument is used so that the +anonymous function created by the :keyword:`lambda` expression knows what +substring is being searched for. List comprehensions make this cleaner:: + + sublist = [ s for s in L if string.find(s, S) != -1 ] + +List comprehensions have the form:: + + [ expression for expr1 in sequence1 + for expr2 in sequence2 ... + for exprN in sequenceN + if condition ] + +The :keyword:`for`...\ :keyword:`in` clauses contain the sequences to be +iterated over. The sequences do not have to be the same length, because they +are *not* iterated over in parallel, but from left to right; this is explained +more clearly in the following paragraphs. The elements of the generated list +will be the successive values of *expression*. The final :keyword:`if` clause +is optional; if present, *expression* is only evaluated and added to the result +if *condition* is true. + +To make the semantics very clear, a list comprehension is equivalent to the +following Python code:: + + for expr1 in sequence1: + for expr2 in sequence2: + ... + for exprN in sequenceN: + if (condition): + # Append the value of + # the expression to the + # resulting list. + +This means that when there are multiple :keyword:`for`...\ :keyword:`in` +clauses, the length of the resulting list will be equal to the product of the +lengths of all the sequences. If you have two lists of length 3, the output list +is 9 elements long:: + + >>> seq1 = 'abc' + >>> seq2 = (1,2,3) + >>> [ (x,y) for x in seq1 for y in seq2] + [('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1), + ('c', 2), ('c', 3)] + +To avoid introducing an ambiguity into Python's grammar, if *expression* is +creating a tuple, it must be surrounded with parentheses. 
The first list +comprehension below is a syntax error, while the second one is correct:: + + # Syntax error + [ x,y for x in seq1 for y in seq2] + # Correct + [ (x,y) for x in seq1 for y in seq2] + +The idea of list comprehensions originally comes from the functional programming +language Haskell (http://www.haskell.org). Greg Ewing argued most effectively +for adding them to Python and wrote the initial list comprehension patch, which +was then discussed for a seemingly endless time on the python-dev mailing list +and kept up-to-date by Skip Montanaro. + +.. % ====================================================================== + + +Augmented Assignment +==================== + +Augmented assignment operators, another long-requested feature, have been added +to Python 2.0. Augmented assignment operators include ``+=``, ``-=``, ``*=``, +and so forth. For example, the statement ``a += 2`` increments the value of the +variable ``a`` by 2, equivalent to the slightly lengthier ``a = a + 2``. + +The full list of supported assignment operators is ``+=``, ``-=``, ``*=``, +``/=``, ``%=``, ``**=``, ``&=``, ``|=``, ``^=``, ``>>=``, and ``<<=``. Python +classes can override the augmented assignment operators by defining methods +named :meth:`__iadd__`, :meth:`__isub__`, etc. For example, the following +:class:`Number` class stores a number and supports using += to create a new +instance with an incremented value. + +.. % The empty groups below prevent conversion to guillemets. + +:: + + class Number: + def __init__(self, value): + self.value = value + def __iadd__(self, increment): + return Number( self.value + increment) + + n = Number(5) + n += 3 + print n.value + +The :meth:`__iadd__` special method is called with the value of the increment, +and should return a new instance with an appropriately modified value; this +return value is bound as the new value of the variable on the left-hand side. 
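The rebinding described above is easier to see when a second name still refers to the original object. The following is a small sketch in Python 3 syntax (so ``print`` is a function); the ``alias`` variable is added here purely for illustration and is not part of the original example.

```python
class Number:
    def __init__(self, value):
        self.value = value

    def __iadd__(self, increment):
        # Return a new instance; the name on the left of += is rebound to it.
        return Number(self.value + increment)


n = Number(5)
alias = n      # second reference to the original instance
n += 3         # calls n.__iadd__(3) and rebinds n to the result

print(n.value)      # value held by the new instance
print(alias.value)  # the original instance is left unchanged
```

Because this ``__iadd__`` returns a new object instead of mutating ``self``, ``alias`` keeps pointing at the old instance. A class that wants true in-place behaviour would modify ``self`` and return it.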
+ +Augmented assignment operators were first introduced in the C programming +language, and most C-derived languages, such as :program:`awk`, C++, Java, Perl, +and PHP also support them. The augmented assignment patch was implemented by +Thomas Wouters. + +.. % ====================================================================== + + +String Methods +============== + +Until now string-manipulation functionality was in the :mod:`string` module, +which was usually a front-end for the :mod:`strop` module written in C. The +addition of Unicode posed a difficulty for the :mod:`strop` module, because the +functions would all need to be rewritten in order to accept either 8-bit or +Unicode strings. For functions such as :func:`string.replace`, which takes 3 +string arguments, that means eight possible permutations, and correspondingly +complicated code. + +Instead, Python 2.0 pushes the problem onto the string type, making string +manipulation functionality available through methods on both 8-bit strings and +Unicode strings. :: + + >>> 'andrew'.capitalize() + 'Andrew' + >>> 'hostname'.replace('os', 'linux') + 'hlinuxtname' + >>> 'moshe'.find('sh') + 2 + +One thing that hasn't changed, a noteworthy April Fools' joke notwithstanding, +is that Python strings are immutable. Thus, the string methods return new +strings, and do not modify the string on which they operate. + +The old :mod:`string` module is still around for backwards compatibility, but it +mostly acts as a front-end to the new string methods. + +Two methods which have no parallel in pre-2.0 versions, although they did exist +in JPython for quite some time, are :meth:`startswith` and :meth:`endswith`. +``s.startswith(t)`` is equivalent to ``s[:len(t)] == t``, while +``s.endswith(t)`` is equivalent to ``s[-len(t):] == t``. + +One other method which deserves special mention is :meth:`join`. 
The +:meth:`join` method of a string receives one parameter, a sequence of strings, +and is equivalent to the :func:`string.join` function from the old :mod:`string` +module, with the arguments reversed. In other words, ``s.join(seq)`` is +equivalent to the old ``string.join(seq, s)``. + +.. % ====================================================================== + + +Garbage Collection of Cycles +============================ + +The C implementation of Python uses reference counting to implement garbage +collection. Every Python object maintains a count of the number of references +pointing to itself, and adjusts the count as references are created or +destroyed. Once the reference count reaches zero, the object is no longer +accessible, since you need to have a reference to an object to access it, and if +the count is zero, no references exist any longer. + +Reference counting has some pleasant properties: it's easy to understand and +implement, and the resulting implementation is portable, fairly fast, and reacts +well with other libraries that implement their own memory handling schemes. The +major problem with reference counting is that it sometimes doesn't realise that +objects are no longer accessible, resulting in a memory leak. This happens when +there are cycles of references. + +Consider the simplest possible cycle, a class instance which has a reference to +itself:: + + instance = SomeClass() + instance.myself = instance + +After the above two lines of code have been executed, the reference count of +``instance`` is 2; one reference is from the variable named ``'instance'``, and +the other is from the ``myself`` attribute of the instance. + +If the next line of code is ``del instance``, what happens? The reference count +of ``instance`` is decreased by 1, so it has a reference count of 1; the +reference in the ``myself`` attribute still exists. Yet the instance is no +longer accessible through Python code, and it could be deleted. 
Several objects +can participate in a cycle if they have references to each other, causing all of +the objects to be leaked. + +Python 2.0 fixes this problem by periodically executing a cycle detection +algorithm which looks for inaccessible cycles and deletes the objects involved. +A new :mod:`gc` module provides functions to perform a garbage collection, +obtain debugging statistics, and tune the collector's parameters. + +Running the cycle detection algorithm takes some time, and therefore will result +in some additional overhead. It is hoped that after we've gotten experience +with the cycle collection from using 2.0, Python 2.1 will be able to minimize +the overhead with careful tuning. It's not yet obvious how much performance is +lost, because benchmarking this is tricky and depends crucially on how often the +program creates and destroys objects. The detection of cycles can be disabled +when Python is compiled, if you can't afford even a tiny speed penalty or +suspect that the cycle collection is buggy, by specifying the +:option:`--without-cycle-gc` switch when running the :program:`configure` +script. + +Several people tackled this problem and contributed to a solution. An early +implementation of the cycle detection approach was written by Toby Kelsey. The +current algorithm was suggested by Eric Tiedemann during a visit to CNRI, and +Guido van Rossum and Neil Schemenauer wrote two different implementations, which +were later integrated by Neil. Lots of other people offered suggestions along +the way; the March 2000 archives of the python-dev mailing list contain most of +the relevant discussion, especially in the threads titled "Reference cycle +collection for Python" and "Finalization again". + +.. % ====================================================================== + + +Other Core Changes +================== + +Various minor changes have been made to Python's syntax and built-in functions. 
+None of the changes are very far-reaching, but they're handy conveniences. + + +Minor Language Changes +---------------------- + +A new syntax makes it more convenient to call a given function with a tuple of +arguments and/or a dictionary of keyword arguments. In Python 1.5 and earlier, +you'd use the :func:`apply` built-in function: ``apply(f, args, kw)`` calls the +function :func:`f` with the argument tuple *args* and the keyword arguments in +the dictionary *kw*. :func:`apply` is the same in 2.0, but thanks to a patch +from Greg Ewing, ``f(*args, **kw)`` is a shorter and clearer way to achieve the +same effect. This syntax is symmetrical with the syntax for defining +functions:: + + def f(*args, **kw): + # args is a tuple of positional args, + # kw is a dictionary of keyword args + ... + +The :keyword:`print` statement can now have its output directed to a file-like +object by following the :keyword:`print` with ``>> file``, similar to the +redirection operator in Unix shells. Previously you'd either have to use the +:meth:`write` method of the file-like object, which lacks the convenience and +simplicity of :keyword:`print`, or you could assign a new value to +``sys.stdout`` and then restore the old value. For sending output to standard +error, it's much easier to write this:: + + print >> sys.stderr, "Warning: action field not supplied" + +Modules can now be renamed on importing them, using the syntax ``import module +as name`` or ``from module import name as othername``. The patch was submitted +by Thomas Wouters. + +A new format style is available when using the ``%`` operator; '%r' will insert +the :func:`repr` of its argument. This was also added from symmetry +considerations, this time for symmetry with the existing '%s' format style, +which inserts the :func:`str` of its argument. For example, ``'%r %s' % ('abc', +'abc')`` returns a string containing ``'abc' abc``. 
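The call syntax and the ``%r`` format style described above can be combined in one short sketch. It assumes a Python 3 interpreter, where both features still work unchanged; the names ``collect``, ``positional`` and ``keywords`` are hypothetical and chosen only for this illustration.

```python
def collect(*args, **kw):
    # args is a tuple of positional arguments,
    # kw is a dictionary of keyword arguments.
    return args, kw


positional = (1, 2)
keywords = {'sep': '-'}

# Equivalent to the older apply(collect, positional, keywords).
result = collect(*positional, **keywords)

# %r inserts repr() of its argument, %s inserts str().
formatted = '%r %s' % ('abc', 'abc')

print(result)
print(formatted)
```

The unpacking at the call site mirrors the packing in the ``def`` line, which is the symmetry the text refers to.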
+ +Previously there was no way to implement a class that overrode Python's built-in +:keyword:`in` operator and implemented a custom version. ``obj in seq`` returns +true if *obj* is present in the sequence *seq*; Python computes this by simply +trying every index of the sequence until either *obj* is found or an +:exc:`IndexError` is encountered. Moshe Zadka contributed a patch which adds a +:meth:`__contains__` magic method for providing a custom implementation for +:keyword:`in`. Additionally, new built-in objects written in C can define what +:keyword:`in` means for them via a new slot in the sequence protocol. + +Earlier versions of Python used a recursive algorithm for deleting objects. +Deeply nested data structures could cause the interpreter to fill up the C stack +and crash; Christian Tismer rewrote the deletion logic to fix this problem. On +a related note, comparing recursive objects recursed infinitely and crashed; +Jeremy Hylton rewrote the code to no longer crash, producing a useful result +instead. For example, after this code:: + + a = [] + b = [] + a.append(a) + b.append(b) + +The comparison ``a==b`` returns true, because the two recursive data structures +are isomorphic. See the thread "trashcan and PR#7" in the April 2000 archives of +the python-dev mailing list for the discussion leading up to this +implementation, and some useful relevant links. Note that comparisons can now +also raise exceptions. In earlier versions of Python, a comparison operation +such as ``cmp(a,b)`` would always produce an answer, even if a user-defined +:meth:`__cmp__` method encountered an error, since the resulting exception would +simply be silently swallowed. + +.. % Starting URL: +.. % http://www.python.org/pipermail/python-dev/2000-April/004834.html + +Work has been done on porting Python to 64-bit Windows on the Itanium processor, +mostly by Trent Mick of ActiveState. 
(Confusingly, ``sys.platform`` is still +``'win32'`` on Win64 because it seems that for ease of porting, MS Visual C++ +treats code as 32 bit on Itanium.) PythonWin also supports Windows CE; see the +Python CE page at http://starship.python.net/crew/mhammond/ce/ for more +information. + +Another new platform is Darwin/MacOS X; initial support for it is in Python 2.0. +Dynamic loading works, if you specify "configure --with-dyld --with-suffix=.x". +Consult the README in the Python source distribution for more instructions. + +An attempt has been made to alleviate one of Python's warts, the often-confusing +:exc:`NameError` exception when code refers to a local variable before the +variable has been assigned a value. For example, the following code raises an +exception on the :keyword:`print` statement in both 1.5.2 and 2.0; in 1.5.2 a +:exc:`NameError` exception is raised, while 2.0 raises a new +:exc:`UnboundLocalError` exception. :exc:`UnboundLocalError` is a subclass of +:exc:`NameError`, so any existing code that expects :exc:`NameError` to be +raised should still work. :: + + def f(): + print "i=",i + i = i + 1 + f() + +Two new exceptions, :exc:`TabError` and :exc:`IndentationError`, have been +introduced. They're both subclasses of :exc:`SyntaxError`, and are raised when +Python code is found to be improperly indented. + + +Changes to Built-in Functions +----------------------------- + +A new built-in, :func:`zip(seq1, seq2, ...)`, has been added. :func:`zip` +returns a list of tuples where each tuple contains the i-th element from each of +the argument sequences. The difference between :func:`zip` and ``map(None, +seq1, seq2)`` is that :func:`map` pads the sequences with ``None`` if the +sequences aren't all of the same length, while :func:`zip` truncates the +returned list to the length of the shortest argument sequence. + +The :func:`int` and :func:`long` functions now accept an optional "base" +parameter when the first argument is a string. 
``int('123', 10)`` returns 123, +while ``int('123', 16)`` returns 291. ``int(123, 16)`` raises a +:exc:`TypeError` exception with the message "can't convert non-string with +explicit base". + +A new variable holding more detailed version information has been added to the +:mod:`sys` module. ``sys.version_info`` is a tuple ``(major, minor, micro, +level, serial)``. For example, in a hypothetical 2.0.1beta1, ``sys.version_info`` +would be ``(2, 0, 1, 'beta', 1)``. *level* is a string such as ``"alpha"``, +``"beta"``, or ``"final"`` for a final release. + +Dictionaries have an odd new method, :meth:`setdefault(key, default)`, which +behaves similarly to the existing :meth:`get` method. However, if the key is +missing, :meth:`setdefault` both returns the value of *default* as :meth:`get` +would do, and also inserts it into the dictionary as the value for *key*. Thus, +the following lines of code:: + + if dict.has_key( key ): return dict[key] + else: + dict[key] = [] + return dict[key] + +can be reduced to a single ``return dict.setdefault(key, [])`` statement. + +The interpreter sets a maximum recursion depth in order to catch runaway +recursion before filling the C stack and causing a core dump or GPF. +Previously this limit was fixed when you compiled Python, but in 2.0 the maximum +recursion depth can be read and modified using :func:`sys.getrecursionlimit` and +:func:`sys.setrecursionlimit`. The default value is 1000, and a rough maximum +value for a given platform can be found by running a new script, +:file:`Misc/find_recursionlimit.py`. + +.. % ====================================================================== + + +Porting to 2.0 +============== + +New Python releases try hard to be compatible with previous releases, and the +record has been pretty good. However, some changes are considered useful +enough, usually because they fix initial design decisions that turned out to be +actively mistaken, that breaking backward compatibility can't always be avoided. 
+This section lists the changes in Python 2.0 that may cause old Python code to +break. + +The change which will probably break the most code is tightening up the +arguments accepted by some methods. Some methods would take multiple arguments +and treat them as a tuple, particularly various list methods such as +:meth:`.append` and :meth:`.insert`. In earlier versions of Python, if ``L`` is +a list, ``L.append( 1,2 )`` appends the tuple ``(1,2)`` to the list. In Python +2.0 this causes a :exc:`TypeError` exception to be raised, with the message: +'append requires exactly 1 argument; 2 given'. The fix is to simply add an +extra set of parentheses to pass both values as a tuple: ``L.append( (1,2) )``. + +The earlier versions of these methods were more forgiving because they used an +old function in Python's C interface to parse their arguments; 2.0 modernizes +them to use :func:`PyArg_ParseTuple`, the current argument parsing function, +which provides more helpful error messages and treats multi-argument calls as +errors. If you absolutely must use 2.0 but can't fix your code, you can edit +:file:`Objects/listobject.c` and define the preprocessor symbol +``NO_STRICT_LIST_APPEND`` to preserve the old behaviour; this isn't recommended. + +Some of the functions in the :mod:`socket` module are still forgiving in this +way. For example, :func:`socket.connect( ('hostname', 25) )` is the correct +form, passing a tuple representing an IP address, but :func:`socket.connect( +'hostname', 25 )` also works. :func:`socket.connect_ex` and :func:`socket.bind` +are similarly easy-going. 2.0alpha1 tightened these functions up, but because +the documentation actually used the erroneous multiple argument form, many +people wrote code which would break with the stricter checking. 
GvR backed out +the changes in the face of public reaction, so for the :mod:`socket` module, the +documentation was fixed and the multiple argument form is simply marked as +deprecated; it *will* be tightened up again in a future Python version. + +The ``\x`` escape in string literals now takes exactly 2 hex digits. Previously +it would consume all the hex digits following the 'x' and take the lowest 8 bits +of the result, so ``\x123456`` was equivalent to ``\x56``. + +The :exc:`AttributeError` and :exc:`NameError` exceptions have a more friendly +error message, whose text will be something like ``'Spam' instance has no +attribute 'eggs'`` or ``name 'eggs' is not defined``. Previously the error +message was just the missing attribute name ``eggs``, and code written to take +advantage of this fact will break in 2.0. + +Some work has been done to make integers and long integers a bit more +interchangeable. In 1.5.2, large-file support was added for Solaris, to allow +reading files larger than 2 GiB; this made the :meth:`tell` method of file +objects return a long integer instead of a regular integer. Some code would +subtract two file offsets and attempt to use the result to multiply a sequence +or slice a string, but this raised a :exc:`TypeError`. In 2.0, long integers +can be used to multiply or slice a sequence, and it'll behave as you'd +intuitively expect it to; ``3L * 'abc'`` produces 'abcabcabc', and +``(0,1,2,3)[2L:4L]`` produces (2,3). Long integers can also be used in various +contexts where previously only integers were accepted, such as in the +:meth:`seek` method of file objects, and in the formats supported by the ``%`` +operator (``%d``, ``%i``, ``%x``, etc.). For example, ``"%d" % 2L**64`` will +produce the string ``18446744073709551616``. + +The subtlest long integer change of all is that the :func:`str` of a long +integer no longer has a trailing 'L' character, though :func:`repr` still +includes it. 
The 'L' annoyed many people who wanted to print long integers that +looked just like regular integers, since they had to go out of their way to chop +off the character. This is no longer a problem in 2.0, but code which does +``str(longval)[:-1]`` and assumes the 'L' is there will now lose the final +digit. + +Taking the :func:`repr` of a float now uses a different formatting precision +than :func:`str`. :func:`repr` uses the ``%.17g`` format string for C's +:func:`sprintf`, while :func:`str` uses ``%.12g`` as before. The effect is that +:func:`repr` may occasionally show more decimal places than :func:`str`, for +certain numbers. For example, the number 8.1 can't be represented exactly in +binary, so ``repr(8.1)`` is ``'8.0999999999999996'``, while ``str(8.1)`` is +``'8.1'``. + +The ``-X`` command-line option, which turned all standard exceptions into +strings instead of classes, has been removed; the standard exceptions will now +always be classes. The :mod:`exceptions` module containing the standard +exceptions was translated from Python to a built-in C module, written by Barry +Warsaw and Fredrik Lundh. + +.. % Commented out for now -- I don't think anyone will care. +.. % The pattern and match objects provided by SRE are C types, not Python +.. % class instances as in 1.5. This means you can no longer inherit from +.. % \class{RegexObject} or \class{MatchObject}, but that shouldn't be much +.. % of a problem since no one should have been doing that in the first +.. % place. +.. % ====================================================================== + + +Extending/Embedding Changes +=========================== + +Some of the changes are under the covers, and will only be apparent to people +writing C extension modules or embedding a Python interpreter in a larger +application. If you aren't dealing with Python's C API, you can safely skip +this section. 
+ +The version number of the Python C API was incremented, so C extensions compiled +for 1.5.2 must be recompiled in order to work with 2.0. On Windows, it's not +possible for Python 2.0 to import a third party extension built for Python 1.5.x +due to how Windows DLLs work, so Python will raise an exception and the import +will fail. + +Users of Jim Fulton's ExtensionClass module will be pleased to find out that +hooks have been added so that ExtensionClasses are now supported by +:func:`isinstance` and :func:`issubclass`. This means you no longer have to +remember to write code such as ``if type(obj) == myExtensionClass``, but can use +the more natural ``if isinstance(obj, myExtensionClass)``. + +The :file:`Python/importdl.c` file, which was a mass of #ifdefs to support +dynamic loading on many different platforms, was cleaned up and reorganised by +Greg Stein. :file:`importdl.c` is now quite small, and platform-specific code +has been moved into a bunch of :file:`Python/dynload_\*.c` files. Another +cleanup: there were also a number of :file:`my\*.h` files in the Include/ +directory that held various portability hacks; they've been merged into a single +file, :file:`Include/pyport.h`. + +Vladimir Marangozov's long-awaited malloc restructuring was completed, to make +it easy to have the Python interpreter use a custom allocator instead of C's +standard :func:`malloc`. For documentation, read the comments in +:file:`Include/pymem.h` and :file:`Include/objimpl.h`. For the lengthy +discussions during which the interface was hammered out, see the Web archives of +the 'patches' and 'python-dev' lists at python.org. + +Recent versions of the GUSI development environment for MacOS support POSIX +threads. Therefore, Python's POSIX threading support now works on the +Macintosh. Threading support using the user-space GNU ``pth`` library was also +contributed. + +Threading support on Windows was enhanced, too. 
Windows supports thread locks +that use kernel objects only in case of contention; in the common case when +there's no contention, they use simpler functions which are an order of +magnitude faster. A threaded version of Python 1.5.2 on NT is twice as slow as +an unthreaded version; with the 2.0 changes, the difference is only 10%. These +improvements were contributed by Yakov Markovitch. + +Python 2.0's source now uses only ANSI C prototypes, so compiling Python now +requires an ANSI C compiler, and can no longer be done using a compiler that +only supports K&R C. + +Previously the Python virtual machine used 16-bit numbers in its bytecode, +limiting the size of source files. In particular, this affected the maximum +size of literal lists and dictionaries in Python source; occasionally people who +are generating Python code would run into this limit. A patch by Charles G. +Waldman raises the limit from ``2^16`` to ``2^32``. + +Three new convenience functions intended for adding constants to a module's +dictionary at module initialization time were added: :func:`PyModule_AddObject`, +:func:`PyModule_AddIntConstant`, and :func:`PyModule_AddStringConstant`. Each +of these functions takes a module object, a null-terminated C string containing +the name to be added, and a third argument for the value to be assigned to the +name. This third argument is, respectively, a Python object, a C long, or a C +string. + +A wrapper API was added for Unix-style signal handlers. :func:`PyOS_getsig` gets +a signal handler and :func:`PyOS_setsig` will set a new handler. + +.. % ====================================================================== + + +Distutils: Making Modules Easy to Install +========================================= + +Before Python 2.0, installing modules was a tedious affair -- there was no way +to figure out automatically where Python is installed, or what compiler options +to use for extension modules. 
Software authors had to go through an arduous +ritual of editing Makefiles and configuration files, which only really work on +Unix and leave Windows and MacOS unsupported. Python users faced wildly +differing installation instructions which varied between different extension +packages, which made administering a Python installation something of a chore. + +The SIG for distribution utilities, shepherded by Greg Ward, has created the +Distutils, a system to make package installation much easier. They form the +:mod:`distutils` package, a new part of Python's standard library. In the best +case, installing a Python module from source will require the same steps: first +you simply unpack the tarball or zip archive, and then run "``python +setup.py install``". The platform will be automatically detected, the compiler +will be recognized, C extension modules will be compiled, and the distribution +installed into the proper directory. Optional command-line arguments provide +more control over the installation process; the distutils package offers many +places to override defaults -- separating the build from the install, building +or installing in non-default directories, and more. + +In order to use the Distutils, you need to write a :file:`setup.py` script. 
For +the simple case, when the software contains only .py files, a minimal +:file:`setup.py` can be just a few lines long:: + + from distutils.core import setup + setup (name = "foo", version = "1.0", + py_modules = ["module1", "module2"]) + +The :file:`setup.py` file isn't much more complicated if the software consists +of a few packages:: + + from distutils.core import setup + setup (name = "foo", version = "1.0", + packages = ["package", "package.subpackage"]) + +A C extension can be the most complicated case; here's an example taken from +the PyXML package:: + + from distutils.core import setup, Extension + + expat_extension = Extension('xml.parsers.pyexpat', + define_macros = [('XML_NS', None)], + include_dirs = [ 'extensions/expat/xmltok', + 'extensions/expat/xmlparse' ], + sources = [ 'extensions/pyexpat.c', + 'extensions/expat/xmltok/xmltok.c', + 'extensions/expat/xmltok/xmlrole.c', + ] + ) + setup (name = "PyXML", version = "0.5.4", + ext_modules =[ expat_extension ] ) + +The Distutils can also take care of creating source and binary distributions. +The "sdist" command, run by "``python setup.py sdist``", builds a source +distribution such as :file:`foo-1.0.tar.gz`. Adding new commands isn't +difficult; "bdist_rpm" and "bdist_wininst" commands have already been +contributed to create an RPM distribution and a Windows installer for the +software, respectively. Commands to create other distribution formats such as +Debian packages and Solaris :file:`.pkg` files are in various stages of +development. + +All this is documented in a new manual, *Distributing Python Modules*, that +joins the basic set of Python documentation. + +.. % ====================================================================== + + +XML Modules +=========== + +Python 1.5.2 included a simple XML parser in the form of the :mod:`xmllib` +module, contributed by Sjoerd Mullender. 
Since 1.5.2's release, two different +interfaces for processing XML have become common: SAX2 (version 2 of the Simple +API for XML) provides an event-driven interface with some similarities to +:mod:`xmllib`, and the DOM (Document Object Model) provides a tree-based +interface, transforming an XML document into a tree of nodes that can be +traversed and modified. Python 2.0 includes a SAX2 interface and a stripped- +down DOM interface as part of the :mod:`xml` package. Here we will give a brief +overview of these new interfaces; consult the Python documentation or the source +code for complete details. The Python XML SIG is also working on improved +documentation. + + +SAX2 Support +------------ + +SAX defines an event-driven interface for parsing XML. To use SAX, you must +write a SAX handler class. Handler classes inherit from various classes +provided by SAX, and override various methods that will then be called by the +XML parser. For example, the :meth:`startElement` and :meth:`endElement` +methods are called for every starting and end tag encountered by the parser, the +:meth:`characters` method is called for every chunk of character data, and so +forth. + +The advantage of the event-driven approach is that the whole document doesn't +have to be resident in memory at any one time, which matters if you are +processing really huge documents. However, writing the SAX handler class can +get very complicated if you're trying to modify the document structure in some +elaborate way. 
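The streaming behaviour described above can be sketched with a handler that only keeps running totals. This is a minimal illustration, not code from the Python distribution: the ``TagCounter`` class and the ``sample`` document are hypothetical names invented here, and the sketch assumes the ``xml.sax.parseString`` convenience function of current Python versions.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Count start tags by name without keeping the document in memory."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once for each start tag as the parser reaches it
        self.counts[name] = self.counts.get(name, 0) + 1


# Hypothetical sample document, used only for illustration
sample = b"<PLAY><PERSONA>HAMLET</PERSONA><PERSONA>CLAUDIUS</PERSONA></PLAY>"

counter = TagCounter()
xml.sax.parseString(sample, counter)
print(counter.counts)
```

The handler's only state is a small dictionary, so memory use depends on the number of distinct tag names rather than on the size of the document.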
+ +For example, this little example program defines a handler that prints a message +for every starting and ending tag, and then parses the file :file:`hamlet.xml` +using it:: + + from xml import sax + + class SimpleHandler(sax.ContentHandler): + def startElement(self, name, attrs): + print 'Start of element:', name, attrs.keys() + + def endElement(self, name): + print 'End of element:', name + + # Create a parser object + parser = sax.make_parser() + + # Tell it what handler to use + handler = SimpleHandler() + parser.setContentHandler( handler ) + + # Parse a file! + parser.parse( 'hamlet.xml' ) + +For more information, consult the Python documentation, or the XML HOWTO at +http://pyxml.sourceforge.net/topics/howto/xml-howto.html. + + +DOM Support +----------- + +The Document Object Model is a tree-based representation for an XML document. A +top-level :class:`Document` instance is the root of the tree, and has a single +child which is the top-level :class:`Element` instance. This :class:`Element` +has children nodes representing character data and any sub-elements, which may +have further children of their own, and so forth. Using the DOM you can +traverse the resulting tree any way you like, access element and attribute +values, insert and delete nodes, and convert the tree back into XML. + +The DOM is useful for modifying XML documents, because you can create a DOM +tree, modify it by adding new nodes or rearranging subtrees, and then produce a +new XML document as output. You can also construct a DOM tree manually and +convert it to XML, which can be a more flexible way of producing XML output than +simply writing ``<tag1>``...\ ``</tag1>`` to a file. + +The DOM implementation included with Python lives in the :mod:`xml.dom.minidom` +module. It's a lightweight implementation of the Level 1 DOM with support for +XML namespaces. 
The :func:`parse` and :func:`parseString` convenience +functions are provided for generating a DOM tree:: + + from xml.dom import minidom + doc = minidom.parse('hamlet.xml') + +``doc`` is a :class:`Document` instance. :class:`Document`, like all the other +DOM classes such as :class:`Element` and :class:`Text`, is a subclass of the +:class:`Node` base class. All the nodes in a DOM tree therefore support certain +common methods, such as :meth:`toxml` which returns a string containing the XML +representation of the node and its children. Each class also has special +methods of its own; for example, :class:`Element` and :class:`Document` +instances have a method to find all child elements with a given tag name. +Continuing from the previous 2-line example:: + + perslist = doc.getElementsByTagName( 'PERSONA' ) + print perslist[0].toxml() + print perslist[1].toxml() + +For the *Hamlet* XML file, the above few lines output:: + + <PERSONA>CLAUDIUS, king of Denmark. </PERSONA> + <PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA> + +The root element of the document is available as ``doc.documentElement``, and +its children can be easily modified by deleting, adding, or removing nodes:: + + root = doc.documentElement + + # Remove the first child + root.removeChild( root.childNodes[0] ) + + # Move the new first child to the end + root.appendChild( root.childNodes[0] ) + + # Insert the new first child (originally, + # the third child) before the 20th child. + root.insertBefore( root.childNodes[0], root.childNodes[20] ) + +Again, I will refer you to the Python documentation for a complete listing of +the different :class:`Node` classes and their various methods. + + +Relationship to PyXML +--------------------- + +The XML Special Interest Group has been working on XML-related Python code for a +while. Its code distribution, called PyXML, is available from the SIG's Web +pages at http://www.python.org/sigs/xml-sig/. 
The PyXML distribution also used +the package name ``xml``. If you've written programs that used PyXML, you're +probably wondering about its compatibility with the 2.0 :mod:`xml` package. + +The answer is that Python 2.0's :mod:`xml` package isn't compatible with PyXML, +but can be made compatible by installing a recent version of PyXML. Many +applications can get by with the XML support that is included with Python 2.0, +but more complicated applications will require that the full PyXML package +be installed. When installed, PyXML versions 0.6.0 or greater will replace the +:mod:`xml` package shipped with Python, and will be a strict superset of the +standard package, adding a bunch of additional features. Some of the additional +features in PyXML include: + +* 4DOM, a full DOM implementation from FourThought, Inc. + +* The xmlproc validating parser, written by Lars Marius Garshol. + +* The :mod:`sgmlop` parser accelerator module, written by Fredrik Lundh. + +.. % ====================================================================== + + +Module changes +============== + +Lots of improvements and bugfixes were made to Python's extensive standard +library; some of the affected modules include :mod:`readline`, +:mod:`ConfigParser`, :mod:`cgi`, :mod:`calendar`, :mod:`posix`, +:mod:`xmllib`, :mod:`aifc`, :mod:`chunk`, :mod:`wave`, :mod:`random`, :mod:`shelve`, +and :mod:`nntplib`. Consult the CVS logs for the exact patch-by-patch details. + +Brian Gallew contributed OpenSSL support for the :mod:`socket` module. OpenSSL +is an implementation of the Secure Socket Layer, which encrypts the data being +sent over a socket. When compiling Python, you can edit :file:`Modules/Setup` +to include SSL support, which adds an additional function to the :mod:`socket` +module: :func:`socket.ssl(socket, keyfile, certfile)`, which takes a socket +object and returns an SSL socket. 
The :mod:`httplib` and :mod:`urllib` modules +were also changed to support "https://" URLs, though no one has implemented FTP +or SMTP over SSL. + +The :mod:`httplib` module has been rewritten by Greg Stein to support HTTP/1.1. +Backward compatibility with the 1.5 version of :mod:`httplib` is provided, +though using HTTP/1.1 features such as pipelining will require rewriting code to +use a different set of interfaces. + +The :mod:`Tkinter` module now supports Tcl/Tk version 8.1, 8.2, or 8.3, and +support for the older 7.x versions has been dropped. The Tkinter module now +supports displaying Unicode strings in Tk widgets. Also, Fredrik Lundh +contributed an optimization which makes operations like ``create_line`` and +``create_polygon`` much faster, especially when using lots of coordinates. + +The :mod:`curses` module has been greatly extended, starting from Oliver +Andrich's enhanced version, to provide many additional functions from ncurses +and SYSV curses, such as colour, alternative character set support, pads, and +mouse support. This means the module is no longer compatible with operating +systems that only have BSD curses, but there don't seem to be any currently +maintained OSes that fall into this category. + +As mentioned in the earlier discussion of 2.0's Unicode support, the underlying +implementation of the regular expressions provided by the :mod:`re` module has +been changed. SRE, a new regular expression engine written by Fredrik Lundh and +partially funded by Hewlett Packard, supports matching against both 8-bit +strings and Unicode strings. + +.. % ====================================================================== + + +New modules +=========== + +A number of new modules were added. We'll simply list them with brief +descriptions; consult the 2.0 documentation for the details of a particular +module. + +* :mod:`atexit`: For registering functions to be called before the Python + interpreter exits. 
Code that currently sets ``sys.exitfunc`` directly should be + changed to use the :mod:`atexit` module instead, importing :mod:`atexit` and + calling :func:`atexit.register` with the function to be called on exit. + (Contributed by Skip Montanaro.) + +* :mod:`codecs`, :mod:`encodings`, :mod:`unicodedata`: Added as part of the new + Unicode support. + +* :mod:`filecmp`: Supersedes the old :mod:`cmp`, :mod:`cmpcache` and + :mod:`dircmp` modules, which have now become deprecated. (Contributed by Gordon + MacMillan and Moshe Zadka.) + +* :mod:`gettext`: This module provides internationalization (I18N) and + localization (L10N) support for Python programs by providing an interface to the + GNU gettext message catalog library. (Integrated by Barry Warsaw, from separate + contributions by Martin von Löwis, Peter Funk, and James Henstridge.) + +* :mod:`linuxaudiodev`: Support for the :file:`/dev/audio` device on Linux, a + twin to the existing :mod:`sunaudiodev` module. (Contributed by Peter Bosch, + with fixes by Jeremy Hylton.) + +* :mod:`mmap`: An interface to memory-mapped files on both Windows and Unix. A + file's contents can be mapped directly into memory, at which point it behaves + like a mutable string, so its contents can be read and modified. They can even + be passed to functions that expect ordinary strings, such as the :mod:`re` + module. (Contributed by Sam Rushing, with some extensions by A.M. Kuchling.) + +* :mod:`pyexpat`: An interface to the Expat XML parser. (Contributed by Paul + Prescod.) + +* :mod:`robotparser`: Parse a :file:`robots.txt` file, which is used for writing + Web spiders that politely avoid certain areas of a Web site. The parser accepts + the contents of a :file:`robots.txt` file, builds a set of rules from it, and + can then answer questions about the fetchability of a given URL. (Contributed + by Skip Montanaro.) + +* :mod:`tabnanny`: A module/script to check Python source code for ambiguous + indentation. 
(Contributed by Tim Peters.) + +* :mod:`UserString`: A base class useful for deriving objects that behave like + strings. + +* :mod:`webbrowser`: A module that provides a platform independent way to launch + a web browser on a specific URL. For each platform, various browsers are tried + in a specific order. The user can alter which browser is launched by setting the + *BROWSER* environment variable. (Originally inspired by Eric S. Raymond's patch + to :mod:`urllib` which added similar functionality, but the final module comes + from code originally implemented by Fred Drake as + :file:`Tools/idle/BrowserControl.py`, and adapted for the standard library by + Fred.) + +* :mod:`_winreg`: An interface to the Windows registry. :mod:`_winreg` is an + adaptation of functions that have been part of PythonWin since 1995, but has now + been added to the core distribution, and enhanced to support Unicode. + :mod:`_winreg` was written by Bill Tutt and Mark Hammond. + +* :mod:`zipfile`: A module for reading and writing ZIP-format archives. These + are archives produced by :program:`PKZIP` on DOS/Windows or :program:`zip` on + Unix, not to be confused with :program:`gzip`\ -format files (which are + supported by the :mod:`gzip` module) (Contributed by James C. Ahlstrom.) + +* :mod:`imputil`: A module that provides a simpler way for writing customised + import hooks, in comparison to the existing :mod:`ihooks` module. (Implemented + by Greg Stein, with much discussion on python-dev along the way.) + +.. % ====================================================================== + + +IDLE Improvements +================= + +IDLE is the official Python cross-platform IDE, written using Tkinter. Python +2.0 includes IDLE 0.6, which adds a number of new features and improvements. A +partial list: + +* UI improvements and optimizations, especially in the area of syntax + highlighting and auto-indentation. 
+ +* The class browser now shows more information, such as the top level functions + in a module. + +* Tab width is now a user settable option. When opening an existing Python file, + IDLE automatically detects the indentation conventions, and adapts. + +* There is now support for calling browsers on various platforms, used to open + the Python documentation in a browser. + +* IDLE now has a command line, which is largely similar to the vanilla Python + interpreter. + +* Call tips were added in many places. + +* IDLE can now be installed as a package. + +* In the editor window, there is now a line/column bar at the bottom. + +* Three new keystroke commands: Check module (Alt-F5), Import module (F5) and + Run script (Ctrl-F5). + +.. % ====================================================================== + + +Deleted and Deprecated Modules +============================== + +A few modules have been dropped because they're obsolete, or because there are +now better ways to do the same thing. The :mod:`stdwin` module is gone; it was +for a platform-independent windowing toolkit that's no longer developed. + +A number of modules have been moved to the :file:`lib-old` subdirectory: +:mod:`cmp`, :mod:`cmpcache`, :mod:`dircmp`, :mod:`dump`, :mod:`find`, +:mod:`grep`, :mod:`packmail`, :mod:`poly`, :mod:`util`, :mod:`whatsound`, +:mod:`zmod`. If you have code which relies on a module that's been moved to +:file:`lib-old`, you can simply add that directory to ``sys.path`` to get them +back, but you're encouraged to update any code that uses these modules. + + +Acknowledgements +================ + +The authors would like to thank the following people for offering suggestions on +various drafts of this article: David Bolen, Mark Hammond, Gregg Hauser, Jeremy +Hylton, Fredrik Lundh, Detlef Lannert, Aahz Maruch, Skip Montanaro, Vladimir +Marangozov, Tobias Polzin, Guido van Rossum, Neil Schemenauer, and Russ Schmidt. 
+ diff --git a/Doc/whatsnew/2.1.rst b/Doc/whatsnew/2.1.rst new file mode 100644 index 0000000..2be11ba --- /dev/null +++ b/Doc/whatsnew/2.1.rst @@ -0,0 +1,794 @@ +**************************** + What's New in Python 2.1 +**************************** + +:Author: A.M. Kuchling + +.. |release| replace:: 1.01 + +.. % $Id: whatsnew21.tex 51211 2006-08-11 14:57:12Z thomas.wouters $ + + +Introduction +============ + +This article explains the new features in Python 2.1. While there aren't as +many changes in 2.1 as there were in Python 2.0, there are still some pleasant +surprises in store. 2.1 is the first release to be steered through the use of +Python Enhancement Proposals, or PEPs, so most of the sizable changes have +accompanying PEPs that provide more complete documentation and a design +rationale for the change. This article doesn't attempt to document the new +features completely, but simply provides an overview of the new features for +Python programmers. Refer to the Python 2.1 documentation, or to the specific +PEP, for more details about any new feature that particularly interests you. + +One recent goal of the Python development team has been to accelerate the pace +of new releases, with a new release coming every 6 to 9 months. 2.1 is the first +release to come out at this faster pace, with the first alpha appearing in +January, 3 months after the final version of 2.0 was released. + +The final release of Python 2.1 was made on April 17, 2001. + +.. % ====================================================================== + + +PEP 227: Nested Scopes +====================== + +The largest change in Python 2.1 is to Python's scoping rules. In Python 2.0, +at any given time there are at most three namespaces used to look up variable +names: local, module-level, and the built-in namespace. This often surprised +people because it didn't match their intuitive expectations. For example, a +nested recursive function definition doesn't work:: + + def f(): + ... 
+ def g(value): + ... + return g(value-1) + 1 + ... + +The function :func:`g` will always raise a :exc:`NameError` exception, because +the binding of the name ``g`` isn't in either its local namespace or in the +module-level namespace. This isn't much of a problem in practice (how often do +you recursively define interior functions like this?), but this also made using +the :keyword:`lambda` statement clumsier, and this was a problem in practice. +In code which uses :keyword:`lambda` you can often find local variables being +copied by passing them as the default values of arguments. :: + + def find(self, name): + "Return list of any entries equal to 'name'" + L = filter(lambda x, name=name: x == name, + self.list_attribute) + return L + +The readability of Python code written in a strongly functional style suffers +greatly as a result. + +The most significant change to Python 2.1 is that static scoping has been added +to the language to fix this problem. As a first effect, the ``name=name`` +default argument is now unnecessary in the above example. Put simply, when a +given variable name is not assigned a value within a function (by an assignment, +or the :keyword:`def`, :keyword:`class`, or :keyword:`import` statements), +references to the variable will be looked up in the local namespace of the +enclosing scope. A more detailed explanation of the rules, and a dissection of +the implementation, can be found in the PEP. + +This change may cause some compatibility problems for code where the same +variable name is used both at the module level and as a local variable within a +function that contains further function definitions. This seems rather unlikely +though, since such code would have been pretty confusing to read in the first +place. + +One side effect of the change is that the ``from module import *`` and +:keyword:`exec` statements have been made illegal inside a function scope under +certain conditions. 
The Python reference manual has said all along that ``from +module import *`` is only legal at the top level of a module, but the CPython +interpreter has never enforced this before. As part of the implementation of +nested scopes, the compiler which turns Python source into bytecodes has to +generate different code to access variables in a containing scope. ``from +module import *`` and :keyword:`exec` make it impossible for the compiler to +figure this out, because they add names to the local namespace that are +unknowable at compile time. Therefore, if a function contains function +definitions or :keyword:`lambda` expressions with free variables, the compiler +will flag this by raising a :exc:`SyntaxError` exception. + +To make the preceding explanation a bit clearer, here's an example:: + + x = 1 + def f(): + # The next line is a syntax error + exec 'x=2' + def g(): + return x + +Line 4 containing the :keyword:`exec` statement is a syntax error, since +:keyword:`exec` would define a new local variable named ``x`` whose value should +be accessed by :func:`g`. + +This shouldn't be much of a limitation, since :keyword:`exec` is rarely used in +most Python code (and when it is used, it's often a sign of a poor design +anyway). + +Compatibility concerns have led to nested scopes being introduced gradually; in +Python 2.1, they aren't enabled by default, but can be turned on within a module +by using a future statement as described in PEP 236. (See the following section +for further discussion of PEP 236.) In Python 2.2, nested scopes will become +the default and there will be no way to turn them off, but users will have had +all of 2.1's lifetime to fix any breakage resulting from their introduction. + + +.. seealso:: + + :pep:`227` - Statically Nested Scopes + Written and implemented by Jeremy Hylton. + +.. 
% ====================================================================== + + +PEP 236: __future__ Directives +============================== + +The reaction to nested scopes was widespread concern about the dangers of +breaking code with the 2.1 release, and it was strong enough to make the +Pythoneers take a more conservative approach. This approach consists of +introducing a convention for enabling optional functionality in release N that +will become compulsory in release N+1. + +The syntax uses a ``from...import`` statement using the reserved module name +:mod:`__future__`. Nested scopes can be enabled by the following statement:: + + from __future__ import nested_scopes + +While it looks like a normal :keyword:`import` statement, it's not; there are +strict rules on where such a future statement can be put. They can only be at +the top of a module, and must precede any Python code or regular +:keyword:`import` statements. This is because such statements can affect how +the Python bytecode compiler parses code and generates bytecode, so they must +precede any statement that will result in bytecodes being produced. + + +.. seealso:: + + :pep:`236` - Back to the :mod:`__future__` + Written by Tim Peters, and primarily implemented by Jeremy Hylton. + +.. % ====================================================================== + + +PEP 207: Rich Comparisons +========================= + +In earlier versions, Python's support for implementing comparisons on user- +defined classes and extension types was quite simple. Classes could implement a +:meth:`__cmp__` method that was given two instances of a class, and could only +return 0 if they were equal or +1 or -1 if they weren't; the method couldn't +raise an exception or return anything other than a Boolean value. 
Users of +Numeric Python often found this model too weak and restrictive, because in the +number-crunching programs that Numeric Python is used for, it would be more +useful to be able to perform elementwise comparisons of two matrices, returning +a matrix containing the results of a given comparison for each element. If the +two matrices are of different sizes, then the comparison has to be able to raise an +exception to signal the error. + +In Python 2.1, rich comparisons were added in order to support this need. +Python classes can now individually overload each of the ``<``, ``<=``, ``>``, +``>=``, ``==``, and ``!=`` operations. The new magic method names are: + ++-----------+----------------+ +| Operation | Method name | ++===========+================+ +| ``<`` | :meth:`__lt__` | ++-----------+----------------+ +| ``<=`` | :meth:`__le__` | ++-----------+----------------+ +| ``>`` | :meth:`__gt__` | ++-----------+----------------+ +| ``>=`` | :meth:`__ge__` | ++-----------+----------------+ +| ``==`` | :meth:`__eq__` | ++-----------+----------------+ +| ``!=`` | :meth:`__ne__` | ++-----------+----------------+ + +(The magic methods are named after the corresponding Fortran operators ``.LT.``, +``.LE.``, &c. Numeric programmers are almost certainly quite familiar with +these names and will find them easy to remember.) + +Each of these magic methods is of the form ``method(self, other)``, where +``self`` will be the object on the left-hand side of the operator, while +``other`` will be the object on the right-hand side. For example, the +expression ``A < B`` will cause ``A.__lt__(B)`` to be called. + +Each of these magic methods can return anything at all: a Boolean, a matrix, a +list, or any other Python object. Alternatively they can raise an exception if +the comparison is impossible, inconsistent, or otherwise meaningless. 
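As a rough illustration of the rules above, here is a minimal sketch of a class whose comparisons are elementwise. The ``Vector`` class and its ``_check`` helper are hypothetical names invented for this example and are not part of Python or of Numeric Python; the sketch only assumes that ``A < B`` dispatches to ``A.__lt__(B)`` as described.

```python
class Vector:
    """Hypothetical fixed-length vector; not part of any real library."""

    def __init__(self, *items):
        self.items = list(items)

    def _check(self, other):
        # A comparison between vectors of different sizes is meaningless,
        # so signal it with an exception instead of returning a value.
        if len(self.items) != len(other.items):
            raise ValueError("vectors differ in length")

    def __lt__(self, other):
        # Return a list of per-element results rather than a single Boolean.
        self._check(other)
        return [a < b for a, b in zip(self.items, other.items)]

    def __eq__(self, other):
        self._check(other)
        return [a == b for a, b in zip(self.items, other.items)]


A = Vector(1, 5, 3)
B = Vector(2, 4, 3)
less = A < B     # dispatches to A.__lt__(B)
equal = A == B   # dispatches to A.__eq__(B)
```

Here ``less`` and ``equal`` are ordinary lists with one entry per element, and comparing vectors of different lengths raises :exc:`ValueError`. A real numeric package would more likely return its own array type instead of a list.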
+ +The built-in :func:`cmp(A,B)` function can use the rich comparison machinery, +and now accepts an optional argument specifying which comparison operation to +use; this is given as one of the strings ``"<"``, ``"<="``, ``">"``, ``">="``, +``"=="``, or ``"!="``. If called without the optional third argument, +:func:`cmp` will only return -1, 0, or +1 as in previous versions of Python; +otherwise it will call the appropriate method and can return any Python object. + +There are also corresponding changes of interest to C programmers; there's a new +slot ``tp_richcompare`` in type objects and an API for performing a given rich +comparison. I won't cover the C API here, but will refer you to PEP 207, or to +2.1's C API documentation, for the full list of related functions. + + +.. seealso:: + + :pep:`207` - Rich Comparisons + Written by Guido van Rossum, heavily based on earlier work by David Ascher, and + implemented by Guido van Rossum. + +.. % ====================================================================== + + +PEP 230: Warning Framework +========================== + +Over its 10 years of existence, Python has accumulated a certain number of +obsolete modules and features along the way. It's difficult to know when a +feature is safe to remove, since there's no way of knowing how much code uses it +--- perhaps no programs depend on the feature, or perhaps many do. To enable +removing old features in a more structured way, a warning framework was added. +When the Python developers want to get rid of a feature, it will first trigger a +warning in the next version of Python. The following Python version can then +drop the feature, and users will have had a full release cycle to remove uses of +the old feature. + +Python 2.1 adds the warning framework to be used in this scheme. It adds a +:mod:`warnings` module that provides functions to issue warnings, and to filter +out warnings that you don't want to be displayed. 
Third-party modules can also +use this framework to deprecate old features that they no longer wish to +support. + +For example, in Python 2.1 the :mod:`regex` module is deprecated, so importing +it causes a warning to be printed:: + + >>> import regex + __main__:1: DeprecationWarning: the regex module + is deprecated; please use the re module + >>> + +Warnings can be issued by calling the :func:`warnings.warn` function:: + + warnings.warn("feature X no longer supported") + +The first parameter is the warning message; an additional optional parameter +can be used to specify a particular warning category. + +Filters can be added to disable certain warnings; a regular expression pattern +can be applied to the message or to the module name in order to suppress a +warning. For example, you may have a program that uses the :mod:`regex` module +and not want to spare the time to convert it to use the :mod:`re` module right +now. The warning can be suppressed by calling :: + + import warnings + warnings.filterwarnings(action = 'ignore', + message='.*regex module is deprecated', + category=DeprecationWarning, + module = '__main__') + +This adds a filter that will apply only to warnings of the class +:class:`DeprecationWarning` triggered in the :mod:`__main__` module, and applies +a regular expression to only match the message about the :mod:`regex` module +being deprecated, and will cause such warnings to be ignored. Warnings can also +be printed only once, printed every time the offending code is executed, or +turned into exceptions that will cause the program to stop (unless the +exceptions are caught in the usual way, of course). + +Functions were also added to Python's C API for issuing warnings; refer to PEP +230 or to Python's API documentation for the details. + + +.. seealso:: + + :pep:`5` - Guidelines for Language Evolution + Written by Paul Prescod, to specify procedures to be followed when removing old + features from Python. 
The policy described in this PEP hasn't been officially + adopted, but the eventual policy probably won't be too different from Prescod's + proposal. + + :pep:`230` - Warning Framework + Written and implemented by Guido van Rossum. + +.. % ====================================================================== + + +PEP 229: New Build System +========================= + +When compiling Python, the user had to go in and edit the :file:`Modules/Setup` +file in order to enable various additional modules; the default set is +relatively small and limited to modules that compile on most Unix platforms. +This means that on Unix platforms with many more features, most notably Linux, +Python installations often don't contain all the useful modules they could. + +Python 2.0 added the Distutils, a set of modules for distributing and installing +extensions. In Python 2.1, the Distutils are used to compile much of the +standard library of extension modules, autodetecting which ones are supported on +the current machine. It's hoped that this will make Python installations easier +and more featureful. + +Instead of having to edit the :file:`Modules/Setup` file in order to enable +modules, a :file:`setup.py` script in the top directory of the Python source +distribution is run at build time, and attempts to discover which modules can be +enabled by examining the modules and header files on the system. If a module is +configured in :file:`Modules/Setup`, the :file:`setup.py` script won't attempt +to compile that module and will defer to the :file:`Modules/Setup` file's +contents. This provides a way to specify any strange command-line flags or +libraries that are required for a specific platform. 
+ +In another far-reaching change to the build mechanism, Neil Schemenauer +restructured things so Python now uses a single makefile that isn't recursive, +instead of makefiles in the top directory and in each of the :file:`Python/`, +:file:`Parser/`, :file:`Objects/`, and :file:`Modules/` subdirectories. This +makes building Python faster and also makes hacking the Makefiles clearer and +simpler. + + +.. seealso:: + + :pep:`229` - Using Distutils to Build Python + Written and implemented by A.M. Kuchling. + +.. % ====================================================================== + + +PEP 205: Weak References +======================== + +Weak references, available through the :mod:`weakref` module, are a minor but +useful new data type in the Python programmer's toolbox. + +Storing a reference to an object (say, in a dictionary or a list) has the side +effect of keeping that object alive forever. There are a few specific cases +where this behaviour is undesirable, object caches being the most common one, +and another being circular references in data structures such as trees. + +For example, consider a memoizing function that caches the results of another +function :func:`f(x)` by storing the function's argument and its result in a +dictionary:: + + _cache = {} + def memoize(x): + if _cache.has_key(x): + return _cache[x] + + retval = f(x) + + # Cache the returned object + _cache[x] = retval + + return retval + +This version works for simple things such as integers, but it has a side effect; +the ``_cache`` dictionary holds a reference to the return values, so they'll +never be deallocated until the Python process exits and cleans up. This isn't +very noticeable for integers, but if :func:`f` returns an object, or a data +structure that takes up a lot of memory, this can be a problem. + +Weak references provide a way to implement a cache that won't keep objects alive +beyond their time. 
If an object is only accessible through weak references, the +object will be deallocated and the weak references will now indicate that the +object they referred to no longer exists. A weak reference to an object *obj* is +created by calling ``wr = weakref.ref(obj)``. The object being referred to is +returned by calling the weak reference as if it were a function: ``wr()``. It +will return the referenced object, or ``None`` if the object no longer exists. + +This makes it possible to write a :func:`memoize` function whose cache doesn't +keep objects alive, by storing weak references in the cache. :: + + _cache = {} + def memoize(x): + if _cache.has_key(x): + obj = _cache[x]() + # If weak reference object still exists, + # return it + if obj is not None: return obj + + retval = f(x) + + # Cache a weak reference + _cache[x] = weakref.ref(retval) + + return retval + +The :mod:`weakref` module also allows creating proxy objects which behave like +weak references --- an object referenced only by proxy objects is deallocated --- +but instead of requiring an explicit call to retrieve the object, the proxy +transparently forwards all operations to the object as long as the object still +exists. If the object is deallocated, attempting to use a proxy will cause a +:exc:`weakref.ReferenceError` exception to be raised. :: + + proxy = weakref.proxy(obj) + proxy.attr # Equivalent to obj.attr + proxy.meth() # Equivalent to obj.meth() + del obj + proxy.attr # raises weakref.ReferenceError + + +.. seealso:: + + :pep:`205` - Weak References + Written and implemented by Fred L. Drake, Jr. + +.. % ====================================================================== + + +PEP 232: Function Attributes +============================ + +In Python 2.1, functions can now have arbitrary information attached to them. 
+People were often using docstrings to hold information about functions and +methods, because the ``__doc__`` attribute was the only way of attaching any +information to a function. For example, in the Zope Web application server, +functions are marked as safe for public access by having a docstring, and in +John Aycock's SPARK parsing framework, docstrings hold parts of the BNF grammar +to be parsed. This overloading is unfortunate, since docstrings are really +intended to hold a function's documentation; for example, it means you can't +properly document functions intended for private use in Zope. + +Arbitrary attributes can now be set and retrieved on functions using the regular +Python syntax:: + + def f(): pass + + f.publish = 1 + f.secure = 1 + f.grammar = "A ::= B (C D)*" + +The dictionary containing attributes can be accessed as the function's +:attr:`__dict__`. Unlike the :attr:`__dict__` attribute of class instances, in +functions you can actually assign a new dictionary to :attr:`__dict__`, though +the new value is restricted to a regular Python dictionary; you *can't* be +tricky and set it to a :class:`UserDict` instance, or any other random object +that behaves like a mapping. + + +.. seealso:: + + :pep:`232` - Function Attributes + Written and implemented by Barry Warsaw. + +.. % ====================================================================== + + +PEP 235: Importing Modules on Case-Insensitive Platforms +======================================================== + +Some operating systems have filesystems that are case-insensitive, MacOS and +Windows being the primary examples; on these systems, it's impossible to +distinguish the filenames ``FILE.PY`` and ``file.py``, even though they do store +the file's name in its original case (they're case-preserving, too). + +In Python 2.1, the :keyword:`import` statement will work to simulate case- +sensitivity on case-insensitive platforms. 
Python will now search for the first +case-sensitive match by default, raising an :exc:`ImportError` if no such file +is found, so ``import file`` will not import a module named ``FILE.PY``. Case- +insensitive matching can be requested by setting the :envvar:`PYTHONCASEOK` +environment variable before starting the Python interpreter. + +.. % ====================================================================== + + +PEP 217: Interactive Display Hook +================================= + +When using the Python interpreter interactively, the output of commands is +displayed using the built-in :func:`repr` function. In Python 2.1, the variable +:func:`sys.displayhook` can be set to a callable object which will be called +instead of :func:`repr`. For example, you can set it to a special pretty- +printing function:: + + >>> # Create a recursive data structure + ... L = [1,2,3] + >>> L.append(L) + >>> L # Show Python's default output + [1, 2, 3, [...]] + >>> # Use pprint.pprint() as the display function + ... import sys, pprint + >>> sys.displayhook = pprint.pprint + >>> L + [1, 2, 3, <Recursion on list with id=135143996>] + >>> + + +.. seealso:: + + :pep:`217` - Display Hook for Interactive Use + Written and implemented by Moshe Zadka. + +.. % ====================================================================== + + +PEP 208: New Coercion Model +=========================== + +How numeric coercion is done at the C level was significantly modified. This +will only affect the authors of C extensions to Python, allowing them more +flexibility in writing extension types that support numeric operations. + +Extension types can now set the type flag ``Py_TPFLAGS_CHECKTYPES`` in their +``PyTypeObject`` structure to indicate that they support the new coercion model. 
+In such extension types, the numeric slot functions can no longer assume that +they'll be passed two arguments of the same type; instead they may be passed two +arguments of differing types, and can then perform their own internal coercion. +If the slot function is passed a type it can't handle, it can indicate the +failure by returning a reference to the ``Py_NotImplemented`` singleton value. +The numeric functions of the other type will then be tried, and perhaps they can +handle the operation; if the other type also returns ``Py_NotImplemented``, then +a :exc:`TypeError` will be raised. Numeric methods written in Python can also +return ``Py_NotImplemented``, causing the interpreter to act as if the method +did not exist (perhaps raising a :exc:`TypeError`, perhaps trying another +object's numeric methods). + + +.. seealso:: + + :pep:`208` - Reworking the Coercion Model + Written and implemented by Neil Schemenauer, heavily based upon earlier work by + Marc-André Lemburg. Read this to understand the fine points of how numeric + operations will now be processed at the C level. + +.. % ====================================================================== + + +PEP 241: Metadata in Python Packages +==================================== + +A common complaint from Python users is that there's no single catalog of all +the Python modules in existence. T. Middleton's Vaults of Parnassus at +http://www.vex.net/parnassus/ are the largest catalog of Python modules, but +registering software at the Vaults is optional, and many people don't bother. + +As a first small step toward fixing the problem, Python software packaged using +the Distutils :command:`sdist` command will include a file named +:file:`PKG-INFO` containing information about the package such as its name, +version, and author (metadata, in cataloguing terminology). PEP 241 contains +the full list of fields that can be present in the :file:`PKG-INFO` file. 
As +people begin to package their software using Python 2.1, more and more packages +will include metadata, making it possible to build automated cataloguing systems +and experiment with them. With the resulting experience, perhaps it'll be possible +to design a really good catalog and then build support for it into Python 2.2. +For example, the Distutils :command:`sdist` and :command:`bdist_\*` commands +could support an :option:`upload` option that would automatically upload your +package to a catalog server. + +You can start creating packages containing :file:`PKG-INFO` even if you're not +using Python 2.1, since a new release of the Distutils will be made for users of +earlier Python versions. Version 1.0.2 of the Distutils includes the changes +described in PEP 241, as well as various bugfixes and enhancements. It will be +available from the Distutils SIG at http://www.python.org/sigs/distutils-sig/. + + +.. seealso:: + + :pep:`241` - Metadata for Python Software Packages + Written and implemented by A.M. Kuchling. + + :pep:`243` - Module Repository Upload Mechanism + Written by Sean Reifschneider, this draft PEP describes a proposed mechanism for + uploading Python packages to a central server. + +.. % ====================================================================== + + +New and Improved Modules +======================== + +* Ka-Ping Yee contributed two new modules: :mod:`inspect.py`, a module for + getting information about live Python code, and :mod:`pydoc.py`, a module for + interactively converting docstrings to HTML or text. As a bonus, + :file:`Tools/scripts/pydoc`, which is now automatically installed, uses + :mod:`pydoc.py` to display documentation given a Python module, package, or + class name. For example, ``pydoc xml.dom`` displays the following:: + + Python Library Documentation: package xml.dom in xml + + NAME + xml.dom - W3C Document Object Model implementation for Python. 
+ + FILE + /usr/local/lib/python2.1/xml/dom/__init__.pyc + + DESCRIPTION + The Python mapping of the Document Object Model is documented in the + Python Library Reference in the section on the xml.dom package. + + This package contains the following modules: + ... + + :file:`pydoc` also includes a Tk-based interactive help browser. :file:`pydoc` + quickly becomes addictive; try it out! + +* Two different modules for unit testing were added to the standard library. + The :mod:`doctest` module, contributed by Tim Peters, provides a testing + framework based on running embedded examples in docstrings and comparing the + results against the expected output. PyUnit, contributed by Steve Purcell, is a + unit testing framework inspired by JUnit, which was in turn an adaptation of + Kent Beck's Smalltalk testing framework. See http://pyunit.sourceforge.net/ for + more information about PyUnit. + +* The :mod:`difflib` module contains a class, :class:`SequenceMatcher`, which + compares two sequences and computes the changes required to transform one + sequence into the other. For example, this module can be used to write a tool + similar to the Unix :program:`diff` program, and in fact the sample program + :file:`Tools/scripts/ndiff.py` demonstrates how to write such a script. + +* :mod:`curses.panel`, a wrapper for the panel library, part of ncurses and of + SYSV curses, was contributed by Thomas Gellekum. The panel library provides + windows with the additional feature of depth. Windows can be moved higher or + lower in the depth ordering, and the panel library figures out where panels + overlap and which sections are visible. + +* The PyXML package has gone through a few releases since Python 2.0, and Python + 2.1 includes an updated version of the :mod:`xml` package. 
Some of the + noteworthy changes include support for Expat 1.2 and later versions, the ability + for Expat parsers to handle files in any encoding supported by Python, and + various bugfixes for SAX, DOM, and the :mod:`minidom` module. + +* Ping also contributed another hook for handling uncaught exceptions. + :func:`sys.excepthook` can be set to a callable object. When an exception isn't + caught by any :keyword:`try`...\ :keyword:`except` blocks, the exception will be + passed to :func:`sys.excepthook`, which can then do whatever it likes. At the + Ninth Python Conference, Ping demonstrated an application for this hook: + printing an extended traceback that not only lists the stack frames, but also + lists the function arguments and the local variables for each frame. + +* Various functions in the :mod:`time` module, such as :func:`asctime` and + :func:`localtime`, require a floating point argument containing the time in + seconds since the epoch. The most common use of these functions is to work with + the current time, so the floating point argument has been made optional; when a + value isn't provided, the current time will be used. For example, log file + entries usually need a string containing the current time; in Python 2.1, + ``time.asctime()`` can be used, instead of the lengthier + ``time.asctime(time.localtime(time.time()))`` that was previously required. + + This change was proposed and implemented by Thomas Wouters. + +* The :mod:`ftplib` module now defaults to retrieving files in passive mode, + because passive mode is more likely to work from behind a firewall. This + request came from the Debian bug tracking system, since other Debian packages + use :mod:`ftplib` to retrieve files and then don't work from behind a firewall. 
+ It's deemed unlikely that this will cause problems for anyone, because Netscape + defaults to passive mode and few people complain, but if passive mode is + unsuitable for your application or network setup, call :meth:`set_pasv(0)` on + FTP objects to disable passive mode. + +* Support for raw socket access has been added to the :mod:`socket` module, + contributed by Grant Edwards. + +* The :mod:`pstats` module now contains a simple interactive statistics browser + for displaying timing profiles for Python programs, invoked when the module is + run as a script. Contributed by Eric S. Raymond. + +* A new implementation-dependent function, :func:`sys._getframe([depth])`, has + been added to return a given frame object from the current call stack. + :func:`sys._getframe` returns the frame at the top of the call stack; if the + optional integer argument *depth* is supplied, the function returns the frame + that is *depth* calls below the top of the stack. For example, + ``sys._getframe(1)`` returns the caller's frame object. + + This function is only present in CPython, not in Jython or the .NET + implementation. Use it for debugging, and resist the temptation to put it into + production code. + +.. % ====================================================================== + + +Other Changes and Fixes +======================= + +There were relatively few smaller changes made in Python 2.1 due to the shorter +release cycle. A search through the CVS change logs turns up 117 patches +applied, and 136 bugs fixed; both figures are likely to be underestimates. Some +of the more notable changes are: + +* A specialized object allocator is now optionally available, that should be + faster than the system :func:`malloc` and have less memory overhead. The + allocator uses C's :func:`malloc` function to get large pools of memory, and + then fulfills smaller memory requests from these pools. 
It can be enabled by + providing the :option:`--with-pymalloc` option to the :program:`configure` + script; see :file:`Objects/obmalloc.c` for the implementation details. + + Authors of C extension modules should test their code with the object allocator + enabled, because some incorrect code may break, causing core dumps at runtime. + There are a bunch of memory allocation functions in Python's C API that have + previously been just aliases for the C library's :func:`malloc` and + :func:`free`, meaning that if you accidentally called mismatched functions, the + error wouldn't be noticeable. When the object allocator is enabled, these + functions aren't aliases of :func:`malloc` and :func:`free` any more, and + calling the wrong function to free memory will get you a core dump. For + example, if memory was allocated using :func:`PyMem_New`, it has to be freed + using :func:`PyMem_Del`, not :func:`free`. A few modules included with Python + fell afoul of this and had to be fixed; doubtless there are more third-party + modules that will have the same problem. + + The object allocator was contributed by Vladimir Marangozov. + +* The speed of line-oriented file I/O has been improved because people often + complain about its lack of speed, and because it's often been used as a naïve + benchmark. The :meth:`readline` method of file objects has therefore been + rewritten to be much faster. The exact amount of the speedup will vary from + platform to platform depending on how slow the C library's :func:`getc` was, but + is around 66%, and potentially much faster on some particular operating systems. + Tim Peters did much of the benchmarking and coding for this change, motivated by + a discussion in comp.lang.python. + + A new module and method for file objects was also added, contributed by Jeff + Epler. The new method, :meth:`xreadlines`, is similar to the existing + :func:`xrange` built-in. 
:func:`xreadlines` returns an opaque sequence object + that only supports being iterated over, reading a line on every iteration but + not reading the entire file into memory as the existing :meth:`readlines` method + does. You'd use it like this:: + + for line in sys.stdin.xreadlines(): + # ... do something for each line ... + ... + + For a fuller discussion of the line I/O changes, see the python-dev summary for + January 1-15, 2001 at http://www.python.org/dev/summary/2001-01-1.html. + +* A new method, :meth:`popitem`, was added to dictionaries to enable + destructively iterating through the contents of a dictionary; this can be faster + for large dictionaries because there's no need to construct a list containing + all the keys or values. ``D.popitem()`` removes a random ``(key, value)`` pair + from the dictionary ``D`` and returns it as a 2-tuple. This was implemented + mostly by Tim Peters and Guido van Rossum, after a suggestion and preliminary + patch by Moshe Zadka. + +* Modules can now control which names are imported when ``from module import *`` + is used, by defining an ``__all__`` attribute containing a list of names that + will be imported. One common complaint is that if the module imports other + modules such as :mod:`sys` or :mod:`string`, ``from module import *`` will add + them to the importing module's namespace. To fix this, simply list the public + names in ``__all__``:: + + # List public names + __all__ = ['Database', 'open'] + + A stricter version of this patch was first suggested and implemented by Ben + Wolfson, but after some python-dev discussion, a weaker final version was + checked in. + +* Applying :func:`repr` to strings previously used octal escapes for + non-printable characters; for example, a newline was ``'\012'``. This was a + vestigial trace of Python's C ancestry, but today octal is of very little + practical use. 
Ka-Ping Yee suggested using hex escapes instead of octal ones, + and using the ``\n``, ``\t``, ``\r`` escapes for the appropriate characters, + and implemented this new formatting. + +* Syntax errors detected at compile-time can now raise exceptions containing the + filename and line number of the error, a pleasant side effect of the compiler + reorganization done by Jeremy Hylton. + +* C extensions which import other modules have been changed to use + :func:`PyImport_ImportModule`, which means that they will use any import hooks + that have been installed. This is also encouraged for third-party extensions + that need to import some other module from C code. + +* The size of the Unicode character database was shrunk by another 340K thanks + to Fredrik Lundh. + +* Some new ports were contributed: MacOS X (by Steven Majewski), Cygwin (by + Jason Tishler); RISCOS (by Dietmar Schwertberger); Unixware 7 (by Billy G. + Allie). + +And there's the usual list of minor bugfixes, minor memory leaks, docstring +edits, and other tweaks, too lengthy to be worth itemizing; see the CVS logs for +the full details if you want them. + +.. % ====================================================================== + + +Acknowledgements +================ + +The author would like to thank the following people for offering suggestions on +various drafts of this article: Graeme Cross, David Goodger, Jay Graves, Michael +Hudson, Marc-André Lemburg, Fredrik Lundh, Neil Schemenauer, Thomas Wouters. + diff --git a/Doc/whatsnew/2.2.rst b/Doc/whatsnew/2.2.rst new file mode 100644 index 0000000..6a7e0e8 --- /dev/null +++ b/Doc/whatsnew/2.2.rst @@ -0,0 +1,1269 @@ +**************************** + What's New in Python 2.2 +**************************** + +:Author: A.M. Kuchling + +.. |release| replace:: 1.02 + +.. % $Id: whatsnew22.tex 37315 2004-09-10 19:33:00Z akuchling $ + + +Introduction +============ + +This article explains the new features in Python 2.2.2, released on October 14, +2002. 
Python 2.2.2 is a bugfix release of Python 2.2, originally released on +December 21, 2001. + +Python 2.2 can be thought of as the "cleanup release". There are some features +such as generators and iterators that are completely new, but most of the +changes, significant and far-reaching though they may be, are aimed at cleaning +up irregularities and dark corners of the language design. + +This article doesn't attempt to provide a complete specification of the new +features, but instead provides a convenient overview. For full details, you +should refer to the documentation for Python 2.2, such as the `Python Library +Reference <http://www.python.org/doc/2.2/lib/lib.html>`_ and the `Python +Reference Manual <http://www.python.org/doc/2.2/ref/ref.html>`_. If you want to +understand the complete implementation and design rationale for a change, refer +to the PEP for a particular new feature. + + +.. seealso:: + + http://www.unixreview.com/documents/s=1356/urm0109h/0109h.htm + "What's So Special About Python 2.2?" is also about the new 2.2 features, and + was written by Cameron Laird and Kathryn Soraiz. + +.. % ====================================================================== + + +PEPs 252 and 253: Type and Class Changes +======================================== + +The largest and most far-reaching changes in Python 2.2 are to Python's model of +objects and classes. The changes should be backward compatible, so it's likely +that your code will continue to run unchanged, but the changes provide some +amazing new capabilities. Before beginning this, the longest and most +complicated section of this article, I'll provide an overview of the changes and +offer some comments. + +A long time ago I wrote a Web page (http://www.amk.ca/python/writing/warts.html) +listing flaws in Python's design. One of the most significant flaws was that +it's impossible to subclass Python types implemented in C. 
In particular, it's +not possible to subclass built-in types, so you can't just subclass, say, lists +in order to add a single useful method to them. The :mod:`UserList` module +provides a class that supports all of the methods of lists and that can be +subclassed further, but there's lots of C code that expects a regular Python +list and won't accept a :class:`UserList` instance. + +Python 2.2 fixes this, and in the process adds some exciting new capabilities. +A brief summary: + +* You can subclass built-in types such as lists and even integers, and your + subclasses should work in every place that requires the original type. + +* It's now possible to define static and class methods, in addition to the + instance methods available in previous versions of Python. + +* It's also possible to automatically call methods on accessing or setting an + instance attribute by using a new mechanism called :dfn:`properties`. Many uses + of :meth:`__getattr__` can be rewritten to use properties instead, making the + resulting code simpler and faster. As a small side benefit, attributes can now + have docstrings, too. + +* The list of legal attributes for an instance can be limited to a particular + set using :dfn:`slots`, making it possible to safeguard against typos and + perhaps make more optimizations possible in future versions of Python. + +Some users have voiced concern about all these changes. Sure, they say, the new +features are neat and lend themselves to all sorts of tricks that weren't +possible in previous versions of Python, but they also make the language more +complicated. Some people have said that they've always recommended Python for +its simplicity, and feel that its simplicity is being lost. + +Personally, I think there's no need to worry. Many of the new features are +quite esoteric, and you can write a lot of Python code without ever needing to be +aware of them. 
Writing a simple class is no more difficult than it ever was, so +you don't need to bother learning or teaching them unless they're actually +needed. Some very complicated tasks that were previously only possible from C +will now be possible in pure Python, and to my mind that's all for the better. + +I'm not going to attempt to cover every single corner case and small change that +were required to make the new features work. Instead this section will paint +only the broad strokes. See section :ref:`sect-rellinks`, "Related Links", for +further sources of information about Python 2.2's new object model. + + +Old and New Classes +------------------- + +First, you should know that Python 2.2 really has two kinds of classes: classic +or old-style classes, and new-style classes. The old-style class model is +exactly the same as the class model in earlier versions of Python. All the new +features described in this section apply only to new-style classes. This +divergence isn't intended to last forever; eventually old-style classes will be +dropped, possibly in Python 3.0. + +So how do you define a new-style class? You do it by subclassing an existing +new-style class. Most of Python's built-in types, such as integers, lists, +dictionaries, and even files, are new-style classes now. A new-style class +named :class:`object`, the base class for all built-in types, has also been +added so if no built-in type is suitable, you can just subclass +:class:`object`:: + + class C(object): + def __init__ (self): + ... + ... + +This means that :keyword:`class` statements that don't have any base classes are +always classic classes in Python 2.2. (Actually you can also change this by +setting a module-level variable named :attr:`__metaclass__` --- see :pep:`253` +for the details --- but it's easier to just subclass :keyword:`object`.) + +The type objects for the built-in types are available as built-ins, named using +a clever trick. 
Python has always had built-in functions named :func:`int`, +:func:`float`, and :func:`str`. In 2.2, they aren't functions any more, but +type objects that behave as factories when called. :: + + >>> int + <type 'int'> + >>> int('123') + 123 + +To make the set of types complete, new type objects such as :func:`dict` and +:func:`file` have been added. Here's a more interesting example, adding a +:meth:`lock` method to file objects:: + + class LockableFile(file): + def lock (self, operation, length=0, start=0, whence=0): + import fcntl + return fcntl.lockf(self.fileno(), operation, + length, start, whence) + +The now-obsolete :mod:`posixfile` module contained a class that emulated all of +a file object's methods and also added a :meth:`lock` method, but this class +couldn't be passed to internal functions that expected a built-in file, +something which is possible with our new :class:`LockableFile`. + + +Descriptors +----------- + +In previous versions of Python, there was no consistent way to discover what +attributes and methods were supported by an object. There were some informal +conventions, such as defining :attr:`__members__` and :attr:`__methods__` +attributes that were lists of names, but often the author of an extension type +or a class wouldn't bother to define them. You could fall back on inspecting +the :attr:`__dict__` of an object, but when class inheritance or an arbitrary +:meth:`__getattr__` hook were in use this could still be inaccurate. + +The one big idea underlying the new class model is that an API for describing +the attributes of an object using :dfn:`descriptors` has been formalized. +Descriptors specify the value of an attribute, stating whether it's a method or +a field. With the descriptor API, static methods and class methods become +possible, as well as more exotic constructs. + +Attribute descriptors are objects that live inside class objects, and have a few +attributes of their own: + +* :attr:`__name__` is the attribute's name. 
+ +* :attr:`__doc__` is the attribute's docstring. + +* :meth:`__get__(object)` is a method that retrieves the attribute value from + *object*. + +* :meth:`__set__(object, value)` sets the attribute on *object* to *value*. + +* :meth:`__delete__(object, value)` deletes the *value* attribute of *object*. + +For example, when you write ``obj.x``, the steps that Python actually performs +are:: + + descriptor = obj.__class__.x + descriptor.__get__(obj) + +For methods, :meth:`descriptor.__get__` returns a temporary object that's +callable, and wraps up the instance and the method to be called on it. This is +also why static methods and class methods are now possible; they have +descriptors that wrap up just the method, or the method and the class. As a +brief explanation of these new kinds of methods, static methods aren't passed +the instance, and therefore resemble regular functions. Class methods are +passed the class of the object, but not the object itself. Static and class +methods are defined like this:: + + class C(object): + def f(arg1, arg2): + ... + f = staticmethod(f) + + def g(cls, arg1, arg2): + ... + g = classmethod(g) + +The :func:`staticmethod` function takes the function :func:`f`, and returns it +wrapped up in a descriptor so it can be stored in the class object. You might +expect there to be special syntax for creating such methods (``def static f``, +``defstatic f()``, or something like that) but no such syntax has been defined +yet; that's been left for future versions of Python. + +More new features, such as slots and properties, are also implemented as new +kinds of descriptors, and it's not difficult to write a descriptor class that +does something novel. For example, it would be possible to write a descriptor +class that made it possible to write Eiffel-style preconditions and +postconditions for a method. 
A class that used this feature might be defined +like this:: + + from eiffel import eiffelmethod + + class C(object): + def f(self, arg1, arg2): + # The actual function + ... + def pre_f(self): + # Check preconditions + ... + def post_f(self): + # Check postconditions + ... + + f = eiffelmethod(f, pre_f, post_f) + +Note that a person using the new :func:`eiffelmethod` doesn't have to understand +anything about descriptors. This is why I think the new features don't increase +the basic complexity of the language. There will be a few wizards who need to +know about it in order to write :func:`eiffelmethod` or the ZODB or whatever, +but most users will just write code on top of the resulting libraries and ignore +the implementation details. + + +Multiple Inheritance: The Diamond Rule +-------------------------------------- + +Multiple inheritance has also been made more useful through changing the rules +under which names are resolved. Consider this set of classes (diagram taken +from :pep:`253` by Guido van Rossum):: + + class A: + ^ ^ def save(self): ... + / \ + / \ + / \ + / \ + class B class C: + ^ ^ def save(self): ... + \ / + \ / + \ / + \ / + class D + +The lookup rule for classic classes is simple but not very smart; the base +classes are searched depth-first, going from left to right. A reference to +:meth:`D.save` will search the classes :class:`D`, :class:`B`, and then +:class:`A`, where :meth:`save` would be found and returned. :meth:`C.save` +would never be found at all. This is bad, because if :class:`C`'s :meth:`save` +method is saving some internal state specific to :class:`C`, not calling it will +result in that state never getting saved. + +New-style classes follow a different algorithm that's a bit more complicated to +explain, but does the right thing in this situation. (Note that Python 2.3 +changes this algorithm to one that produces the same results in most cases, but +produces more useful results for really complicated inheritance graphs.) 
+ +#. List all the base classes, following the classic lookup rule and include a + class multiple times if it's visited repeatedly. In the above example, the list + of visited classes is [:class:`D`, :class:`B`, :class:`A`, :class:`C`, + :class:`A`]. + +#. Scan the list for duplicated classes. If any are found, remove all but one + occurrence, leaving the *last* one in the list. In the above example, the list + becomes [:class:`D`, :class:`B`, :class:`C`, :class:`A`] after dropping + duplicates. + +Following this rule, referring to :meth:`D.save` will return :meth:`C.save`, +which is the behaviour we're after. This lookup rule is the same as the one +followed by Common Lisp. A new built-in function, :func:`super`, provides a way +to get at a class's superclasses without having to reimplement Python's +algorithm. The most commonly used form will be :func:`super(class, obj)`, which +returns a bound superclass object (not the actual class object). This form +will be used in methods to call a method in the superclass; for example, +:class:`D`'s :meth:`save` method would look like this:: + + class D (B,C): + def save (self): + # Call superclass .save() + super(D, self).save() + # Save D's private information here + ... + +:func:`super` can also return unbound superclass objects when called as +:func:`super(class)` or :func:`super(class1, class2)`, but this probably won't +often be useful. + + +Attribute Access +---------------- + +A fair number of sophisticated Python classes define hooks for attribute access +using :meth:`__getattr__`; most commonly this is done for convenience, to make +code more readable by automatically mapping an attribute access such as +``obj.parent`` into a method call such as ``obj.get_parent``. Python 2.2 adds +some new ways of controlling attribute access. + +First, :meth:`__getattr__(attr_name)` is still supported by new-style classes, +and nothing about it has changed. 
As before, it will be called when an attempt +is made to access ``obj.foo`` and no attribute named ``foo`` is found in the +instance's dictionary. + +New-style classes also support a new method, +:meth:`__getattribute__(attr_name)`. The difference between the two methods is +that :meth:`__getattribute__` is *always* called whenever any attribute is +accessed, while the old :meth:`__getattr__` is only called if ``foo`` isn't +found in the instance's dictionary. + +However, Python 2.2's support for :dfn:`properties` will often be a simpler way +to trap attribute references. Writing a :meth:`__getattr__` method is +complicated because to avoid recursion you can't use regular attribute accesses +inside them, and instead have to mess around with the contents of +:attr:`__dict__`. :meth:`__getattr__` methods also end up being called by Python +when it checks for other methods such as :meth:`__repr__` or :meth:`__coerce__`, +and so have to be written with this in mind. Finally, calling a function on +every attribute access results in a sizable performance loss. + +:class:`property` is a new built-in type that packages up three functions that +get, set, or delete an attribute, and a docstring. For example, if you want to +define a :attr:`size` attribute that's computed, but also settable, you could +write:: + + class C(object): + def get_size (self): + result = ... computation ... + return result + def set_size (self, size): + ... compute something based on the size + and set internal state appropriately ... + + # Define a property. The 'delete this attribute' + # method is defined as None, so the attribute + # can't be deleted. + size = property(get_size, set_size, + None, + "Storage size of this instance") + +That is certainly clearer and easier to write than a pair of +:meth:`__getattr__`/:meth:`__setattr__` methods that check for the :attr:`size` +attribute and handle it specially while retrieving all other attributes from the +instance's :attr:`__dict__`. 
Accesses to :attr:`size` are also the only ones +which have to perform the work of calling a function, so references to other +attributes run at their usual speed. + +Finally, it's possible to constrain the list of attributes that can be +referenced on an object using the new :attr:`__slots__` class attribute. Python +objects are usually very dynamic; at any time it's possible to define a new +attribute on an instance by just doing ``obj.new_attr=1``. A new-style class +can define a class attribute named :attr:`__slots__` to limit the legal +attributes to a particular set of names. An example will make this clear:: + + >>> class C(object): + ... __slots__ = ('template', 'name') + ... + >>> obj = C() + >>> print obj.template + None + >>> obj.template = 'Test' + >>> print obj.template + Test + >>> obj.newattr = None + Traceback (most recent call last): + File "<stdin>", line 1, in ? + AttributeError: 'C' object has no attribute 'newattr' + +Note how you get an :exc:`AttributeError` on the attempt to assign to an +attribute not listed in :attr:`__slots__`. + + +.. _sect-rellinks: + +Related Links +------------- + +This section has just been a quick overview of the new features, giving enough +of an explanation to start you programming, but many details have been +simplified or ignored. Where should you go to get a more complete picture? + +http://www.python.org/2.2/descrintro.html is a lengthy tutorial introduction to +the descriptor features, written by Guido van Rossum. If my description has +whetted your appetite, go read this tutorial next, because it goes into much +more detail about the new features while still remaining quite easy to read. + +Next, there are two relevant PEPs, :pep:`252` and :pep:`253`. :pep:`252` is +titled "Making Types Look More Like Classes", and covers the descriptor API. +:pep:`253` is titled "Subtyping Built-in Types", and describes the changes to +type objects that make it possible to subtype built-in objects. 
:pep:`253` is +the more complicated PEP of the two, and at a few points the necessary +explanations of types and meta-types may cause your head to explode. Both PEPs +were written and implemented by Guido van Rossum, with substantial assistance +from the rest of the Zope Corp. team. + +Finally, there's the ultimate authority: the source code. Most of the machinery +for the type handling is in :file:`Objects/typeobject.c`, but you should only +resort to it after all other avenues have been exhausted, including posting a +question to python-list or python-dev. + +.. % ====================================================================== + + +PEP 234: Iterators +================== + +Another significant addition to 2.2 is an iteration interface at both the C and +Python levels. Objects can define how they can be looped over by callers. + +In Python versions up to 2.1, the usual way to make ``for item in obj`` work is +to define a :meth:`__getitem__` method that looks something like this:: + + def __getitem__(self, index): + return <next item> + +:meth:`__getitem__` is more properly used to define an indexing operation on an +object so that you can write ``obj[5]`` to retrieve the sixth element. It's a +bit misleading when you're using this only to support :keyword:`for` loops. +Consider some file-like object that wants to be looped over; the *index* +parameter is essentially meaningless, as the class probably assumes that a +series of :meth:`__getitem__` calls will be made with *index* incrementing by +one each time. In other words, the presence of the :meth:`__getitem__` method +doesn't mean that using ``file[5]`` to randomly access the sixth element will +work, though it really should. + +In Python 2.2, iteration can be implemented separately, and :meth:`__getitem__` +methods can be limited to classes that really do support random access. The +basic idea of iterators is simple. 
A new built-in function, :func:`iter(obj)` +or ``iter(C, sentinel)``, is used to get an iterator. :func:`iter(obj)` returns +an iterator for the object *obj*, while ``iter(C, sentinel)`` returns an +iterator that will invoke the callable object *C* until it returns *sentinel* to +signal that the iterator is done. + +Python classes can define an :meth:`__iter__` method, which should create and +return a new iterator for the object; if the object is its own iterator, this +method can just return ``self``. In particular, iterators will usually be their +own iterators. Extension types implemented in C can implement a :attr:`tp_iter` +function in order to return an iterator, and extension types that want to behave +as iterators can define a :attr:`tp_iternext` function. + +So, after all this, what do iterators actually do? They have one required +method, :meth:`next`, which takes no arguments and returns the next value. When +there are no more values to be returned, calling :meth:`next` should raise the +:exc:`StopIteration` exception. :: + + >>> L = [1,2,3] + >>> i = iter(L) + >>> print i + <iterator object at 0x8116870> + >>> i.next() + 1 + >>> i.next() + 2 + >>> i.next() + 3 + >>> i.next() + Traceback (most recent call last): + File "<stdin>", line 1, in ? + StopIteration + >>> + +In 2.2, Python's :keyword:`for` statement no longer expects a sequence; it +expects something for which :func:`iter` will return an iterator. For backward +compatibility and convenience, an iterator is automatically constructed for +sequences that don't implement :meth:`__iter__` or a :attr:`tp_iter` slot, so +``for i in [1,2,3]`` will still work. Wherever the Python interpreter loops +over a sequence, it's been changed to use the iterator protocol. This means you +can do things like this:: + + >>> L = [1,2,3] + >>> i = iter(L) + >>> a,b,c = i + >>> a,b,c + (1, 2, 3) + +Iterator support has been added to some of Python's basic types. 
Calling +:func:`iter` on a dictionary will return an iterator which loops over its keys:: + + >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6, + ... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12} + >>> for key in m: print key, m[key] + ... + Mar 3 + Feb 2 + Aug 8 + Sep 9 + May 5 + Jun 6 + Jul 7 + Jan 1 + Apr 4 + Nov 11 + Dec 12 + Oct 10 + +That's just the default behaviour. If you want to iterate over keys, values, or +key/value pairs, you can explicitly call the :meth:`iterkeys`, +:meth:`itervalues`, or :meth:`iteritems` methods to get an appropriate iterator. +In a minor related change, the :keyword:`in` operator now works on dictionaries, +so ``key in dict`` is now equivalent to ``dict.has_key(key)``. + +Files also provide an iterator, which calls the :meth:`readline` method until +there are no more lines in the file. This means you can now read each line of a +file using code like this:: + + for line in file: + # do something for each line + ... + +Note that you can only go forward in an iterator; there's no way to get the +previous element, reset the iterator, or make a copy of it. An iterator object +could provide such additional capabilities, but the iterator protocol only +requires a :meth:`next` method. + + +.. seealso:: + + :pep:`234` - Iterators + Written by Ka-Ping Yee and GvR; implemented by the Python Labs crew, mostly by + GvR and Tim Peters. + +.. % ====================================================================== + + +PEP 255: Simple Generators +========================== + +Generators are another new feature, one that interacts with the introduction of +iterators. + +You're doubtless familiar with how function calls work in Python or C. When you +call a function, it gets a private namespace where its local variables are +created. When the function reaches a :keyword:`return` statement, the local +variables are destroyed and the resulting value is returned to the caller. 
A +later call to the same function will get a fresh new set of local variables. +But, what if the local variables weren't thrown away on exiting a function? +What if you could later resume the function where it left off? This is what +generators provide; they can be thought of as resumable functions. + +Here's the simplest example of a generator function:: + + def generate_ints(N): + for i in range(N): + yield i + +A new keyword, :keyword:`yield`, was introduced for generators. Any function +containing a :keyword:`yield` statement is a generator function; this is +detected by Python's bytecode compiler which compiles the function specially as +a result. Because a new keyword was introduced, generators must be explicitly +enabled in a module by including a ``from __future__ import generators`` +statement near the top of the module's source code. In Python 2.3 this +statement will become unnecessary. + +When you call a generator function, it doesn't return a single value; instead it +returns a generator object that supports the iterator protocol. On executing +the :keyword:`yield` statement, the generator outputs the value of ``i``, +similar to a :keyword:`return` statement. The big difference between +:keyword:`yield` and a :keyword:`return` statement is that on reaching a +:keyword:`yield` the generator's state of execution is suspended and local +variables are preserved. On the next call to the generator's ``next()`` method, +the function will resume executing immediately after the :keyword:`yield` +statement. (For complicated reasons, the :keyword:`yield` statement isn't +allowed inside the :keyword:`try` block of a :keyword:`try`...\ +:keyword:`finally` statement; read :pep:`255` for a full explanation of the +interaction between :keyword:`yield` and exceptions.) 
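The suspend-and-resume behaviour described above can be sketched with a small generator whose local variable survives between requests. This is only an illustration under stated assumptions: ``countdown`` is a hypothetical name that is not part of the original article, and the sketch uses the built-in ``next(gen)`` spelling from later Python versions rather than the ``gen.next()`` method and ``from __future__ import generators`` statement that Python 2.2 itself requires.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1 (hypothetical example generator)."""
    remaining = start          # local state, preserved while suspended
    while remaining > 0:
        yield remaining        # execution is suspended here
        remaining -= 1         # ...and resumes here on the next request


gen = countdown(3)             # calling the function only creates the generator
first = next(gen)              # runs the body up to the first yield
second = next(gen)             # resumes after the yield, loops, yields again
rest = list(gen)               # drains the remaining values until exhaustion
```

Nothing in the function body runs until the first value is requested, and ``remaining`` keeps its value across requests, which is the "resumable function" idea in miniature.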
+ +Here's a sample usage of the :func:`generate_ints` generator:: + + >>> gen = generate_ints(3) + >>> gen + <generator object at 0x8117f90> + >>> gen.next() + 0 + >>> gen.next() + 1 + >>> gen.next() + 2 + >>> gen.next() + Traceback (most recent call last): + File "<stdin>", line 1, in ? + File "<stdin>", line 2, in generate_ints + StopIteration + +You could equally write ``for i in generate_ints(5)``, or ``a,b,c = +generate_ints(3)``. + +Inside a generator function, the :keyword:`return` statement can only be used +without a value, and signals the end of the procession of values; afterwards the +generator cannot return any further values. :keyword:`return` with a value, such +as ``return 5``, is a syntax error inside a generator function. The end of the +generator's results can also be indicated by raising :exc:`StopIteration` +manually, or by just letting the flow of execution fall off the bottom of the +function. + +You could achieve the effect of generators manually by writing your own class +and storing all the local variables of the generator as instance variables. For +example, returning a list of integers could be done by setting ``self.count`` to +0, and having the :meth:`next` method increment ``self.count`` and return it. +However, for a moderately complicated generator, writing a corresponding class +would be much messier. :file:`Lib/test/test_generators.py` contains a number of +more interesting examples. The simplest one implements an in-order traversal of +a tree using generators recursively. :: + + # A recursive generator that generates Tree leaves in in-order. 
+ def inorder(t): + if t: + for x in inorder(t.left): + yield x + yield t.label + for x in inorder(t.right): + yield x + +Two other examples in :file:`Lib/test/test_generators.py` produce solutions for +the N-Queens problem (placing N queens on an NxN chessboard so that no +queen threatens another) and the Knight's Tour (a route that takes a knight to +every square of an NxN chessboard without visiting any square twice). + +The idea of generators comes from other programming languages, especially Icon +(http://www.cs.arizona.edu/icon/), where the idea of generators is central. In +Icon, every expression and function call behaves like a generator. One example +from "An Overview of the Icon Programming Language" at +http://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks +like:: + + sentence := "Store it in the neighboring harbor" + if (i := find("or", sentence)) > 5 then write(i) + +In Icon the :func:`find` function returns the indexes at which the substring +"or" is found: 3, 23, 33. In the :keyword:`if` statement, ``i`` is first +assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon +retries it with the second value of 23. 23 is greater than 5, so the comparison +now succeeds, and the code prints the value 23 to the screen. + +Python doesn't go nearly as far as Icon in adopting generators as a central +concept. Generators are considered a new part of the core Python language, but +learning or using them isn't compulsory; if they don't solve any problems that +you have, feel free to ignore them. One novel feature of Python's interface as +compared to Icon's is that a generator's state is represented as a concrete +object (the iterator) that can be passed around to other functions or stored in +a data structure. + + +.. seealso:: + + :pep:`255` - Simple Generators + Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland. 
Implemented mostly + by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew. + +.. % ====================================================================== + + +PEP 237: Unifying Long Integers and Integers +============================================ + +In recent versions, the distinction between regular integers, which are 32-bit +values on most machines, and long integers, which can be of arbitrary size, was +becoming an annoyance. For example, on platforms that support files larger than +``2**32`` bytes, the :meth:`tell` method of file objects has to return a long +integer. However, there were various bits of Python that expected plain integers +and would raise an error if a long integer was provided instead. For example, +in Python 1.5, only regular integers could be used as a slice index, and +``'abc'[1L:]`` would raise a :exc:`TypeError` exception with the message 'slice +index must be int'. + +Python 2.2 will shift values from short to long integers as required. The 'L' +suffix is no longer needed to indicate a long integer literal, as now the +compiler will choose the appropriate type. (Using the 'L' suffix will be +discouraged in future 2.x versions of Python, triggering a warning in Python +2.4, and probably dropped in Python 3.0.) Many operations that used to raise an +:exc:`OverflowError` will now return a long integer as their result. For +example:: + + >>> 1234567890123 + 1234567890123L + >>> 2 ** 64 + 18446744073709551616L + +In most cases, integers and long integers will now be treated identically. You +can still distinguish them with the :func:`type` built-in function, but that's +rarely needed. + + +.. seealso:: + + :pep:`237` - Unifying Long Integers and Integers + Written by Moshe Zadka and Guido van Rossum. Implemented mostly by Guido van + Rossum. + +.. 
% ====================================================================== + + +PEP 238: Changing the Division Operator +======================================= + +The most controversial change in Python 2.2 heralds the start of an effort to +fix an old design flaw that's been in Python from the beginning. Currently +Python's division operator, ``/``, behaves like C's division operator when +presented with two integer arguments: it returns an integer result that's +truncated down when there would be a fractional part. For example, ``3/2`` is +1, not 1.5, and ``(-1)/2`` is -1, not -0.5. This means that the results of +division can vary unexpectedly depending on the type of the two operands, and +because Python is dynamically typed, it can be difficult to determine the +possible types of the operands. + +(The controversy is over whether this is *really* a design flaw, and whether +it's worth breaking existing code to fix this. It's caused endless discussions +on python-dev, and in July 2001 erupted into a storm of acidly sarcastic +postings on :newsgroup:`comp.lang.python`. I won't argue for either side here +and will stick to describing what's implemented in 2.2. Read :pep:`238` for a +summary of arguments and counter-arguments.) + +Because this change might break code, it's being introduced very gradually. +Python 2.2 begins the transition, but the switch won't be complete until Python +3.0. + +First, I'll borrow some terminology from :pep:`238`. "True division" is the +division that most non-programmers are familiar with: 3/2 is 1.5, 1/4 is 0.25, +and so forth. "Floor division" is what Python's ``/`` operator currently does +when given integer operands; the result is the floor of the value returned by +true division. "Classic division" is the current mixed behaviour of ``/``; it +returns the result of floor division when the operands are integers, and returns +the result of true division when one of the operands is a floating-point number. 
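The three terms can be made concrete with a short sketch. It assumes an interpreter where ``/`` already performs true division (Python 3, or a module that starts with ``from __future__ import division``), and ``classic_div`` is a hypothetical helper written only to emulate the classic behaviour described above; it is not a function from the standard library.

```python
import math


def classic_div(a, b):
    """Emulate classic '/': floor division for two ints, true division otherwise.

    Hypothetical helper for illustration; assumes '/' is true division.
    """
    if isinstance(a, int) and isinstance(b, int):
        return a // b          # floor division: rounds toward negative infinity
    return a / b               # true division


true_result = 3 / 2            # true division keeps the fractional part
floor_result = 3 // 2          # floor of the true-division result
negative_floor = (-1) // 2     # flooring rounds down, so this is -1, not 0
mixed = classic_div(3.0, 2)    # one float operand: classic '/' divides truly
```

The helper makes the type dependence explicit: the same call gives different kinds of answers for ``classic_div(3, 2)`` and ``classic_div(3.0, 2)``, which is the behaviour the PEP sets out to remove.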
+ +Here are the changes 2.2 introduces: + +* A new operator, ``//``, is the floor division operator. (Yes, we know it looks + like C++'s comment symbol.) ``//`` *always* performs floor division no matter + what the types of its operands are, so ``1 // 2`` is 0 and ``1.0 // 2.0`` is + also 0.0. + + ``//`` is always available in Python 2.2; you don't need to enable it using a + ``__future__`` statement. + +* By including a ``from __future__ import division`` in a module, the ``/`` + operator will be changed to return the result of true division, so ``1/2`` is + 0.5. Without the ``__future__`` statement, ``/`` still means classic division. + The default meaning of ``/`` will not change until Python 3.0. + +* Classes can define methods called :meth:`__truediv__` and :meth:`__floordiv__` + to overload the two division operators. At the C level, there are also slots in + the :ctype:`PyNumberMethods` structure so extension types can define the two + operators. + +* Python 2.2 supports some command-line arguments for testing whether code will + work with the changed division semantics. Running python with :option:`-Q + warn` will cause a warning to be issued whenever division is applied to two + integers. You can use this to find code that's affected by the change and fix + it. By default, Python 2.2 will simply perform classic division without a + warning; the warning will be turned on by default in Python 2.3. + + +.. seealso:: + + :pep:`238` - Changing the Division Operator + Written by Moshe Zadka and Guido van Rossum. Implemented by Guido van Rossum. + +.. % ====================================================================== + + +Unicode Changes +=============== + +Python's Unicode support has been enhanced a bit in 2.2. Unicode strings are +usually stored as UCS-2, as 16-bit unsigned integers. Python 2.2 can also be +compiled to use UCS-4, 32-bit unsigned integers, as its internal encoding by +supplying :option:`--enable-unicode=ucs4` to the configure script. 
(It's also +possible to specify :option:`--disable-unicode` to completely disable Unicode +support.) + +When built to use UCS-4 (a "wide Python"), the interpreter can natively handle +Unicode characters from U+000000 to U+10FFFF, so the range of legal values for +the :func:`unichr` function is expanded accordingly. Using an interpreter +compiled to use UCS-2 (a "narrow Python"), values greater than 65535 will still +cause :func:`unichr` to raise a :exc:`ValueError` exception. This is all +described in :pep:`261`, "Support for 'wide' Unicode characters"; consult it for +further details. + +Another change is simpler to explain. Since their introduction, Unicode strings +have supported an :meth:`encode` method to convert the string to a selected +encoding such as UTF-8 or Latin-1. A symmetric :meth:`decode([*encoding*])` +method has been added to 8-bit strings (though not to Unicode strings) in 2.2. +:meth:`decode` assumes that the string is in the specified encoding and decodes +it, returning whatever is returned by the codec. + +Using this new feature, codecs have been added for tasks not directly related to +Unicode. For example, codecs have been added for uu-encoding, MIME's base64 +encoding, and compression with the :mod:`zlib` module:: + + >>> s = """Here is a lengthy piece of redundant, overly verbose, + ... and repetitive text. + ... """ + >>> data = s.encode('zlib') + >>> data + 'x\x9c\r\xc9\xc1\r\x80 \x10\x04\xc0?Ul...' + >>> data.decode('zlib') + 'Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n' + >>> print s.encode('uu') + begin 666 <data> + M2&5R92!I<R!A(&QE;F=T:'D@<&EE8V4@;V8@<F5D=6YD86YT+"!O=F5R;'D@ + >=F5R8F]S92P*86YD(')E<&5T:71I=F4@=&5X="X* + + end + >>> "sheesh".encode('rot-13') + 'furrfu' + +To convert a class instance to Unicode, a :meth:`__unicode__` method can be +defined by a class, analogous to :meth:`__str__`. + +:meth:`encode`, :meth:`decode`, and :meth:`__unicode__` were implemented by +Marc-André Lemburg. 
The changes to support using UCS-4 internally were +implemented by Fredrik Lundh and Martin von Löwis. + + +.. seealso:: + + :pep:`261` - Support for 'wide' Unicode characters + Written by Paul Prescod. + +.. % ====================================================================== + + +PEP 227: Nested Scopes +====================== + +In Python 2.1, statically nested scopes were added as an optional feature, to be +enabled by a ``from __future__ import nested_scopes`` directive. In 2.2 nested +scopes no longer need to be specially enabled, and are now always present. The +rest of this section is a copy of the description of nested scopes from my +"What's New in Python 2.1" document; if you read it when 2.1 came out, you can +skip the rest of this section. + +The largest change introduced in Python 2.1, and made complete in 2.2, is to +Python's scoping rules. In Python 2.0, at any given time there are at most +three namespaces used to look up variable names: local, module-level, and the +built-in namespace. This often surprised people because it didn't match their +intuitive expectations. For example, a nested recursive function definition +doesn't work:: + + def f(): + ... + def g(value): + ... + return g(value-1) + 1 + ... + +The function :func:`g` will always raise a :exc:`NameError` exception, because +the binding of the name ``g`` isn't in either its local namespace or in the +module-level namespace. This isn't much of a problem in practice (how often do +you recursively define interior functions like this?), but this also made using +the :keyword:`lambda` statement clumsier, and this was a problem in practice. +In code which uses :keyword:`lambda` you can often find local variables being +copied by passing them as the default values of arguments. 
:: + + def find(self, name): + "Return list of any entries equal to 'name'" + L = filter(lambda x, name=name: x == name, + self.list_attribute) + return L + +The readability of Python code written in a strongly functional style suffers +greatly as a result. + +The most significant change to Python 2.2 is that static scoping has been added +to the language to fix this problem. As a first effect, the ``name=name`` +default argument is now unnecessary in the above example. Put simply, when a +given variable name is not assigned a value within a function (by an assignment, +or the :keyword:`def`, :keyword:`class`, or :keyword:`import` statements), +references to the variable will be looked up in the local namespace of the +enclosing scope. A more detailed explanation of the rules, and a dissection of +the implementation, can be found in the PEP. + +This change may cause some compatibility problems for code where the same +variable name is used both at the module level and as a local variable within a +function that contains further function definitions. This seems rather unlikely +though, since such code would have been pretty confusing to read in the first +place. + +One side effect of the change is that the ``from module import *`` and +:keyword:`exec` statements have been made illegal inside a function scope under +certain conditions. The Python reference manual has said all along that ``from +module import *`` is only legal at the top level of a module, but the CPython +interpreter has never enforced this before. As part of the implementation of +nested scopes, the compiler which turns Python source into bytecodes has to +generate different code to access variables in a containing scope. ``from +module import *`` and :keyword:`exec` make it impossible for the compiler to +figure this out, because they add names to the local namespace that are +unknowable at compile time. 
Therefore, if a function contains function +definitions or :keyword:`lambda` expressions with free variables, the compiler +will flag this by raising a :exc:`SyntaxError` exception. + +To make the preceding explanation a bit clearer, here's an example:: + + x = 1 + def f(): + # The next line is a syntax error + exec 'x=2' + def g(): + return x + +Line 4 containing the :keyword:`exec` statement is a syntax error, since +:keyword:`exec` would define a new local variable named ``x`` whose value should +be accessed by :func:`g`. + +This shouldn't be much of a limitation, since :keyword:`exec` is rarely used in +most Python code (and when it is used, it's often a sign of a poor design +anyway). + + +.. seealso:: + + :pep:`227` - Statically Nested Scopes + Written and implemented by Jeremy Hylton. + +.. % ====================================================================== + + +New and Improved Modules +======================== + +* The :mod:`xmlrpclib` module was contributed to the standard library by Fredrik + Lundh, providing support for writing XML-RPC clients. XML-RPC is a simple + remote procedure call protocol built on top of HTTP and XML. For example, the + following snippet retrieves a list of RSS channels from the O'Reilly Network, + and then lists the recent headlines for one channel:: + + import xmlrpclib + s = xmlrpclib.Server( + 'http://www.oreillynet.com/meerkat/xml-rpc/server.php') + channels = s.meerkat.getChannels() + # channels is a list of dictionaries, like this: + # [{'id': 4, 'title': 'Freshmeat Daily News'} + # {'id': 190, 'title': '32Bits Online'}, + # {'id': 4549, 'title': '3DGamers'}, ... ] + + # Get the items for one channel + items = s.meerkat.getItems( {'channel': 4} ) + + # 'items' is another list of dictionaries, like this: + # [{'link': 'http://freshmeat.net/releases/52719/', + # 'description': 'A utility which converts HTML to XSL FO.', + # 'title': 'html2fo 0.3 (Default)'}, ... 
] + + The :mod:`SimpleXMLRPCServer` module makes it easy to create straightforward + XML-RPC servers. See http://www.xmlrpc.com/ for more information about XML-RPC. + +* The new :mod:`hmac` module implements the HMAC algorithm described by + :rfc:`2104`. (Contributed by Gerhard Häring.) + +* Several functions that originally returned lengthy tuples now return pseudo- + sequences that still behave like tuples but also have mnemonic attributes such + as :attr:`st_mtime` or :attr:`tm_year`. The enhanced functions include + :func:`stat`, :func:`fstat`, :func:`statvfs`, and :func:`fstatvfs` in the + :mod:`os` module, and :func:`localtime`, :func:`gmtime`, and :func:`strptime` in + the :mod:`time` module. + + For example, to obtain a file's size using the old tuples, you'd end up writing + something like ``file_size = os.stat(filename)[stat.ST_SIZE]``, but now this can + be written more clearly as ``file_size = os.stat(filename).st_size``. + + The original patch for this feature was contributed by Nick Mathewson. + +* The Python profiler has been extensively reworked and various errors in its + output have been corrected. (Contributed by Fred L. Drake, Jr. and Tim Peters.) + +* The :mod:`socket` module can be compiled to support IPv6; specify the + :option:`--enable-ipv6` option to Python's configure script. (Contributed by + Jun-ichiro "itojun" Hagino.) + +* Two new format characters were added to the :mod:`struct` module for 64-bit + integers on platforms that support the C :ctype:`long long` type. ``q`` is for + a signed 64-bit integer, and ``Q`` is for an unsigned one. The value is + returned in Python's long integer type. (Contributed by Tim Peters.) + +* In the interpreter's interactive mode, there's a new built-in function + :func:`help` that uses the :mod:`pydoc` module introduced in Python 2.1 to + provide interactive help. ``help(object)`` displays any available help text + about *object*. 
:func:`help` with no argument puts you in an online help + utility, where you can enter the names of functions, classes, or modules to read + their help text. (Contributed by Guido van Rossum, using Ka-Ping Yee's + :mod:`pydoc` module.) + +* Various bugfixes and performance improvements have been made to the SRE engine + underlying the :mod:`re` module. For example, the :func:`re.sub` and + :func:`re.split` functions have been rewritten in C. Another contributed patch + speeds up certain Unicode character ranges by a factor of two, and a new + :meth:`finditer` method returns an iterator over all the non-overlapping + matches in a given string. (SRE is maintained by Fredrik Lundh. The + BIGCHARSET patch was contributed by Martin von Löwis.) + +* The :mod:`smtplib` module now supports :rfc:`2487`, "Secure SMTP over TLS", so + it's now possible to encrypt the SMTP traffic between a Python program and the + mail transport agent being handed a message. :mod:`smtplib` also supports SMTP + authentication. (Contributed by Gerhard Häring.) + +* The :mod:`imaplib` module, maintained by Piers Lauder, has support for several + new extensions: the NAMESPACE extension defined in :rfc:`2342`, SORT, GETACL and + SETACL. (Contributed by Anthony Baxter and Michel Pelletier.) + +* The :mod:`rfc822` module's parsing of email addresses is now compliant with + :rfc:`2822`, an update to :rfc:`822`. (The module's name is *not* going to be + changed to ``rfc2822``.) A new package, :mod:`email`, has also been added for + parsing and generating e-mail messages. (Contributed by Barry Warsaw, and + arising out of his work on Mailman.) + +* The :mod:`difflib` module now contains a new :class:`Differ` class for + producing human-readable lists of changes (a "delta") between two sequences of + lines of text. There are also two generator functions, :func:`ndiff` and + :func:`restore`, which respectively return a delta from two sequences, or one of + the original sequences from a delta. 
(Grunt work contributed by David Goodger, + from ndiff.py code by Tim Peters who then did the generatorization.) + +* New constants :const:`ascii_letters`, :const:`ascii_lowercase`, and + :const:`ascii_uppercase` were added to the :mod:`string` module. There were + several modules in the standard library that used :const:`string.letters` to + mean the ranges A-Za-z, but that assumption is incorrect when locales are in + use, because :const:`string.letters` varies depending on the set of legal + characters defined by the current locale. The buggy modules have all been fixed + to use :const:`ascii_letters` instead. (Reported by an unknown person; fixed by + Fred L. Drake, Jr.) + +* The :mod:`mimetypes` module now makes it easier to use alternative MIME-type + databases by the addition of a :class:`MimeTypes` class, which takes a list of + filenames to be parsed. (Contributed by Fred L. Drake, Jr.) + +* A :class:`Timer` class was added to the :mod:`threading` module that allows + scheduling an activity to happen at some future time. (Contributed by Itamar + Shtull-Trauring.) + +.. % ====================================================================== + + +Interpreter Changes and Fixes +============================= + +Some of the changes only affect people who deal with the Python interpreter at +the C level because they're writing Python extension modules, embedding the +interpreter, or just hacking on the interpreter itself. If you only write Python +code, none of the changes described here will affect you very much. + +* Profiling and tracing functions can now be implemented in C, which can operate + at much higher speeds than Python-based functions and should reduce the overhead + of profiling and tracing. This will be of interest to authors of development + environments for Python. Two new C functions were added to Python's API, + :cfunc:`PyEval_SetProfile` and :cfunc:`PyEval_SetTrace`. 
The existing + :func:`sys.setprofile` and :func:`sys.settrace` functions still exist, and have + simply been changed to use the new C-level interface. (Contributed by Fred L. + Drake, Jr.) + +* Another low-level API, primarily of interest to implementors of Python + debuggers and development tools, was added. :cfunc:`PyInterpreterState_Head` and + :cfunc:`PyInterpreterState_Next` let a caller walk through all the existing + interpreter objects; :cfunc:`PyInterpreterState_ThreadHead` and + :cfunc:`PyThreadState_Next` allow looping over all the thread states for a given + interpreter. (Contributed by David Beazley.) + +* The C-level interface to the garbage collector has been changed to make it + easier to write extension types that support garbage collection and to debug + misuses of the functions. Various functions have slightly different semantics, + so a bunch of functions had to be renamed. Extensions that use the old API will + still compile but will *not* participate in garbage collection, so updating them + for 2.2 should be considered fairly high priority. + + To upgrade an extension module to the new API, perform the following steps: + +* Rename :cfunc:`Py_TPFLAGS_GC` to :cfunc:`Py_TPFLAGS_HAVE_GC`. + +* Use :cfunc:`PyObject_GC_New` or :cfunc:`PyObject_GC_NewVar` to allocate + objects, and :cfunc:`PyObject_GC_Del` to deallocate them. + +* Rename :cfunc:`PyObject_GC_Init` to :cfunc:`PyObject_GC_Track` and + :cfunc:`PyObject_GC_Fini` to :cfunc:`PyObject_GC_UnTrack`. + +* Remove :cfunc:`PyGC_HEAD_SIZE` from object size calculations. + +* Remove calls to :cfunc:`PyObject_AS_GC` and :cfunc:`PyObject_FROM_GC`. + +* A new ``et`` format sequence was added to :cfunc:`PyArg_ParseTuple`; ``et`` + takes both a parameter and an encoding name, and converts the parameter to the + given encoding if the parameter turns out to be a Unicode string, or leaves it + alone if it's an 8-bit string, assuming it to already be in the desired + encoding. 
This differs from the ``es`` format character, which assumes that + 8-bit strings are in Python's default ASCII encoding and converts them to the + specified new encoding. (Contributed by M.-A. Lemburg, and used for the MBCS + support on Windows described in the following section.) + +* A different argument parsing function, :cfunc:`PyArg_UnpackTuple`, has been + added that's simpler and presumably faster. Instead of specifying a format + string, the caller simply gives the minimum and maximum number of arguments + expected, and a set of pointers to :ctype:`PyObject\*` variables that will be + filled in with argument values. + +* Two new flags :const:`METH_NOARGS` and :const:`METH_O` are available in method + definition tables to simplify implementation of methods with no arguments or a + single untyped argument. Calling such methods is more efficient than calling a + corresponding method that uses :const:`METH_VARARGS`. Also, the old + :const:`METH_OLDARGS` style of writing C methods is now officially deprecated. + +* Two new wrapper functions, :cfunc:`PyOS_snprintf` and :cfunc:`PyOS_vsnprintf` + were added to provide cross-platform implementations for the relatively new + :cfunc:`snprintf` and :cfunc:`vsnprintf` C lib APIs. In contrast to the standard + :cfunc:`sprintf` and :cfunc:`vsprintf` functions, the Python versions check the + bounds of the buffer used to protect against buffer overruns. (Contributed by + M.-A. Lemburg.) + +* The :cfunc:`_PyTuple_Resize` function has lost an unused parameter, so now it + takes 2 parameters instead of 3. The third argument was never used, and can + simply be discarded when porting code from earlier versions to Python 2.2. + +.. % ====================================================================== + + +Other Changes and Fixes +======================= + +As usual there were a bunch of other improvements and bugfixes scattered +throughout the source tree. 
A search through the CVS change logs finds there +were 527 patches applied and 683 bugs fixed between Python 2.1 and 2.2; 2.2.1 +applied 139 patches and fixed 143 bugs; 2.2.2 applied 106 patches and fixed 82 +bugs. These figures are likely to be underestimates. + +Some of the more notable changes are: + +* The code for the MacOS port for Python, maintained by Jack Jansen, is now kept + in the main Python CVS tree, and many changes have been made to support MacOS X. + + The most significant change is the ability to build Python as a framework, + enabled by supplying the :option:`--enable-framework` option to the configure + script when compiling Python. According to Jack Jansen, "This installs a self- + contained Python installation plus the OS X framework "glue" into + :file:`/Library/Frameworks/Python.framework` (or another location of choice). + For now there is little immediate added benefit to this (actually, there is the + disadvantage that you have to change your PATH to be able to find Python), but + it is the basis for creating a full-blown Python application, porting the + MacPython IDE, possibly using Python as a standard OSA scripting language and + much more." + + Most of the MacPython toolbox modules, which interface to MacOS APIs such as + windowing, QuickTime, scripting, etc. have been ported to OS X, but they've been + left commented out in :file:`setup.py`. People who want to experiment with + these modules can uncomment them manually. + + .. % Jack's original comments: + .. % The main change is the possibility to build Python as a + .. % framework. This installs a self-contained Python installation plus the + .. % OSX framework "glue" into /Library/Frameworks/Python.framework (or + .. % another location of choice). For now there is little immedeate added + .. % benefit to this (actually, there is the disadvantage that you have to + .. % change your PATH to be able to find Python), but it is the basis for + .. 
% creating a fullblown Python application, porting the MacPython IDE, + .. % possibly using Python as a standard OSA scripting language and much + .. % more. You enable this with "configure --enable-framework". + .. % The other change is that most MacPython toolbox modules, which + .. % interface to all the MacOS APIs such as windowing, quicktime, + .. % scripting, etc. have been ported. Again, most of these are not of + .. % immedeate use, as they need a full application to be really useful, so + .. % they have been commented out in setup.py. People wanting to experiment + .. % can uncomment them. Gestalt and Internet Config modules are enabled by + .. % default. + +* Keyword arguments passed to builtin functions that don't take them now cause a + :exc:`TypeError` exception to be raised, with the message "*function* takes no + keyword arguments". + +* Weak references, added in Python 2.1 as an extension module, are now part of + the core because they're used in the implementation of new-style classes. The + :exc:`ReferenceError` exception has therefore moved from the :mod:`weakref` + module to become a built-in exception. + +* A new script, :file:`Tools/scripts/cleanfuture.py` by Tim Peters, + automatically removes obsolete ``__future__`` statements from Python source + code. + +* An additional *flags* argument has been added to the built-in function + :func:`compile`, so the behaviour of ``__future__`` statements can now be + correctly observed in simulated shells, such as those presented by IDLE and + other development environments. This is described in :pep:`264`. (Contributed + by Michael Hudson.) + +* The new license introduced with Python 1.6 wasn't GPL-compatible. This is + fixed by some minor textual changes to the 2.2 license, so it's now legal to + embed Python inside a GPLed program again. Note that Python itself is not + GPLed, but instead is under a license that's essentially equivalent to the BSD + license, same as it always was. 
The license changes were also applied to the + Python 2.0.1 and 2.1.1 releases. + +* When presented with a Unicode filename on Windows, Python will now convert it + to an MBCS encoded string, as used by the Microsoft file APIs. As MBCS is + explicitly used by the file APIs, Python's choice of ASCII as the default + encoding turns out to be an annoyance. On Unix, the locale's character set is + used if :func:`locale.nl_langinfo(CODESET)` is available. (Windows support was + contributed by Mark Hammond with assistance from Marc-André Lemburg. Unix + support was added by Martin von Löwis.) + +* Large file support is now enabled on Windows. (Contributed by Tim Peters.) + +* The :file:`Tools/scripts/ftpmirror.py` script now parses a :file:`.netrc` + file, if you have one. (Contributed by Mike Romberg.) + +* Some features of the object returned by the :func:`xrange` function are now + deprecated, and trigger warnings when they're accessed; they'll disappear in + Python 2.3. :class:`xrange` objects tried to pretend they were full sequence + types by supporting slicing, sequence multiplication, and the :keyword:`in` + operator, but these features were rarely used and therefore buggy. The + :meth:`tolist` method and the :attr:`start`, :attr:`stop`, and :attr:`step` + attributes are also being deprecated. At the C level, the fourth argument to + the :cfunc:`PyRange_New` function, ``repeat``, has also been deprecated. + +* There were a bunch of patches to the dictionary implementation, mostly to fix + potential core dumps if a dictionary contains objects that sneakily changed + their hash value, or mutated the dictionary they were contained in. For a while + python-dev fell into a gentle rhythm of Michael Hudson finding a case that + dumped core, Tim Peters fixing the bug, Michael finding another case, and round + and round it went. 
+ +* On Windows, Python can now be compiled with Borland C thanks to a number of + patches contributed by Stephen Hansen, though the result isn't fully functional + yet. (But this *is* progress...) + +* Another Windows enhancement: Wise Solutions generously offered PythonLabs use + of their InstallerMaster 8.1 system. Earlier PythonLabs Windows installers used + Wise 5.0a, which was beginning to show its age. (Packaged up by Tim Peters.) + +* Files ending in ``.pyw`` can now be imported on Windows. ``.pyw`` is a + Windows-only thing, used to indicate that a script needs to be run using + PYTHONW.EXE instead of PYTHON.EXE in order to prevent a DOS console from popping + up to display the output. This patch makes it possible to import such scripts, + in case they're also usable as modules. (Implemented by David Bolen.) + +* On platforms where Python uses the C :cfunc:`dlopen` function to load + extension modules, it's now possible to set the flags used by :cfunc:`dlopen` + using the :func:`sys.getdlopenflags` and :func:`sys.setdlopenflags` functions. + (Contributed by Bram Stolk.) + +* The :func:`pow` built-in function no longer supports 3 arguments when + floating-point numbers are supplied. ``pow(x, y, z)`` returns ``(x**y) % z``, + but this is never useful for floating point numbers, and the final result varies + unpredictably depending on the platform. A call such as ``pow(2.0, 8.0, 7.0)`` + will now raise a :exc:`TypeError` exception. + +.. % ====================================================================== + + +Acknowledgements +================ + +The author would like to thank the following people for offering suggestions, +corrections and assistance with various drafts of this article: Fred Bremmer, +Keith Briggs, Andrew Dalke, Fred L. 
Drake, Jr., Carel Fellinger, David Goodger, +Mark Hammond, Stephen Hansen, Michael Hudson, Jack Jansen, Marc-André Lemburg, +Martin von Löwis, Fredrik Lundh, Michael McLay, Nick Mathewson, Paul Moore, +Gustavo Niemeyer, Don O'Donnell, Joonas Paalasma, Tim Peters, Jens Quade, Tom +Reinhardt, Neil Schemenauer, Guido van Rossum, Greg Ward, Edward Welbourne. + diff --git a/Doc/whatsnew/2.3.rst b/Doc/whatsnew/2.3.rst new file mode 100644 index 0000000..7dd4930 --- /dev/null +++ b/Doc/whatsnew/2.3.rst @@ -0,0 +1,2084 @@ +**************************** + What's New in Python 2.3 +**************************** + +:Author: A.M. Kuchling + +.. |release| replace:: 1.01 + +.. % $Id: whatsnew23.tex 55005 2007-04-27 19:54:29Z guido.van.rossum $ + +This article explains the new features in Python 2.3. Python 2.3 was released +on July 29, 2003. + +The main themes for Python 2.3 are polishing some of the features added in 2.2, +adding various small but useful enhancements to the core language, and expanding +the standard library. The new object model introduced in the previous version +has benefited from 18 months of bugfixes and from optimization efforts that have +improved the performance of new-style classes. A few new built-in functions +have been added such as :func:`sum` and :func:`enumerate`. The :keyword:`in` +operator can now be used for substring searches (e.g. ``"ab" in "abc"`` returns +:const:`True`). + +Some of the many new library features include Boolean, set, heap, and date/time +data types, the ability to import modules from ZIP-format archives, metadata +support for the long-awaited Python catalog, an updated version of IDLE, and +modules for logging messages, wrapping text, parsing CSV files, processing +command-line options, using BerkeleyDB databases... the list of new and +enhanced modules is lengthy. + +This article doesn't attempt to provide a complete specification of the new +features, but instead provides a convenient overview. 
For full details, you +should refer to the documentation for Python 2.3, such as the Python Library +Reference and the Python Reference Manual. If you want to understand the +complete implementation and design rationale, refer to the PEP for a particular +new feature. + +.. % ====================================================================== + + +PEP 218: A Standard Set Datatype +================================ + +The new :mod:`sets` module contains an implementation of a set datatype. The +:class:`Set` class is for mutable sets, sets that can have members added and +removed. The :class:`ImmutableSet` class is for sets that can't be modified, +and instances of :class:`ImmutableSet` can therefore be used as dictionary keys. +Sets are built on top of dictionaries, so the elements within a set must be +hashable. + +Here's a simple example:: + + >>> import sets + >>> S = sets.Set([1,2,3]) + >>> S + Set([1, 2, 3]) + >>> 1 in S + True + >>> 0 in S + False + >>> S.add(5) + >>> S.remove(3) + >>> S + Set([1, 2, 5]) + >>> + +The union and intersection of sets can be computed with the :meth:`union` and +:meth:`intersection` methods; an alternative notation uses the bitwise operators +``&`` and ``|``. Mutable sets also have in-place versions of these methods, +:meth:`union_update` and :meth:`intersection_update`. :: + + >>> S1 = sets.Set([1,2,3]) + >>> S2 = sets.Set([4,5,6]) + >>> S1.union(S2) + Set([1, 2, 3, 4, 5, 6]) + >>> S1 | S2 # Alternative notation + Set([1, 2, 3, 4, 5, 6]) + >>> S1.intersection(S2) + Set([]) + >>> S1 & S2 # Alternative notation + Set([]) + >>> S1.union_update(S2) + >>> S1 + Set([1, 2, 3, 4, 5, 6]) + >>> + +It's also possible to take the symmetric difference of two sets. This is the +set of all elements in the union that aren't in the intersection. Another way +of putting it is that the symmetric difference contains all elements that are in +exactly one set. 
Again, there's an alternative notation (``^``), and an in- +place version with the ungainly name :meth:`symmetric_difference_update`. :: + + >>> S1 = sets.Set([1,2,3,4]) + >>> S2 = sets.Set([3,4,5,6]) + >>> S1.symmetric_difference(S2) + Set([1, 2, 5, 6]) + >>> S1 ^ S2 + Set([1, 2, 5, 6]) + >>> + +There are also :meth:`issubset` and :meth:`issuperset` methods for checking +whether one set is a subset or superset of another:: + + >>> S1 = sets.Set([1,2,3]) + >>> S2 = sets.Set([2,3]) + >>> S2.issubset(S1) + True + >>> S1.issubset(S2) + False + >>> S1.issuperset(S2) + True + >>> + + +.. seealso:: + + :pep:`218` - Adding a Built-In Set Object Type + PEP written by Greg V. Wilson. Implemented by Greg V. Wilson, Alex Martelli, and + GvR. + +.. % ====================================================================== + + +.. _section-generators: + +PEP 255: Simple Generators +========================== + +In Python 2.2, generators were added as an optional feature, to be enabled by a +``from __future__ import generators`` directive. In 2.3 generators no longer +need to be specially enabled, and are now always present; this means that +:keyword:`yield` is now always a keyword. The rest of this section is a copy of +the description of generators from the "What's New in Python 2.2" document; if +you read it back when Python 2.2 came out, you can skip the rest of this +section. + +You're doubtless familiar with how function calls work in Python or C. When you +call a function, it gets a private namespace where its local variables are +created. When the function reaches a :keyword:`return` statement, the local +variables are destroyed and the resulting value is returned to the caller. A +later call to the same function will get a fresh new set of local variables. +But, what if the local variables weren't thrown away on exiting a function? +What if you could later resume the function where it left off? 
This is what +generators provide; they can be thought of as resumable functions. + +Here's the simplest example of a generator function:: + + def generate_ints(N): + for i in range(N): + yield i + +A new keyword, :keyword:`yield`, was introduced for generators. Any function +containing a :keyword:`yield` statement is a generator function; this is +detected by Python's bytecode compiler which compiles the function specially as +a result. + +When you call a generator function, it doesn't return a single value; instead it +returns a generator object that supports the iterator protocol. On executing +the :keyword:`yield` statement, the generator outputs the value of ``i``, +similar to a :keyword:`return` statement. The big difference between +:keyword:`yield` and a :keyword:`return` statement is that on reaching a +:keyword:`yield` the generator's state of execution is suspended and local +variables are preserved. On the next call to the generator's ``.next()`` +method, the function will resume executing immediately after the +:keyword:`yield` statement. (For complicated reasons, the :keyword:`yield` +statement isn't allowed inside the :keyword:`try` block of a :keyword:`try`...\ +:keyword:`finally` statement; read :pep:`255` for a full explanation of the +interaction between :keyword:`yield` and exceptions.) + +Here's a sample usage of the :func:`generate_ints` generator:: + + >>> gen = generate_ints(3) + >>> gen + <generator object at 0x8117f90> + >>> gen.next() + 0 + >>> gen.next() + 1 + >>> gen.next() + 2 + >>> gen.next() + Traceback (most recent call last): + File "stdin", line 1, in ? + File "stdin", line 2, in generate_ints + StopIteration + +You could equally write ``for i in generate_ints(5)``, or ``a,b,c = +generate_ints(3)``. + +Inside a generator function, the :keyword:`return` statement can only be used +without a value, and signals the end of the procession of values; afterwards the +generator cannot return any further values. 
:keyword:`return` with a value, such +as ``return 5``, is a syntax error inside a generator function. The end of the +generator's results can also be indicated by raising :exc:`StopIteration` +manually, or by just letting the flow of execution fall off the bottom of the +function. + +You could achieve the effect of generators manually by writing your own class +and storing all the local variables of the generator as instance variables. For +example, returning a list of integers could be done by setting ``self.count`` to +0, and having the :meth:`next` method increment ``self.count`` and return it. +However, for a moderately complicated generator, writing a corresponding class +would be much messier. :file:`Lib/test/test_generators.py` contains a number of +more interesting examples. The simplest one implements an in-order traversal of +a tree using generators recursively. :: + + # A recursive generator that generates Tree leaves in in-order. + def inorder(t): + if t: + for x in inorder(t.left): + yield x + yield t.label + for x in inorder(t.right): + yield x + +Two other examples in :file:`Lib/test/test_generators.py` produce solutions for +the N-Queens problem (placing *N* queens on an *N* x *N* chess board so that no +queen threatens another) and the Knight's Tour (a route that takes a knight to +every square of an *N* x *N* chessboard without visiting any square twice). + +The idea of generators comes from other programming languages, especially Icon +(http://www.cs.arizona.edu/icon/), where the idea of generators is central. In +Icon, every expression and function call behaves like a generator. One example +from "An Overview of the Icon Programming Language" at +http://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks +like:: + + sentence := "Store it in the neighboring harbor" + if (i := find("or", sentence)) > 5 then write(i) + +In Icon the :func:`find` function returns the indexes at which the substring +"or" is found: 3, 23, 33. 
In the :keyword:`if` statement, ``i`` is first +assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon +retries it with the second value of 23. 23 is greater than 5, so the comparison +now succeeds, and the code prints the value 23 to the screen. + +Python doesn't go nearly as far as Icon in adopting generators as a central +concept. Generators are considered part of the core Python language, but +learning or using them isn't compulsory; if they don't solve any problems that +you have, feel free to ignore them. One novel feature of Python's interface as +compared to Icon's is that a generator's state is represented as a concrete +object (the iterator) that can be passed around to other functions or stored in +a data structure. + + +.. seealso:: + + :pep:`255` - Simple Generators + Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland. Implemented mostly + by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew. + +.. % ====================================================================== + + +.. _section-encodings: + +PEP 263: Source Code Encodings +============================== + +Python source files can now be declared as being in different character set +encodings. Encodings are declared by including a specially formatted comment in +the first or second line of the source file. For example, a UTF-8 file can be +declared with:: + + #!/usr/bin/env python + # -*- coding: UTF-8 -*- + +Without such an encoding declaration, the default encoding used is 7-bit ASCII. +Executing or importing modules that contain string literals with 8-bit +characters and have no encoding declaration will result in a +:exc:`DeprecationWarning` being signalled by Python 2.3; in 2.4 this will be a +syntax error. + +The encoding declaration only affects Unicode string literals, which will be +converted to Unicode using the specified encoding. 
Note that Python identifiers +are still restricted to ASCII characters, so you can't have variable names that +use characters outside of the usual alphanumerics. + + +.. seealso:: + + :pep:`263` - Defining Python Source Code Encodings + Written by Marc-André Lemburg and Martin von Löwis; implemented by Suzuki Hisao + and Martin von Löwis. + +.. % ====================================================================== + + +PEP 273: Importing Modules from ZIP Archives +============================================ + +The new :mod:`zipimport` module adds support for importing modules from a ZIP- +format archive. You don't need to import the module explicitly; it will be +automatically imported if a ZIP archive's filename is added to ``sys.path``. +For example:: + + amk@nyman:~/src/python$ unzip -l /tmp/example.zip + Archive: /tmp/example.zip + Length Date Time Name + -------- ---- ---- ---- + 8467 11-26-02 22:30 jwzthreading.py + -------- ------- + 8467 1 file + amk@nyman:~/src/python$ ./python + Python 2.3 (#1, Aug 1 2003, 19:54:32) + >>> import sys + >>> sys.path.insert(0, '/tmp/example.zip') # Add .zip file to front of path + >>> import jwzthreading + >>> jwzthreading.__file__ + '/tmp/example.zip/jwzthreading.py' + >>> + +An entry in ``sys.path`` can now be the filename of a ZIP archive. The ZIP +archive can contain any kind of files, but only files named :file:`\*.py`, +:file:`\*.pyc`, or :file:`\*.pyo` can be imported. If an archive only contains +:file:`\*.py` files, Python will not attempt to modify the archive by adding the +corresponding :file:`\*.pyc` file, meaning that if a ZIP archive doesn't contain +:file:`\*.pyc` files, importing may be rather slow. + +A path within the archive can also be specified to only import from a +subdirectory; for example, the path :file:`/tmp/example.zip/lib/` would only +import from the :file:`lib/` subdirectory within the archive. + + +.. seealso:: + + :pep:`273` - Import Modules from Zip Archives + Written by James C. 
Ahlstrom, who also provided an implementation. Python 2.3 + follows the specification in :pep:`273`, but uses an implementation written by + Just van Rossum that uses the import hooks described in :pep:`302`. See section + :ref:`section-pep302` for a description of the new import hooks. + +.. % ====================================================================== + + +PEP 277: Unicode file name support for Windows NT +================================================= + +On Windows NT, 2000, and XP, the system stores file names as Unicode strings. +Traditionally, Python has represented file names as byte strings, which is +inadequate because it renders some file names inaccessible. + +Python now allows using arbitrary Unicode strings (within the limitations of the +file system) for all functions that expect file names, most notably the +:func:`open` built-in function. If a Unicode string is passed to +:func:`os.listdir`, Python now returns a list of Unicode strings. A new +function, :func:`os.getcwdu`, returns the current directory as a Unicode string. + +Byte strings still work as file names, and on Windows Python will transparently +convert them to Unicode using the ``mbcs`` encoding. + +Other systems also allow Unicode strings as file names but convert them to byte +strings before passing them to the system, which can cause a :exc:`UnicodeError` +to be raised. Applications can test whether arbitrary Unicode strings are +supported as file names by checking :attr:`os.path.supports_unicode_filenames`, +a Boolean value. + +Under MacOS, :func:`os.listdir` may now return Unicode filenames. + + +.. seealso:: + + :pep:`277` - Unicode file name support for Windows NT + Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark + Hammond. + +.. 
% ====================================================================== + + +PEP 278: Universal Newline Support +================================== + +The three major operating systems used today are Microsoft Windows, Apple's +Macintosh OS, and the various Unix derivatives. A minor irritation of cross- +platform work is that these three platforms all use different characters to +mark the ends of lines in text files. Unix uses the linefeed (ASCII character +10), MacOS uses the carriage return (ASCII character 13), and Windows uses a +two-character sequence of a carriage return plus a newline. + +Python's file objects can now support end of line conventions other than the one +followed by the platform on which Python is running. Opening a file with the +mode ``'U'`` or ``'rU'`` will open a file for reading in universal newline mode. +All three line ending conventions will be translated to a ``'\n'`` in the +strings returned by the various file methods such as :meth:`read` and +:meth:`readline`. + +Universal newline support is also used when importing modules and when executing +a file with the :func:`execfile` function. This means that Python modules can +be shared between all three operating systems without needing to convert the +line-endings. + +This feature can be disabled when compiling Python by specifying the +:option:`--without-universal-newlines` switch when running Python's +:program:`configure` script. + + +.. seealso:: + + :pep:`278` - Universal Newline Support + Written and implemented by Jack Jansen. + +.. % ====================================================================== + + +.. _section-enumerate: + +PEP 279: enumerate() +==================== + +A new built-in function, :func:`enumerate`, will make certain loops a bit +clearer. ``enumerate(thing)``, where *thing* is either an iterator or a +sequence, returns an iterator that will return ``(0, thing[0])``, ``(1, +thing[1])``, ``(2, thing[2])``, and so forth. 
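As a minimal sketch of the pairs that :func:`enumerate` produces, the following collects them into a list so they can be inspected. The ``seasons`` data and the variable names are made up for illustration, and ``print`` is written with parentheses so the snippet also runs on later Python versions.

```python
# Illustrative data; the names here are assumptions made for this sketch.
seasons = ['spring', 'summer', 'autumn']

# enumerate() pairs each item with its index, starting at 0.
pairs = list(enumerate(seasons))
print(pairs)

# It also accepts an arbitrary iterator, which cannot be indexed directly.
from_iterator = list(enumerate(iter('ab')))
print(from_iterator)
```

Because the result is itself an iterator, the pairs are produced lazily; wrapping the call in :func:`list` is only needed when you want them all at once.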
+ +A common idiom to change every element of a list looks like this:: + + for i in range(len(L)): + item = L[i] + # ... compute some result based on item ... + L[i] = result + +This can be rewritten using :func:`enumerate` as:: + + for i, item in enumerate(L): + # ... compute some result based on item ... + L[i] = result + + +.. seealso:: + + :pep:`279` - The enumerate() built-in function + Written and implemented by Raymond D. Hettinger. + +.. % ====================================================================== + + +PEP 282: The logging Package +============================ + +A standard package for writing logs, :mod:`logging`, has been added to Python +2.3. It provides a powerful and flexible mechanism for generating logging +output which can then be filtered and processed in various ways. A +configuration file written in a standard format can be used to control the +logging behavior of a program. Python includes handlers that will write log +records to standard error or to a file or socket, send them to the system log, +or even e-mail them to a particular address; of course, it's also possible to +write your own handler classes. + +The :class:`Logger` class is the primary class. Most application code will deal +with one or more :class:`Logger` objects, each one used by a particular +subsystem of the application. Each :class:`Logger` is identified by a name, and +names are organized into a hierarchy using ``.`` as the component separator. +For example, you might have :class:`Logger` instances named ``server``, +``server.auth`` and ``server.network``. The latter two instances are below +``server`` in the hierarchy. This means that if you turn up the verbosity for +``server`` or direct ``server`` messages to a different handler, the changes +will also apply to records logged to ``server.auth`` and ``server.network``. +There's also a root :class:`Logger` that's the parent of all other loggers. 
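The hierarchy described above can be observed directly. This is a small sketch rather than a recommended setup: it reuses the ``server`` names from the text, and relies on the :meth:`getEffectiveLevel` method and the ``parent`` attribute of :class:`Logger` objects to show how a child without its own level picks up the setting of its ancestor.

```python
import logging

# Logger names taken from the example in the text.
server = logging.getLogger('server')
auth = logging.getLogger('server.auth')
network = logging.getLogger('server.network')

# Turn up the verbosity on the parent only.
server.setLevel(logging.DEBUG)

# The children have no level of their own, so they inherit the parent's.
print(auth.getEffectiveLevel() == logging.DEBUG)
print(network.getEffectiveLevel() == logging.DEBUG)

# The parent attribute exposes the hierarchy; the root logger sits at the top.
print(auth.parent is server)
print(server.parent is logging.getLogger())
```

Setting a level explicitly on ``server.auth`` would override the inherited value for that logger alone.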
+ +For simple uses, the :mod:`logging` package contains some convenience functions +that always use the root log:: + + import logging + + logging.debug('Debugging information') + logging.info('Informational message') + logging.warning('Warning:config file %s not found', 'server.conf') + logging.error('Error occurred') + logging.critical('Critical error -- shutting down') + +This produces the following output:: + + WARNING:root:Warning:config file server.conf not found + ERROR:root:Error occurred + CRITICAL:root:Critical error -- shutting down + +In the default configuration, informational and debugging messages are +suppressed and the output is sent to standard error. You can enable the display +of informational and debugging messages by calling the :meth:`setLevel` method +on the root logger. + +Notice the :func:`warning` call's use of string formatting operators; all of the +functions for logging messages take the arguments ``(msg, arg1, arg2, ...)`` and +log the string resulting from ``msg % (arg1, arg2, ...)``. + +There's also an :func:`exception` function that records the most recent +traceback. Any of the other functions will also record the traceback if you +specify a true value for the keyword argument *exc_info*. :: + + def f(): + try: 1/0 + except: logging.exception('Problem recorded') + + f() + +This produces the following output:: + + ERROR:root:Problem recorded + Traceback (most recent call last): + File "t.py", line 6, in f + 1/0 + ZeroDivisionError: integer division or modulo by zero + +Slightly more advanced programs will use a logger other than the root logger. +The :func:`getLogger(name)` function is used to get a particular log, creating +it if it doesn't exist yet. :func:`getLogger(None)` returns the root logger. :: + + log = logging.getLogger('server') + ... + log.info('Listening on port %i', port) + ... + log.critical('Disk full') + ... 
+ +Log records are usually propagated up the hierarchy, so a message logged to +``server.auth`` is also seen by ``server`` and ``root``, but a :class:`Logger` +can prevent this by setting its :attr:`propagate` attribute to :const:`False`. + +There are more classes provided by the :mod:`logging` package that can be +customized. When a :class:`Logger` instance is told to log a message, it +creates a :class:`LogRecord` instance that is sent to any number of different +:class:`Handler` instances. Loggers and handlers can also have an attached list +of filters, and each filter can cause the :class:`LogRecord` to be ignored or +can modify the record before passing it along. When they're finally output, +:class:`LogRecord` instances are converted to text by a :class:`Formatter` +class. All of these classes can be replaced by your own specially-written +classes. + +With all of these features the :mod:`logging` package should provide enough +flexibility for even the most complicated applications. This is only an +incomplete overview of its features, so please see the package's reference +documentation for all of the details. Reading :pep:`282` will also be helpful. + + +.. seealso:: + + :pep:`282` - A Logging System + Written by Vinay Sajip and Trent Mick; implemented by Vinay Sajip. + +.. % ====================================================================== + + +.. _section-bool: + +PEP 285: A Boolean Type +======================= + +A Boolean type was added to Python 2.3. Two new constants were added to the +:mod:`__builtin__` module, :const:`True` and :const:`False`. (:const:`True` and +:const:`False` constants were added to the built-ins in Python 2.2.1, but the +2.2.1 versions are simply set to integer values of 1 and 0 and aren't a +different type.) + +The type object for this new type is named :class:`bool`; the constructor for it +takes any Python value and converts it to :const:`True` or :const:`False`. 
:: + + >>> bool(1) + True + >>> bool(0) + False + >>> bool([]) + False + >>> bool( (1,) ) + True + +Most of the standard library modules and built-in functions have been changed to +return Booleans. :: + + >>> obj = [] + >>> hasattr(obj, 'append') + True + >>> isinstance(obj, list) + True + >>> isinstance(obj, tuple) + False + +Python's Booleans were added with the primary goal of making code clearer. For +example, if you're reading a function and encounter the statement ``return 1``, +you might wonder whether the ``1`` represents a Boolean truth value, an index, +or a coefficient that multiplies some other quantity. If the statement is +``return True``, however, the meaning of the return value is quite clear. + +Python's Booleans were *not* added for the sake of strict type-checking. A very +strict language such as Pascal would also prevent you performing arithmetic with +Booleans, and would require that the expression in an :keyword:`if` statement +always evaluate to a Boolean result. Python is not this strict and never will +be, as :pep:`285` explicitly says. This means you can still use any expression +in an :keyword:`if` statement, even ones that evaluate to a list or tuple or +some random object. The Boolean type is a subclass of the :class:`int` class so +that arithmetic using a Boolean still works. :: + + >>> True + 1 + 2 + >>> False + 1 + 1 + >>> False * 75 + 0 + >>> True * 75 + 75 + +To sum up :const:`True` and :const:`False` in a sentence: they're alternative +ways to spell the integer values 1 and 0, with the single difference that +:func:`str` and :func:`repr` return the strings ``'True'`` and ``'False'`` +instead of ``'1'`` and ``'0'``. + + +.. seealso:: + + :pep:`285` - Adding a bool type + Written and implemented by GvR. + +.. 
% ====================================================================== + + +PEP 293: Codec Error Handling Callbacks +======================================= + +When encoding a Unicode string into a byte string, unencodable characters may be +encountered. So far, Python has allowed specifying the error processing as +either "strict" (raising :exc:`UnicodeError`), "ignore" (skipping the +character), or "replace" (using a question mark in the output string), with +"strict" being the default behavior. It may be desirable to specify alternative +processing of such errors, such as inserting an XML character reference or HTML +entity reference into the converted string. + +Python now has a flexible framework to add different processing strategies. New +error handlers can be added with :func:`codecs.register_error`, and codecs then +can access the error handler with :func:`codecs.lookup_error`. An equivalent C +API has been added for codecs written in C. The error handler gets the necessary +state information such as the string being converted, the position in the string +where the error was detected, and the target encoding. The handler can then +either raise an exception or return a replacement string. + +Two additional error handlers have been implemented using this framework: +"backslashreplace" uses Python backslash quoting to represent unencodable +characters and "xmlcharrefreplace" emits XML character references. + + +.. seealso:: + + :pep:`293` - Codec Error Handling Callbacks + Written and implemented by Walter Dörwald. + +.. % ====================================================================== + + +.. _section-pep301: + +PEP 301: Package Index and Metadata for Distutils +================================================= + +Support for the long-requested Python catalog makes its first appearance in 2.3. + +The heart of the catalog is the new Distutils :command:`register` command. 
+Running ``python setup.py register`` will collect the metadata describing a +package, such as its name, version, maintainer, description, &c., and send it to +a central catalog server. The resulting catalog is available from +http://www.python.org/pypi. + +To make the catalog a bit more useful, a new optional *classifiers* keyword +argument has been added to the Distutils :func:`setup` function. A list of +`Trove <http://catb.org/~esr/trove/>`_-style strings can be supplied to help +classify the software. + +Here's an example :file:`setup.py` with classifiers, written to be compatible +with older versions of the Distutils:: + + from distutils import core + kw = {'name': "Quixote", + 'version': "0.5.1", + 'description': "A highly Pythonic Web application framework", + # ... + } + + if (hasattr(core, 'setup_keywords') and + 'classifiers' in core.setup_keywords): + kw['classifiers'] = \ + ['Topic :: Internet :: WWW/HTTP :: Dynamic Content', + 'Environment :: No Input/Output (Daemon)', + 'Intended Audience :: Developers'] + + core.setup(**kw) + +The full list of classifiers can be obtained by running ``python setup.py +register --list-classifiers``. + + +.. seealso:: + + :pep:`301` - Package Index and Metadata for Distutils + Written and implemented by Richard Jones. + +.. % ====================================================================== + + +.. _section-pep302: + +PEP 302: New Import Hooks +========================= + +While it's been possible to write custom import hooks ever since the +:mod:`ihooks` module was introduced in Python 1.3, no one has ever been really +happy with it because writing new import hooks is difficult and messy. There +have been various proposed alternatives such as the :mod:`imputil` and :mod:`iu` +modules, but none of them has ever gained much acceptance, and none of them were +easily usable from C code. + +:pep:`302` borrows ideas from its predecessors, especially from Gordon +McMillan's :mod:`iu` module. 
Three new items are added to the :mod:`sys` +module: + +* ``sys.path_hooks`` is a list of callable objects; most often they'll be + classes. Each callable takes a string containing a path and either returns an + importer object that will handle imports from this path or raises an + :exc:`ImportError` exception if it can't handle this path. + +* ``sys.path_importer_cache`` caches importer objects for each path, so + ``sys.path_hooks`` will only need to be traversed once for each path. + +* ``sys.meta_path`` is a list of importer objects that will be traversed before + ``sys.path`` is checked. This list is initially empty, but user code can add + objects to it. Additional built-in and frozen modules can be imported by an + object added to this list. + +Importer objects must have a single method, :meth:`find_module(fullname, +path=None)`. *fullname* will be a module or package name, e.g. ``string`` or +``distutils.core``. :meth:`find_module` must return a loader object that has a +single method, :meth:`load_module(fullname)`, that creates and returns the +corresponding module object. + +Pseudo-code for Python's new import logic, therefore, looks something like this +(simplified a bit; see :pep:`302` for the full details):: + + for mp in sys.meta_path: + loader = mp.find_module(fullname) + if loader is not None: + <module> = loader.load_module(fullname) + + for path in sys.path: + for hook in sys.path_hooks: + try: + importer = hook(path) + except ImportError: + # ImportError, so try the other path hooks + pass + else: + loader = importer.find_module(fullname) + <module> = loader.load_module(fullname) + + # Not found! + raise ImportError + + +.. seealso:: + + :pep:`302` - New Import Hooks + Written by Just van Rossum and Paul Moore. Implemented by Just van Rossum. + +.. % ====================================================================== + + +.. 
_section-pep305: + +PEP 305: Comma-separated Files +============================== + +Comma-separated files are a format frequently used for exporting data from +databases and spreadsheets. Python 2.3 adds a parser for comma-separated files. + +Comma-separated format is deceptively simple at first glance:: + + Costs,150,200,3.95 + +Read a line and call ``line.split(',')``: what could be simpler? But toss in +string data that can contain commas, and things get more complicated:: + + "Costs",150,200,3.95,"Includes taxes, shipping, and sundry items" + +A big ugly regular expression can parse this, but using the new :mod:`csv` +package is much simpler:: + + import csv + + input = open('datafile', 'rb') + reader = csv.reader(input) + for line in reader: + print line + +The :func:`reader` function takes a number of different options. The field +separator isn't limited to the comma and can be changed to any character, and so +can the quoting and line-ending characters. + +Different dialects of comma-separated files can be defined and registered; +currently there are two dialects, both used by Microsoft Excel. A separate +:class:`csv.writer` class will generate comma-separated files from a succession +of tuples or lists, quoting strings that contain the delimiter. + + +.. seealso:: + + :pep:`305` - CSV File API + Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip + Montanaro, Cliff Wells. + +.. % ====================================================================== + + +.. _section-pep307: + +PEP 307: Pickle Enhancements +============================ + +The :mod:`pickle` and :mod:`cPickle` modules received some attention during the +2.3 development cycle. In 2.2, new-style classes could be pickled without +difficulty, but they weren't pickled very compactly; :pep:`307` quotes a trivial +example where a new-style class results in a pickled string three times longer +than that for a classic class. + +The solution was to invent a new pickle protocol. 
The :func:`pickle.dumps` +function has supported a text-or-binary flag for a long time. In 2.3, this +flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle +format, 1 is the old binary format, and now 2 is a new 2.3-specific format. A +new constant, :const:`pickle.HIGHEST_PROTOCOL`, can be used to select the +fanciest protocol available. + +Unpickling is no longer considered a safe operation. 2.2's :mod:`pickle` +provided hooks for trying to prevent unsafe classes from being unpickled +(specifically, a :attr:`__safe_for_unpickling__` attribute), but none of this +code was ever audited and therefore it's all been ripped out in 2.3. You should +not unpickle untrusted data in any version of Python. + +To reduce the pickling overhead for new-style classes, a new interface for +customizing pickling was added using three special methods: +:meth:`__getstate__`, :meth:`__setstate__`, and :meth:`__getnewargs__`. Consult +:pep:`307` for the full semantics of these methods. + +As a way to compress pickles yet further, it's now possible to use integer codes +instead of long strings to identify pickled classes. The Python Software +Foundation will maintain a list of standardized codes; there's also a range of +codes for private use. Currently no codes have been specified. + + +.. seealso:: + + :pep:`307` - Extensions to the pickle protocol + Written and implemented by Guido van Rossum and Tim Peters. + +.. % ====================================================================== + + +.. _section-slices: + +Extended Slices +=============== + +Ever since Python 1.4, the slicing syntax has supported an optional third "step" +or "stride" argument. For example, these are all legal Python syntax: +``L[1:10:2]``, ``L[:-1:1]``, ``L[::-1]``. This was added to Python at the +request of the developers of Numerical Python, which uses the third argument +extensively. 
However, Python's built-in list, tuple, and string sequence types +have never supported this feature, raising a :exc:`TypeError` if you tried it. +Michael Hudson contributed a patch to fix this shortcoming. + +For example, you can now easily extract the elements of a list that have even +indexes:: + + >>> L = range(10) + >>> L[::2] + [0, 2, 4, 6, 8] + +Negative values also work to make a copy of the same list in reverse order:: + + >>> L[::-1] + [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] + +This also works for tuples, arrays, and strings:: + + >>> s='abcd' + >>> s[::2] + 'ac' + >>> s[::-1] + 'dcba' + +If you have a mutable sequence such as a list or an array you can assign to or +delete an extended slice, but there are some differences between assignment to +extended and regular slices. Assignment to a regular slice can be used to +change the length of the sequence:: + + >>> a = range(3) + >>> a + [0, 1, 2] + >>> a[1:3] = [4, 5, 6] + >>> a + [0, 4, 5, 6] + +Extended slices aren't this flexible. When assigning to an extended slice, the +list on the right hand side of the statement must contain the same number of +items as the slice it is replacing:: + + >>> a = range(4) + >>> a + [0, 1, 2, 3] + >>> a[::2] + [0, 2] + >>> a[::2] = [0, -1] + >>> a + [0, 1, -1, 3] + >>> a[::2] = [0,1,2] + Traceback (most recent call last): + File "<stdin>", line 1, in ? 
+ ValueError: attempt to assign sequence of size 3 to extended slice of size 2 + +Deletion is more straightforward:: + + >>> a = range(4) + >>> a + [0, 1, 2, 3] + >>> a[::2] + [0, 2] + >>> del a[::2] + >>> a + [1, 3] + +One can also now pass slice objects to the :meth:`__getitem__` methods of the +built-in sequences:: + + >>> range(10).__getitem__(slice(0, 5, 2)) + [0, 2, 4] + +Or use slice objects directly in subscripts:: + + >>> range(10)[slice(0, 5, 2)] + [0, 2, 4] + +To simplify implementing sequences that support extended slicing, slice objects +now have a method :meth:`indices(length)` which, given the length of a sequence, +returns a ``(start, stop, step)`` tuple that can be passed directly to +:func:`range`. :meth:`indices` handles omitted and out-of-bounds indices in a +manner consistent with regular slices (and this innocuous phrase hides a welter +of confusing details!). The method is intended to be used like this:: + + class FakeSeq: + ... + def calc_item(self, i): + ... + def __getitem__(self, item): + if isinstance(item, slice): + indices = item.indices(len(self)) + return FakeSeq([self.calc_item(i) for i in range(*indices)]) + else: + return self.calc_item(item) + +From this example you can also see that the built-in :class:`slice` object is +now the type object for the slice type, and is no longer a function. This is +consistent with Python 2.2, where :class:`int`, :class:`str`, etc., underwent +the same change. + +.. % ====================================================================== + + +Other Language Changes +====================== + +Here are all of the changes that Python 2.3 makes to the core Python language. + +* The :keyword:`yield` statement is now always a keyword, as described in + section :ref:`section-generators` of this document. + +* A new built-in function :func:`enumerate` was added, as described in section + :ref:`section-enumerate` of this document. 
+ +* Two new constants, :const:`True` and :const:`False` were added along with the + built-in :class:`bool` type, as described in section :ref:`section-bool` of this + document. + +* The :func:`int` type constructor will now return a long integer instead of + raising an :exc:`OverflowError` when a string or floating-point number is too + large to fit into an integer. This can lead to the paradoxical result that + ``isinstance(int(expression), int)`` is false, but that seems unlikely to cause + problems in practice. + +* Built-in types now support the extended slicing syntax, as described in + section :ref:`section-slices` of this document. + +* A new built-in function, :func:`sum(iterable, start=0)`, adds up the numeric + items in the iterable object and returns their sum. :func:`sum` only accepts + numbers, meaning that you can't use it to concatenate a bunch of strings. + (Contributed by Alex Martelli.) + +* ``list.insert(pos, value)`` used to insert *value* at the front of the list + when *pos* was negative. The behaviour has now been changed to be consistent + with slice indexing, so when *pos* is -1 the value will be inserted before the + last element, and so forth. + +* ``list.index(value)``, which searches for *value* within the list and returns + its index, now takes optional *start* and *stop* arguments to limit the search + to only part of the list. + +* Dictionaries have a new method, :meth:`pop(key[, *default*])`, that returns + the value corresponding to *key* and removes that key/value pair from the + dictionary. If the requested key isn't present in the dictionary, *default* is + returned if it's specified and :exc:`KeyError` raised if it isn't. :: + + >>> d = {1:2} + >>> d + {1: 2} + >>> d.pop(4) + Traceback (most recent call last): + File "stdin", line 1, in ? + KeyError: 4 + >>> d.pop(1) + 2 + >>> d.pop(1) + Traceback (most recent call last): + File "stdin", line 1, in ? 
+ KeyError: 'pop(): dictionary is empty' + >>> d + {} + >>> + + There's also a new class method, :meth:`dict.fromkeys(iterable, value)`, that + creates a dictionary with keys taken from the supplied iterator *iterable* and + all values set to *value*, defaulting to ``None``. + + (Patches contributed by Raymond Hettinger.) + + Also, the :func:`dict` constructor now accepts keyword arguments to simplify + creating small dictionaries:: + + >>> dict(red=1, blue=2, green=3, black=4) + {'blue': 2, 'black': 4, 'green': 3, 'red': 1} + + (Contributed by Just van Rossum.) + +* The :keyword:`assert` statement no longer checks the ``__debug__`` flag, so + you can no longer disable assertions by assigning to ``__debug__``. Running + Python with the :option:`-O` switch will still generate code that doesn't + execute any assertions. + +* Most type objects are now callable, so you can use them to create new objects + such as functions, classes, and modules. (This means that the :mod:`new` module + can be deprecated in a future Python version, because you can now use the type + objects available in the :mod:`types` module.) For example, you can create a new + module object with the following code: + + .. % XXX should new.py use PendingDeprecationWarning? + + :: + + >>> import types + >>> m = types.ModuleType('abc','docstring') + >>> m + <module 'abc' (built-in)> + >>> m.__doc__ + 'docstring' + +* A new warning, :exc:`PendingDeprecationWarning` was added to indicate features + which are in the process of being deprecated. The warning will *not* be printed + by default. To check for use of features that will be deprecated in the future, + supply :option:`-Walways::PendingDeprecationWarning::` on the command line or + use :func:`warnings.filterwarnings`. + +* The process of deprecating string-based exceptions, as in ``raise "Error + occurred"``, has begun. Raising a string will now trigger + :exc:`PendingDeprecationWarning`. 
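As a rough illustration of the two preceding items, the sketch below shows how a :exc:`PendingDeprecationWarning` can be made visible with :func:`warnings.filterwarnings`. The function ``old_api`` is hypothetical, and the example assumes a current Python interpreter: ``warnings.catch_warnings`` is used only to capture the warning for inspection and was not available in Python 2.3.

```python
import warnings


def old_api():
    """Hypothetical function that announces its upcoming deprecation."""
    warnings.warn("old_api() will be deprecated", PendingDeprecationWarning)
    return 42


# catch_warnings is an assumption of this sketch (it postdates Python 2.3);
# it records warnings in a list instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    # Without this filter the warning is normally ignored.
    warnings.filterwarnings("always", category=PendingDeprecationWarning)
    result = old_api()
```

After the block runs, ``caught`` holds the recorded warning, so a test suite could inspect it rather than relying on output to standard error.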
+ +* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning` + warning. In a future version of Python, ``None`` may finally become a keyword. + +* The :meth:`xreadlines` method of file objects, introduced in Python 2.1, is no + longer necessary because files now behave as their own iterator. + :meth:`xreadlines` was originally introduced as a faster way to loop over all + the lines in a file, but now you can simply write ``for line in file_obj``. + File objects also have a new read-only :attr:`encoding` attribute that gives the + encoding used by the file; Unicode strings written to the file will be + automatically converted to bytes using the given encoding. + +* The method resolution order used by new-style classes has changed, though + you'll only notice the difference if you have a really complicated inheritance + hierarchy. Classic classes are unaffected by this change. Python 2.2 + originally used a topological sort of a class's ancestors, but 2.3 now uses the + C3 algorithm as described in the paper `"A Monotonic Superclass Linearization + for Dylan" <http://www.webcom.com/haahr/dylan/linearization-oopsla96.html>`_. To + understand the motivation for this change, read Michele Simionato's article + `"Python 2.3 Method Resolution Order" <http://www.python.org/2.3/mro.html>`_, or + read the thread on python-dev starting with the message at + http://mail.python.org/pipermail/python-dev/2002-October/029035.html. Samuele + Pedroni first pointed out the problem and also implemented the fix by coding the + C3 algorithm. + +* Python runs multithreaded programs by switching between threads after + executing N bytecodes. The default value for N has been increased from 10 to + 100 bytecodes, speeding up single-threaded applications by reducing the + switching overhead. Some multithreaded applications may suffer slower response + time, but that's easily fixed by setting the limit back to a lower number using + :func:`sys.setcheckinterval(N)`. 
The limit can be retrieved with the new + :func:`sys.getcheckinterval` function. + +* One minor but far-reaching change is that the names of extension types defined + by the modules included with Python now contain the module and a ``'.'`` in + front of the type name. For example, in Python 2.2, if you created a socket and + printed its :attr:`__class__`, you'd get this output:: + + >>> s = socket.socket() + >>> s.__class__ + <type 'socket'> + + In 2.3, you get this:: + + >>> s.__class__ + <type '_socket.socket'> + +* One of the noted incompatibilities between old- and new-style classes has been + removed: you can now assign to the :attr:`__name__` and :attr:`__bases__` + attributes of new-style classes. There are some restrictions on what can be + assigned to :attr:`__bases__` along the lines of those relating to assigning to + an instance's :attr:`__class__` attribute. + +.. % ====================================================================== + + +String Changes +-------------- + +* The :keyword:`in` operator now works differently for strings. Previously, when + evaluating ``X in Y`` where *X* and *Y* are strings, *X* could only be a single + character. That's now changed; *X* can be a string of any length, and ``X in Y`` + will return :const:`True` if *X* is a substring of *Y*. If *X* is the empty + string, the result is always :const:`True`. :: + + >>> 'ab' in 'abcd' + True + >>> 'ad' in 'abcd' + False + >>> '' in 'abcd' + True + + Note that this doesn't tell you where the substring starts; if you need that + information, use the :meth:`find` string method. + +* The :meth:`strip`, :meth:`lstrip`, and :meth:`rstrip` string methods now have + an optional argument for specifying the characters to strip. 
The default is + still to remove all whitespace characters:: + + >>> ' abc '.strip() + 'abc' + >>> '><><abc<><><>'.strip('<>') + 'abc' + >>> '><><abc<><><>\n'.strip('<>') + 'abc<><><>\n' + >>> u'\u4000\u4001abc\u4000'.strip(u'\u4000') + u'\u4001abc' + >>> + + (Suggested by Simon Brunning and implemented by Walter Dörwald.) + +* The :meth:`startswith` and :meth:`endswith` string methods now accept negative + numbers for the *start* and *end* parameters. + +* Another new string method is :meth:`zfill`, originally a function in the + :mod:`string` module. :meth:`zfill` pads a numeric string with zeros on the + left until it's the specified width. Note that the ``%`` operator is still more + flexible and powerful than :meth:`zfill`. :: + + >>> '45'.zfill(4) + '0045' + >>> '12345'.zfill(4) + '12345' + >>> 'goofy'.zfill(6) + '0goofy' + + (Contributed by Walter Dörwald.) + +* A new type object, :class:`basestring`, has been added. Both 8-bit strings and + Unicode strings inherit from this type, so ``isinstance(obj, basestring)`` will + return :const:`True` for either kind of string. It's a completely abstract + type, so you can't create :class:`basestring` instances. + +* Interned strings are no longer immortal and will now be garbage-collected in + the usual way when the only reference to them is from the internal dictionary of + interned strings. (Implemented by Oren Tirosh.) + +.. % ====================================================================== + + +Optimizations +------------- + +* The creation of new-style class instances has been made much faster; they're + now faster than classic classes! + +* The :meth:`sort` method of list objects has been extensively rewritten by Tim + Peters, and the implementation is significantly faster. + +* Multiplication of large long integers is now much faster thanks to an + implementation of Karatsuba multiplication, an algorithm that scales better than + the O(n\*n) required for the grade-school multiplication algorithm. 
(Original + patch by Christopher A. Craig, and significantly reworked by Tim Peters.) + +* The ``SET_LINENO`` opcode is now gone. This may provide a small speed + increase, depending on your compiler's idiosyncrasies. See section + :ref:`section-other` for a longer explanation. (Removed by Michael Hudson.) + +* :func:`xrange` objects now have their own iterator, making ``for i in + xrange(n)`` slightly faster than ``for i in range(n)``. (Patch by Raymond + Hettinger.) + +* A number of small rearrangements have been made in various hotspots to improve + performance, such as inlining a function or removing some code. (Implemented + mostly by GvR, but lots of people have contributed single changes.) + +The net result of the 2.3 optimizations is that Python 2.3 runs the pystone +benchmark around 25% faster than Python 2.2. + +.. % ====================================================================== + + +New, Improved, and Deprecated Modules +===================================== + +As usual, Python's standard library received a number of enhancements and bug +fixes. Here's a partial list of the most notable changes, sorted alphabetically +by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more +complete list of changes, or look through the CVS logs for all the details. + +* The :mod:`array` module now supports arrays of Unicode characters using the + ``'u'`` format character. Arrays also now support using the ``+=`` assignment + operator to add another array's contents, and the ``*=`` assignment operator to + repeat an array. (Contributed by Jason Orendorff.) + +* The :mod:`bsddb` module has been replaced by version 4.1.6 of the `PyBSDDB + <http://pybsddb.sourceforge.net>`_ package, providing a more complete interface + to the transactional features of the BerkeleyDB library. + + The old version of the module has been renamed to :mod:`bsddb185` and is no + longer built automatically; you'll have to edit :file:`Modules/Setup` to enable + it. 
Note that the new :mod:`bsddb` package is intended to be compatible with + the old module, so be sure to file bugs if you discover any incompatibilities. + When upgrading to Python 2.3, if the new interpreter is compiled with a new + version of the underlying BerkeleyDB library, you will almost certainly have to + convert your database files to the new version. You can do this fairly easily + with the new scripts :file:`db2pickle.py` and :file:`pickle2db.py` which you + will find in the distribution's :file:`Tools/scripts` directory. If you've + already been using the PyBSDDB package and importing it as :mod:`bsddb3`, you + will have to change your ``import`` statements to import it as :mod:`bsddb`. + +* The new :mod:`bz2` module is an interface to the bz2 data compression library. + bz2-compressed data is usually smaller than corresponding :mod:`zlib`\ + -compressed data. (Contributed by Gustavo Niemeyer.) + +* A set of standard date/time types has been added in the new :mod:`datetime` + module. See the following section for more details. + +* The Distutils :class:`Extension` class now supports an extra constructor + argument named *depends* for listing additional source files that an extension + depends on. This lets Distutils recompile the module if any of the dependency + files are modified. For example, if :file:`sampmodule.c` includes the header + file :file:`sample.h`, you would create the :class:`Extension` object like + this:: + + ext = Extension("samp", + sources=["sampmodule.c"], + depends=["sample.h"]) + + Modifying :file:`sample.h` would then cause the module to be recompiled. + (Contributed by Jeremy Hylton.) + +* Other minor changes to Distutils: it now checks for the :envvar:`CC`, + :envvar:`CFLAGS`, :envvar:`CPP`, :envvar:`LDFLAGS`, and :envvar:`CPPFLAGS` + environment variables, using them to override the settings in Python's + configuration (contributed by Robert Weber). 
+ +* Previously the :mod:`doctest` module would only search the docstrings of + public methods and functions for test cases, but it now examines private + ones as well. The :func:`DocTestSuite` function creates a + :class:`unittest.TestSuite` object from a set of :mod:`doctest` tests. + +* The new :func:`gc.get_referents(object)` function returns a list of all the + objects referenced by *object*. + +* The :mod:`getopt` module gained a new function, :func:`gnu_getopt`, that + supports the same arguments as the existing :func:`getopt` function but uses + GNU-style scanning mode. The existing :func:`getopt` stops processing options as + soon as a non-option argument is encountered, but in GNU-style mode processing + continues, meaning that options and arguments can be mixed. For example:: + + >>> import getopt + >>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v') + ([('-f', 'filename')], ['output', '-v']) + >>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v') + ([('-f', 'filename'), ('-v', '')], ['output']) + + (Contributed by Peter Åstrand.) + +* The :mod:`grp`, :mod:`pwd`, and :mod:`resource` modules now return enhanced + tuples:: + + >>> import grp + >>> g = grp.getgrnam('amk') + >>> g.gr_name, g.gr_gid + ('amk', 500) + +* The :mod:`gzip` module can now handle files exceeding 2 GiB. + +* The new :mod:`heapq` module contains an implementation of a heap queue + algorithm. A heap is an array-like data structure that keeps items in a + partially sorted order such that, for every index *k*, ``heap[k] <= + heap[2*k+1]`` and ``heap[k] <= heap[2*k+2]``. This makes it quick to remove the + smallest item, and inserting a new item while maintaining the heap property is + O(lg n). (See http://www.nist.gov/dads/HTML/priorityque.html for more + information about the priority queue data structure.) 
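The heap invariant quoted above can be checked directly. The following is a minimal sketch; the helper name ``is_heap`` is made up for illustration and is not part of the :mod:`heapq` module.

```python
def is_heap(seq):
    """Return True if seq satisfies heap[k] <= heap[2*k+1] and heap[2*k+2]."""
    n = len(seq)
    for k in range(n):
        # Only compare against children that actually exist in the list.
        for child in (2 * k + 1, 2 * k + 2):
            if child < n and seq[k] > seq[child]:
                return False
    return True


heap = [1, 3, 5, 11, 7]   # obeys the invariant without being fully sorted
not_heap = [3, 1, 5]      # the root is larger than its first child
```

Here ``is_heap(heap)`` is true even though the list is not sorted, which is what "partially sorted" means in this context; the smallest item is always at index 0.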
+ + The :mod:`heapq` module provides :func:`heappush` and :func:`heappop` functions + for adding and removing items while maintaining the heap property on top of some + other mutable Python sequence type. Here's an example that uses a Python list:: + + >>> import heapq + >>> heap = [] + >>> for item in [3, 7, 5, 11, 1]: + ... heapq.heappush(heap, item) + ... + >>> heap + [1, 3, 5, 11, 7] + >>> heapq.heappop(heap) + 1 + >>> heapq.heappop(heap) + 3 + >>> heap + [5, 7, 11] + + (Contributed by Kevin O'Connor.) + +* The IDLE integrated development environment has been updated using the code + from the IDLEfork project (http://idlefork.sf.net). The most notable feature is + that the code being developed is now executed in a subprocess, meaning that + there's no longer any need for manual ``reload()`` operations. IDLE's core code + has been incorporated into the standard library as the :mod:`idlelib` package. + +* The :mod:`imaplib` module now supports IMAP over SSL. (Contributed by Piers + Lauder and Tino Lange.) + +* The :mod:`itertools` module contains a number of useful functions for use with + iterators, inspired by various functions provided by the ML and Haskell + languages. For example, ``itertools.ifilter(predicate, iterator)`` returns all + elements in the iterator for which the function :func:`predicate` returns + :const:`True`, and ``itertools.repeat(obj, N)`` returns ``obj`` *N* times. + There are a number of other functions in the module; see the module's reference + documentation for details. + (Contributed by Raymond Hettinger.) + +* Two new functions in the :mod:`math` module, :func:`degrees(rads)` and + :func:`radians(degs)`, convert between radians and degrees. Other functions in + the :mod:`math` module such as :func:`math.sin` and :func:`math.cos` have always + required input values measured in radians. Also, an optional *base* argument + was added to :func:`math.log` to make it easier to compute logarithms for bases + other than ``e`` and ``10``. 
(Contributed by Raymond Hettinger.) + +* Several new POSIX functions (:func:`getpgid`, :func:`killpg`, :func:`lchown`, + :func:`getloadavg`, :func:`major`, :func:`makedev`, :func:`minor`, and + :func:`mknod`) were added to the :mod:`posix` module that underlies the + :mod:`os` module. (Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S. + Otkidach.) + +* In the :mod:`os` module, the :func:`\*stat` family of functions can now report + fractions of a second in a timestamp. Such time stamps are represented as + floats, similar to the value returned by :func:`time.time`. + + During testing, it was found that some applications will break if time stamps + are floats. For compatibility, when using the tuple interface of the + :class:`stat_result`, time stamps will be represented as integers. When using + named fields (a feature first introduced in Python 2.2), time stamps are still + represented as integers, unless :func:`os.stat_float_times` is invoked to enable + float return values:: + + >>> import os + >>> os.stat("/tmp").st_mtime + 1034791200 + >>> os.stat_float_times(True) + >>> os.stat("/tmp").st_mtime + 1034791200.6335014 + + In Python 2.4, the default will change to always returning floats. + + Application developers should enable this feature only if all their libraries + work properly when confronted with floating point time stamps, or if they use + the tuple API. If used, the feature should be activated on an application level + instead of trying to enable it on a per-use basis. + +* The :mod:`optparse` module contains a new parser for command-line arguments + that can convert option values to a particular Python type and will + automatically generate a usage message. See the following section for more + details. + +* The old and never-documented :mod:`linuxaudiodev` module has been deprecated, + and a new version named :mod:`ossaudiodev` has been added. 
The module was + renamed because the OSS sound drivers can be used on platforms other than Linux, + and the interface has also been tidied and brought up to date in various ways. + (Contributed by Greg Ward and Nicholas FitzRoy-Dale.) + +* The new :mod:`platform` module contains a number of functions that try to + determine various properties of the platform you're running on. There are + functions for getting the architecture, CPU type, the Windows OS version, and + even the Linux distribution version. (Contributed by Marc-André Lemburg.) + +* The parser objects provided by the :mod:`pyexpat` module can now optionally + buffer character data, resulting in fewer calls to your character data handler + and therefore faster performance. Setting the parser object's + :attr:`buffer_text` attribute to :const:`True` will enable buffering. + +* The :func:`sample(population, k)` function was added to the :mod:`random` + module. *population* is a sequence or :class:`xrange` object containing the + elements of a population, and :func:`sample` chooses *k* elements from the + population without replacing chosen elements. *k* can be any value up to + ``len(population)``. For example:: + + >>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn'] + >>> random.sample(days, 3) # Choose 3 elements + ['St', 'Sn', 'Th'] + >>> random.sample(days, 7) # Choose 7 elements + ['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn'] + >>> random.sample(days, 7) # Choose 7 again + ['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th'] + >>> random.sample(days, 8) # Can't choose eight + Traceback (most recent call last): + File "<stdin>", line 1, in ? + File "random.py", line 414, in sample + raise ValueError, "sample larger than population" + ValueError: sample larger than population + >>> random.sample(xrange(1,10000,2), 10) # Choose ten odd nos. under 10000 + [3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195] + + The :mod:`random` module now uses a new algorithm, the Mersenne Twister, + implemented in C. 
It's faster and more extensively studied than the previous + algorithm. + + (All changes contributed by Raymond Hettinger.) + +* The :mod:`readline` module also gained a number of new functions: + :func:`get_history_item`, :func:`get_current_history_length`, and + :func:`redisplay`. + +* The :mod:`rexec` and :mod:`Bastion` modules have been declared dead, and + attempts to import them will fail with a :exc:`RuntimeError`. New-style classes + provide new ways to break out of the restricted execution environment provided + by :mod:`rexec`, and no one has interest in fixing them or time to do so. If + you have applications using :mod:`rexec`, rewrite them to use something else. + + (Sticking with Python 2.2 or 2.1 will not make your applications any safer + because there are known bugs in the :mod:`rexec` module in those versions. To + repeat: if you're using :mod:`rexec`, stop using it immediately.) + +* The :mod:`rotor` module has been deprecated because the algorithm it uses for + encryption is not believed to be secure. If you need encryption, use one of the + several AES Python modules that are available separately. + +* The :mod:`shutil` module gained a :func:`move(src, dest)` function that + recursively moves a file or directory to a new location. + +* Support for more advanced POSIX signal handling was added to the :mod:`signal` + module but then removed again as it proved impossible to make it work reliably + across platforms. + +* The :mod:`socket` module now supports timeouts. You can call the + :meth:`settimeout(t)` method on a socket object to set a timeout of *t* seconds. + Subsequent socket operations that take longer than *t* seconds to complete will + abort and raise a :exc:`socket.timeout` exception. + + The original timeout implementation was by Tim O'Malley. Michael Gilfix + integrated it into the Python :mod:`socket` module and shepherded it through a + lengthy review. After the code was checked in, Guido van Rossum rewrote parts + of it. 
(This is a good example of a collaborative development process in + action.) + +* On Windows, the :mod:`socket` module now ships with Secure Sockets Layer + (SSL) support. + +* The value of the C :const:`PYTHON_API_VERSION` macro is now exposed at the + Python level as ``sys.api_version``. The current exception can be cleared by + calling the new :func:`sys.exc_clear` function. + +* The new :mod:`tarfile` module allows reading from and writing to + :program:`tar`\ -format archive files. (Contributed by Lars Gustäbel.) + +* The new :mod:`textwrap` module contains functions for wrapping strings + containing paragraphs of text. The :func:`wrap(text, width)` function takes a + string and returns a list containing the text split into lines of no more than + the chosen width. The :func:`fill(text, width)` function returns a single + string, reformatted to fit into lines no longer than the chosen width. (As you + can guess, :func:`fill` is built on top of :func:`wrap`.) For example:: + + >>> import textwrap + >>> paragraph = "Not a whit, we defy augury: ... more text ..." + >>> textwrap.wrap(paragraph, 60) + ["Not a whit, we defy augury: there's a special providence in", + "the fall of a sparrow. If it be now, 'tis not to come; if it", + ...] + >>> print textwrap.fill(paragraph, 35) + Not a whit, we defy augury: there's + a special providence in the fall of + a sparrow. If it be now, 'tis not + to come; if it be not to come, it + will be now; if it be not now, yet + it will come: the readiness is all. + >>> + + The module also contains a :class:`TextWrapper` class that actually implements + the text wrapping strategy. Both the :class:`TextWrapper` class and the + :func:`wrap` and :func:`fill` functions support a number of additional keyword + arguments for fine-tuning the formatting; consult the module's documentation + for details. (Contributed by Greg Ward.) 
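To give a flavour of the keyword arguments mentioned in the :mod:`textwrap` item above, here is a small sketch that uses the :class:`TextWrapper` class with ``initial_indent`` and ``subsequent_indent`` to format a bullet-style paragraph. The sample text and the chosen width are arbitrary assumptions.

```python
import textwrap

# Arbitrary sample text, chosen only for this illustration.
text = ("The quick brown fox jumps over the lazy dog, "
        "then rests in the shade for the afternoon.")

# The first line gets a bullet marker; later lines are indented to match.
wrapper = textwrap.TextWrapper(width=30,
                               initial_indent="* ",
                               subsequent_indent="  ")

lines = wrapper.wrap(text)    # list of lines, none longer than 30 characters
filled = wrapper.fill(text)   # the same lines joined with newlines
```

The indentation counts toward the width, so every element of ``lines`` stays within 30 characters including its prefix.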
+ +* The :mod:`thread` and :mod:`threading` modules now have companion modules, + :mod:`dummy_thread` and :mod:`dummy_threading`, that provide a do-nothing + implementation of the :mod:`thread` module's interface for platforms where + threads are not supported. The intention is to simplify thread-aware modules + (ones that *don't* rely on threads to run) by putting the following code at the + top:: + + try: + import threading as _threading + except ImportError: + import dummy_threading as _threading + + In this example, :mod:`_threading` is used as the module name to make it clear + that the module being used is not necessarily the actual :mod:`threading` + module. Code can call functions and use classes in :mod:`_threading` whether or + not threads are supported, avoiding an :keyword:`if` statement and making the + code slightly clearer. This module will not magically make multithreaded code + run without threads; code that waits for another thread to return or to do + something will simply hang forever. + +* The :mod:`time` module's :func:`strptime` function has long been an annoyance + because it uses the platform C library's :func:`strptime` implementation, and + different platforms sometimes have odd bugs. Brett Cannon contributed a + portable implementation that's written in pure Python and should behave + identically on all platforms. + +* The new :mod:`timeit` module helps measure how long snippets of Python code + take to execute. The :file:`timeit.py` file can be run directly from the + command line, or the module's :class:`Timer` class can be imported and used + directly. 
Here's a short example that figures out whether it's faster to + convert an 8-bit string to Unicode by appending an empty Unicode string to it or + by using the :func:`unicode` function:: + + import timeit + + timer1 = timeit.Timer('unicode("abc")') + timer2 = timeit.Timer('"abc" + u""') + + # Run three trials + print timer1.repeat(repeat=3, number=100000) + print timer2.repeat(repeat=3, number=100000) + + # On my laptop this outputs: + # [0.36831796169281006, 0.37441694736480713, 0.35304892063140869] + # [0.17574405670166016, 0.18193507194519043, 0.17565798759460449] + +* The :mod:`Tix` module has received various bug fixes and updates for the + current version of the Tix package. + +* The :mod:`Tkinter` module now works with a thread-enabled version of Tcl. + Tcl's threading model requires that widgets only be accessed from the thread in + which they're created; accesses from another thread can cause Tcl to panic. For + certain Tcl interfaces, :mod:`Tkinter` will now automatically avoid this when a + widget is accessed from a different thread by marshalling a command, passing it + to the correct thread, and waiting for the results. Other interfaces can't be + handled automatically but :mod:`Tkinter` will now raise an exception on such an + access so that you can at least find out about the problem. See + http://mail.python.org/pipermail/python-dev/2002-December/031107.html for a more + detailed explanation of this change. (Implemented by Martin von Löwis.) + + .. % + +* Calling Tcl methods through :mod:`_tkinter` no longer returns only strings. + Instead, if Tcl returns other objects those objects are converted to their + Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj` + object if no Python equivalent exists. This behavior can be controlled through + the :meth:`wantobjects` method of :class:`tkapp` objects. 
+ + When using :mod:`_tkinter` through the :mod:`Tkinter` module (as most Tkinter + applications will), this feature is always activated. It should not cause + compatibility problems, since Tkinter would always convert string results to + Python types where possible. + + If any incompatibilities are found, the old behavior can be restored by setting + the :attr:`wantobjects` variable in the :mod:`Tkinter` module to false before + creating the first :class:`tkapp` object. :: + + import Tkinter + Tkinter.wantobjects = 0 + + Any breakage caused by this change should be reported as a bug. + +* The :mod:`UserDict` module has a new :class:`DictMixin` class which defines + all dictionary methods for classes that already have a minimum mapping + interface. This greatly simplifies writing classes that need to be + substitutable for dictionaries, such as the classes in the :mod:`shelve` + module. + + Adding the mix-in as a superclass provides the full dictionary interface + whenever the class defines :meth:`__getitem__`, :meth:`__setitem__`, + :meth:`__delitem__`, and :meth:`keys`. For example:: + + >>> import UserDict + >>> class SeqDict(UserDict.DictMixin): + ... """Dictionary lookalike implemented with lists.""" + ... def __init__(self): + ... self.keylist = [] + ... self.valuelist = [] + ... def __getitem__(self, key): + ... try: + ... i = self.keylist.index(key) + ... except ValueError: + ... raise KeyError + ... return self.valuelist[i] + ... def __setitem__(self, key, value): + ... try: + ... i = self.keylist.index(key) + ... self.valuelist[i] = value + ... except ValueError: + ... self.keylist.append(key) + ... self.valuelist.append(value) + ... def __delitem__(self, key): + ... try: + ... i = self.keylist.index(key) + ... except ValueError: + ... raise KeyError + ... self.keylist.pop(i) + ... self.valuelist.pop(i) + ... def keys(self): + ... return list(self.keylist) + ... 
+ >>> s = SeqDict() + >>> dir(s) # See that other dictionary methods are implemented + ['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__', + '__init__', '__iter__', '__len__', '__module__', '__repr__', + '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems', + 'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem', + 'setdefault', 'update', 'valuelist', 'values'] + + (Contributed by Raymond Hettinger.) + +* The DOM implementation in :mod:`xml.dom.minidom` can now generate XML output + in a particular encoding by providing an optional encoding argument to the + :meth:`toxml` and :meth:`toprettyxml` methods of DOM nodes. + +* The :mod:`xmlrpclib` module now supports an XML-RPC extension for handling nil + data values such as Python's ``None``. Nil values are always supported on + unmarshalling an XML-RPC response. To generate requests containing ``None``, + you must supply a true value for the *allow_none* parameter when creating a + :class:`Marshaller` instance. + +* The new :mod:`DocXMLRPCServer` module allows writing self-documenting XML-RPC + servers. Run it in demo mode (as a program) to see it in action. Pointing the + Web browser to the RPC server produces pydoc-style documentation; pointing + xmlrpclib to the server allows invoking the actual methods. (Contributed by + Brian Quinlan.) + +* Support for internationalized domain names (RFCs 3454, 3490, 3491, and 3492) + has been added. The "idna" encoding can be used to convert between a Unicode + domain name and the ASCII-compatible encoding (ACE) of that name. :: + + >>> u"www.Alliancefrançaise.nu".encode("idna") + 'www.xn--alliancefranaise-npb.nu' + + The :mod:`socket` module has also been extended to transparently convert + Unicode hostnames to the ACE version before passing them to the C library. 
+ Modules that deal with hostnames (such as :mod:`httplib` and :mod:`ftplib`) + also support Unicode host names; :mod:`httplib` also sends HTTP ``Host`` + headers using the ACE version of the domain name. :mod:`urllib` supports + Unicode URLs with non-ASCII host names as long as the ``path`` part of the URL + is ASCII only. + + To implement this change, the :mod:`stringprep` module, the ``mkstringprep`` + tool and the ``punycode`` encoding have been added. + +.. % ====================================================================== + + +Date/Time Type +-------------- + +Date and time types suitable for expressing timestamps were added as the +:mod:`datetime` module. The types don't support different calendars or many +fancy features, and just stick to the basics of representing time. + +The three primary types are: :class:`date`, representing a day, month, and year; +:class:`time`, consisting of hour, minute, and second; and :class:`datetime`, +which contains all the attributes of both :class:`date` and :class:`time`. +There's also a :class:`timedelta` class representing differences between two +points in time, and time zone logic is implemented by classes inheriting from +the abstract :class:`tzinfo` class. + +You can create instances of :class:`date` and :class:`time` by either supplying +keyword arguments to the appropriate constructor, e.g. +``datetime.date(year=1972, month=10, day=15)``, or by using one of a number of +class methods. For example, the :meth:`date.today` class method returns the +current local date. + +Once created, instances of the date/time classes are all immutable. 
There are a +number of methods for producing formatted strings from objects:: + + >>> import datetime + >>> now = datetime.datetime.now() + >>> now.isoformat() + '2002-12-30T21:27:03.994956' + >>> now.ctime() # Only available on date, datetime + 'Mon Dec 30 21:27:03 2002' + >>> now.strftime('%Y %d %b') + '2002 30 Dec' + +The :meth:`replace` method allows modifying one or more fields of a +:class:`date` or :class:`datetime` instance, returning a new instance:: + + >>> d = datetime.datetime.now() + >>> d + datetime.datetime(2002, 12, 30, 22, 15, 38, 827738) + >>> d.replace(year=2001, hour = 12) + datetime.datetime(2001, 12, 30, 12, 15, 38, 827738) + >>> + +Instances can be compared, hashed, and converted to strings (the result is the +same as that of :meth:`isoformat`). :class:`date` and :class:`datetime` +instances can be subtracted from each other, and added to :class:`timedelta` +instances. The largest missing feature is that there's no standard library +support for parsing strings and getting back a :class:`date` or +:class:`datetime`. + +For more information, refer to the module's reference documentation. +(Contributed by Tim Peters.) + +.. % ====================================================================== + + +The optparse Module +------------------- + +The :mod:`getopt` module provides simple parsing of command-line arguments. The +new :mod:`optparse` module (originally named Optik) provides more elaborate +command-line parsing that follows the Unix conventions, automatically creates +the output for :option:`--help`, and can perform different actions for different +options. + +You start by creating an instance of :class:`OptionParser` and telling it what +your program's options are. 
:: + + import sys + from optparse import OptionParser + + op = OptionParser() + op.add_option('-i', '--input', + action='store', type='string', dest='input', + help='set input filename') + op.add_option('-l', '--length', + action='store', type='int', dest='length', + help='set maximum length of output') + +Parsing a command line is then done by calling the :meth:`parse_args` method. :: + + options, args = op.parse_args(sys.argv[1:]) + print options + print args + +This returns an object containing all of the option values, and a list of +strings containing the remaining arguments. + +Invoking the script with the various arguments now works as you'd expect it to. +Note that the length argument is automatically converted to an integer. :: + + $ ./python opt.py -i data arg1 + <Values at 0x400cad4c: {'input': 'data', 'length': None}> + ['arg1'] + $ ./python opt.py --input=data --length=4 + <Values at 0x400cad2c: {'input': 'data', 'length': 4}> + [] + $ + +The help message is automatically generated for you:: + + $ ./python opt.py --help + usage: opt.py [options] + + options: + -h, --help show this help message and exit + -iINPUT, --input=INPUT + set input filename + -lLENGTH, --length=LENGTH + set maximum length of output + $ + +See the module's documentation for more details. + + +Optik was written by Greg Ward, with suggestions from the readers of the Getopt +SIG. + +.. % ====================================================================== + + +.. _section-pymalloc: + +Pymalloc: A Specialized Object Allocator +======================================== + +Pymalloc, a specialized object allocator written by Vladimir Marangozov, was a +feature added to Python 2.1. Pymalloc is intended to be faster than the system +:cfunc:`malloc` and to have less memory overhead for allocation patterns typical +of Python programs. The allocator uses C's :cfunc:`malloc` function to get large +pools of memory and then fulfills smaller memory requests from these pools. 
+ +In 2.1 and 2.2, pymalloc was an experimental feature and wasn't enabled by +default; you had to explicitly enable it when compiling Python by providing the +:option:`--with-pymalloc` option to the :program:`configure` script. In 2.3, +pymalloc has had further enhancements and is now enabled by default; you'll have +to supply :option:`--without-pymalloc` to disable it. + +This change is transparent to code written in Python; however, pymalloc may +expose bugs in C extensions. Authors of C extension modules should test their +code with pymalloc enabled, because some incorrect code may cause core dumps at +runtime. + +There's one particularly common error that causes problems. There are a number +of memory allocation functions in Python's C API that have previously just been +aliases for the C library's :cfunc:`malloc` and :cfunc:`free`, meaning that if +you accidentally called mismatched functions the error wouldn't be noticeable. +When the object allocator is enabled, these functions aren't aliases of +:cfunc:`malloc` and :cfunc:`free` any more, and calling the wrong function to +free memory may get you a core dump. For example, if memory was allocated using +:cfunc:`PyObject_Malloc`, it has to be freed using :cfunc:`PyObject_Free`, not +:cfunc:`free`. A few modules included with Python fell afoul of this and had to +be fixed; doubtless there are more third-party modules that will have the same +problem. + +As part of this change, the confusing multiple interfaces for allocating memory +have been consolidated down into two API families. Memory allocated with one +family must not be manipulated with functions from the other family. There is +one family for allocating chunks of memory and another family of functions +specifically for allocating Python objects. + +* To allocate and free an undistinguished chunk of memory use the "raw memory" + family: :cfunc:`PyMem_Malloc`, :cfunc:`PyMem_Realloc`, and :cfunc:`PyMem_Free`. 
+ +* The "object memory" family is the interface to the pymalloc facility described + above and is biased towards a large number of "small" allocations: + :cfunc:`PyObject_Malloc`, :cfunc:`PyObject_Realloc`, and :cfunc:`PyObject_Free`. + +* To allocate and free Python objects, use the "object" family + :cfunc:`PyObject_New`, :cfunc:`PyObject_NewVar`, and :cfunc:`PyObject_Del`. + +Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides debugging +features to catch memory overwrites and doubled frees in both extension modules +and in the interpreter itself. To enable this support, compile a debugging +version of the Python interpreter by running :program:`configure` with +:option:`--with-pydebug`. + +To aid extension writers, a header file :file:`Misc/pymemcompat.h` is +distributed with the source to Python 2.3 that allows Python extensions to use +the 2.3 interfaces to memory allocation while compiling against any version of +Python since 1.5.2. You would copy the file from Python's source distribution +and bundle it with the source of your extension. + + +.. seealso:: + + http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/obmalloc.c + For the full details of the pymalloc implementation, see the comments at the top + of the file :file:`Objects/obmalloc.c` in the Python source code. The above + link points to the file within the SourceForge CVS browser. + +.. % ====================================================================== + + +Build and C API Changes +======================= + +Changes to Python's build process and to the C API include: + +* The cycle detection implementation used by the garbage collection has proven + to be stable, so it's now been made mandatory. You can no longer compile Python + without it, and the :option:`--with-cycle-gc` switch to :program:`configure` has + been removed. 
+ +* Python can now optionally be built as a shared library + (:file:`libpython2.3.so`) by supplying :option:`--enable-shared` when running + Python's :program:`configure` script. (Contributed by Ondrej Palkovsky.) + +* The :cmacro:`DL_EXPORT` and :cmacro:`DL_IMPORT` macros are now deprecated. + Initialization functions for Python extension modules should now be declared + using the new macro :cmacro:`PyMODINIT_FUNC`, while the Python core will + generally use the :cmacro:`PyAPI_FUNC` and :cmacro:`PyAPI_DATA` macros. + +* The interpreter can be compiled without any docstrings for the built-in + functions and modules by supplying :option:`--without-doc-strings` to the + :program:`configure` script. This makes the Python executable about 10% smaller, + but will also mean that you can't get help for Python's built-ins. (Contributed + by Gustavo Niemeyer.) + +* The :cfunc:`PyArg_NoArgs` macro is now deprecated, and code that uses it + should be changed. For Python 2.2 and later, the method definition table can + specify the :const:`METH_NOARGS` flag, signalling that there are no arguments, + and the argument checking can then be removed. If compatibility with pre-2.2 + versions of Python is important, the code could use ``PyArg_ParseTuple(args, + "")`` instead, but this will be slower than using :const:`METH_NOARGS`. + +* :cfunc:`PyArg_ParseTuple` accepts new format characters for various sizes of + unsigned integers: ``B`` for :ctype:`unsigned char`, ``H`` for :ctype:`unsigned + short int`, ``I`` for :ctype:`unsigned int`, and ``K`` for :ctype:`unsigned + long long`. + +* A new function, :cfunc:`PyObject_DelItemString(mapping, char \*key)`, was added + as shorthand for ``PyObject_DelItem(mapping, PyString_FromString(key))``. + +* File objects now manage their internal string buffer differently, increasing + it exponentially when needed.
This results in the benchmark tests in + :file:`Lib/test/test_bufio.py` speeding up considerably (from 57 seconds to 1.7 + seconds, according to one measurement). + +* It's now possible to define class and static methods for a C extension type by + setting either the :const:`METH_CLASS` or :const:`METH_STATIC` flags in a + method's :ctype:`PyMethodDef` structure. + +* Python now includes a copy of the Expat XML parser's source code, removing any + dependence on a system version or local installation of Expat. + +* If you dynamically allocate type objects in your extension, you should be + aware of a change in the rules relating to the :attr:`__module__` and + :attr:`__name__` attributes. In summary, you will want to ensure the type's + dictionary contains a ``'__module__'`` key; making the module name the part of + the type name leading up to the final period will no longer have the desired + effect. For more detail, read the API reference documentation or the source. + +.. % ====================================================================== + + +Port-Specific Changes +--------------------- + +Support for a port to IBM's OS/2 using the EMX runtime environment was merged +into the main Python source tree. EMX is a POSIX emulation layer over the OS/2 +system APIs. The Python port for EMX tries to support all the POSIX-like +capability exposed by the EMX runtime, and mostly succeeds; :func:`fork` and +:func:`fcntl` are restricted by the limitations of the underlying emulation +layer. The standard OS/2 port, which uses IBM's Visual Age compiler, also +gained support for case-sensitive import semantics as part of the integration of +the EMX port into CVS. (Contributed by Andrew MacIntyre.) + +On MacOS, most toolbox modules have been weaklinked to improve backward +compatibility. This means that modules will no longer fail to load if a single +routine is missing on the current OS version. Instead calling the missing +routine will raise an exception. 
(Contributed by Jack Jansen.) + +The RPM spec files, found in the :file:`Misc/RPM/` directory in the Python +source distribution, were updated for 2.3. (Contributed by Sean Reifschneider.) + +Other new platforms now supported by Python include AtheOS +(http://www.atheos.cx/), GNU/Hurd, and OpenVMS. + +.. % ====================================================================== + + +.. _section-other: + +Other Changes and Fixes +======================= + +As usual, there were a bunch of other improvements and bugfixes scattered +throughout the source tree. A search through the CVS change logs finds there +were 523 patches applied and 514 bugs fixed between Python 2.2 and 2.3. Both +figures are likely to be underestimates. + +Some of the more notable changes are: + +* If the :envvar:`PYTHONINSPECT` environment variable is set, the Python + interpreter will enter the interactive prompt after running a Python program, as + if Python had been invoked with the :option:`-i` option. The environment + variable can be set before running the Python interpreter, or it can be set by + the Python program as part of its execution. + +* The :file:`regrtest.py` script now provides a way to allow "all resources + except *foo*." A resource name passed to the :option:`-u` option can now be + prefixed with a hyphen (``'-'``) to mean "remove this resource." For example, + the option '``-uall,-bsddb``' could be used to enable the use of all resources + except ``bsddb``. + +* The tools used to build the documentation now work under Cygwin as well as + Unix. + +* The ``SET_LINENO`` opcode has been removed. Back in the mists of time, this + opcode was needed to produce line numbers in tracebacks and support trace + functions (for, e.g., :mod:`pdb`). Since Python 1.5, the line numbers in + tracebacks have been computed using a different mechanism that works with + "python -O". 
For Python 2.3 Michael Hudson implemented a similar scheme to + determine when to call the trace function, removing the need for ``SET_LINENO`` + entirely. + + It would be difficult to detect any resulting difference from Python code, apart + from a slight speed up when Python is run without :option:`-O`. + + C extensions that access the :attr:`f_lineno` field of frame objects should + instead call ``PyCode_Addr2Line(f->f_code, f->f_lasti)``. This will have the + added effect of making the code work as desired under "python -O" in earlier + versions of Python. + + A nifty new feature is that trace functions can now assign to the + :attr:`f_lineno` attribute of frame objects, changing the line that will be + executed next. A ``jump`` command has been added to the :mod:`pdb` debugger + taking advantage of this new feature. (Implemented by Richie Hindle.) + +.. % ====================================================================== + + +Porting to Python 2.3 +===================== + +This section lists previously described changes that may require changes to your +code: + +* :keyword:`yield` is now always a keyword; if it's used as a variable name in + your code, a different name must be chosen. + +* For strings *X* and *Y*, ``X in Y`` now works if *X* is more than one + character long. + +* The :func:`int` type constructor will now return a long integer instead of + raising an :exc:`OverflowError` when a string or floating-point number is too + large to fit into an integer. + +* If you have Unicode strings that contain 8-bit characters, you must declare + the file's encoding (UTF-8, Latin-1, or whatever) by adding a comment to the top + of the file. See section :ref:`section-encodings` for more information. + +* Calling Tcl methods through :mod:`_tkinter` no longer returns only strings. 
+ Instead, if Tcl returns other objects those objects are converted to their + Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj` + object if no Python equivalent exists. + +* Large octal and hex literals such as ``0xffffffff`` now trigger a + :exc:`FutureWarning`. Currently they're stored as 32-bit numbers and result in a + negative value, but in Python 2.4 they'll become positive long integers. + + There are a few ways to fix this warning. If you really need a positive number, + just add an ``L`` to the end of the literal. If you're trying to get a 32-bit + integer with low bits set and have previously used an expression such as ``~(1 + << 31)``, it's probably clearest to start with all bits set and clear the + desired upper bits. For example, to clear just the top bit (bit 31), you could + write ``0xffffffffL &~(1L<<31)``. + + .. % The empty groups below prevent conversion to guillemets. + +* You can no longer disable assertions by assigning to ``__debug__``. + +* The Distutils :func:`setup` function has gained various new keyword arguments + such as *depends*. Old versions of the Distutils will abort if passed unknown + keywords. A solution is to check for the presence of the new + :func:`get_distutil_options` function in your :file:`setup.py` and only use the + new keywords with a version of the Distutils that supports them:: + + from distutils import core + + kw = {'sources': ['foo.c'], ...} + if hasattr(core, 'get_distutil_options'): + kw['depends'] = ['foo.h'] + ext = core.Extension(**kw) + +* Using ``None`` as a variable name will now result in a + :exc:`SyntaxWarning`. + +* Names of extension types defined by the modules included with Python now + contain the module and a ``'.'`` in front of the type name. + +.. % ====================================================================== + + +..
_acks: + +Acknowledgements +================ + +The author would like to thank the following people for offering suggestions, +corrections and assistance with various drafts of this article: Jeff Bauer, +Simon Brunning, Brett Cannon, Michael Chermside, Andrew Dalke, Scott David +Daniels, Fred L. Drake, Jr., David Fraser, Kelly Gerber, Raymond Hettinger, +Michael Hudson, Chris Lambert, Detlef Lannert, Martin von Löwis, Andrew +MacIntyre, Lalo Martins, Chad Netzer, Gustavo Niemeyer, Neal Norwitz, Hans +Nowak, Chris Reedy, Francesco Ricciardi, Vinay Sajip, Neil Schemenauer, Roman +Suzi, Jason Tishler, Just van Rossum. + diff --git a/Doc/whatsnew/2.4.rst b/Doc/whatsnew/2.4.rst new file mode 100644 index 0000000..d782f5d --- /dev/null +++ b/Doc/whatsnew/2.4.rst @@ -0,0 +1,1571 @@ +**************************** + What's New in Python 2.4 +**************************** + +:Author: A.M. Kuchling + +.. |release| replace:: 1.02 + +.. % $Id: whatsnew24.tex 55005 2007-04-27 19:54:29Z guido.van.rossum $ +.. % Don't write extensive text for new sections; I'll do that. +.. % Feel free to add commented-out reminders of things that need +.. % to be covered. --amk + +This article explains the new features in Python 2.4.1, released on March 30, +2005. + +Python 2.4 is a medium-sized release. It doesn't introduce as many changes as +the radical Python 2.2, but introduces more features than the conservative 2.3 +release. The most significant new language features are function decorators and +generator expressions; most other changes are to the standard library. + +According to the CVS change logs, there were 481 patches applied and 502 bugs +fixed between Python 2.3 and 2.4. Both figures are likely to be underestimates. + +This article doesn't attempt to provide a complete specification of every single +new feature, but instead provides a brief introduction to each feature. 
For +full details, you should refer to the documentation for Python 2.4, such as the +Python Library Reference and the Python Reference Manual. Often you will be +referred to the PEP for a particular new feature for explanations of the +implementation and design rationale. + +.. % ====================================================================== + + +PEP 218: Built-In Set Objects +============================= + +Python 2.3 introduced the :mod:`sets` module. C implementations of set data +types have now been added to the Python core as two new built-in types, +:func:`set(iterable)` and :func:`frozenset(iterable)`. They provide high speed +operations for membership testing, for eliminating duplicates from sequences, +and for mathematical operations like unions, intersections, differences, and +symmetric differences. :: + + >>> a = set('abracadabra') # form a set from a string + >>> 'z' in a # fast membership testing + False + >>> a # unique letters in a + set(['a', 'r', 'b', 'c', 'd']) + >>> ''.join(a) # convert back into a string + 'arbcd' + + >>> b = set('alacazam') # form a second set + >>> a - b # letters in a but not in b + set(['r', 'd', 'b']) + >>> a | b # letters in either a or b + set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l']) + >>> a & b # letters in both a and b + set(['a', 'c']) + >>> a ^ b # letters in a or b but not both + set(['r', 'd', 'b', 'm', 'z', 'l']) + + >>> a.add('z') # add a new element + >>> a.update('wxy') # add multiple new elements + >>> a + set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z']) + >>> a.remove('x') # take one element out + >>> a + set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z']) + +The :func:`frozenset` type is an immutable version of :func:`set`. Since it is +immutable and hashable, it may be used as a dictionary key or as a member of +another set. + +The :mod:`sets` module remains in the standard library, and may be useful if you +wish to subclass the :class:`Set` or :class:`ImmutableSet` classes. 
There are +currently no plans to deprecate the module. + + +.. seealso:: + + :pep:`218` - Adding a Built-In Set Object Type + Originally proposed by Greg Wilson and ultimately implemented by Raymond + Hettinger. + +.. % ====================================================================== + + +PEP 237: Unifying Long Integers and Integers +============================================ + +The lengthy transition process for this PEP, begun in Python 2.2, takes another +step forward in Python 2.4. In 2.3, certain integer operations that would +behave differently after int/long unification triggered :exc:`FutureWarning` +warnings and returned values limited to 32 or 64 bits (depending on your +platform). In 2.4, these expressions no longer produce a warning and instead +produce a different result that's usually a long integer. + +The problematic expressions are primarily left shifts and lengthy hexadecimal +and octal constants. For example, ``2 << 32`` results in a warning in 2.3, +evaluating to 0 on 32-bit platforms. In Python 2.4, this expression now returns +the correct answer, 8589934592. + + +.. seealso:: + + :pep:`237` - Unifying Long Integers and Integers + Original PEP written by Moshe Zadka and GvR. The changes for 2.4 were + implemented by Kalle Svensson. + +.. % ====================================================================== + + +PEP 289: Generator Expressions +============================== + +The iterator feature introduced in Python 2.2 and the :mod:`itertools` module +make it easier to write programs that loop through large data sets without +having the entire data set in memory at one time. List comprehensions don't fit +into this picture very well because they produce a Python list object containing +all of the items. This unavoidably pulls all of the objects into memory, which +can be a problem if your data set is very large. 
When trying to write a +functionally-styled program, it would be natural to write something like:: + + links = [link for link in get_all_links() if not link.followed] + for link in links: + ... + +instead of :: + + for link in get_all_links(): + if link.followed: + continue + ... + +The first form is more concise and perhaps more readable, but if you're dealing +with a large number of link objects you'd have to write the second form to avoid +having all link objects in memory at the same time. + +Generator expressions work similarly to list comprehensions but don't +materialize the entire list; instead they create a generator that will return +elements one by one. The above example could be written as:: + + links = (link for link in get_all_links() if not link.followed) + for link in links: + ... + +Generator expressions always have to be written inside parentheses, as in the +above example. The parentheses signalling a function call also count, so if you +want to create an iterator that will be immediately passed to a function you +could write:: + + print sum(obj.count for obj in list_all_objects()) + +Generator expressions differ from list comprehensions in various small ways. +Most notably, the loop variable (*obj* in the above example) is not accessible +outside of the generator expression. List comprehensions leave the variable +assigned to its last value; future versions of Python will change this, making +list comprehensions match generator expressions in this respect. + + +.. seealso:: + + :pep:`289` - Generator Expressions + Proposed by Raymond Hettinger and implemented by Jiwon Seo with early efforts + steered by Hye-Shik Chang. + +.. 
% ====================================================================== + + +PEP 292: Simpler String Substitutions +===================================== + +Some new classes in the standard library provide an alternative mechanism for +substituting variables into strings; this style of substitution may be better +for applications where untrained users need to edit templates. + +The usual way of substituting variables by name is the ``%`` operator:: + + >>> '%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'} + '2: The Best of Times' + +When writing the template string, it can be easy to forget the ``i`` or ``s`` +after the closing parenthesis. This isn't a big problem if the template is in a +Python module, because you run the code, get an "Unsupported format character" +:exc:`ValueError`, and fix the problem. However, consider an application such +as Mailman where template strings or translations are being edited by users who +aren't aware of the Python language. The format string's syntax is complicated +to explain to such users, and if they make a mistake, it's difficult to provide +helpful feedback to them. + +PEP 292 adds a :class:`Template` class to the :mod:`string` module that uses +``$`` to indicate a substitution:: + + >>> import string + >>> t = string.Template('$page: $title') + >>> t.substitute({'page':2, 'title': 'The Best of Times'}) + '2: The Best of Times' + +If a key is missing from the dictionary, the :meth:`substitute` method will +raise a :exc:`KeyError`. There's also a :meth:`safe_substitute` method that +ignores missing keys: + +.. % $ Terminate $-mode for Emacs + +:: + + >>> t = string.Template('$page: $title') + >>> t.safe_substitute({'page':3}) + '3: $title' + +.. % $ Terminate math-mode for Emacs + + +.. seealso:: + + :pep:`292` - Simpler String Substitutions + Written and implemented by Barry Warsaw. + +.. 
% ====================================================================== + + +PEP 318: Decorators for Functions and Methods +============================================= + +Python 2.2 extended Python's object model by adding static methods and class +methods, but it didn't extend Python's syntax to provide any new way of defining +static or class methods. Instead, you had to write a :keyword:`def` statement +in the usual way, and pass the resulting method to a :func:`staticmethod` or +:func:`classmethod` function that would wrap up the function as a method of the +new type. Your code would look like this:: + + class C: + def meth (cls): + ... + + meth = classmethod(meth) # Rebind name to wrapped-up class method + +If the method was very long, it would be easy to miss or forget the +:func:`classmethod` invocation after the function body. + +The intention was always to add some syntax to make such definitions more +readable, but at the time of 2.2's release a good syntax was not obvious. Today +a good syntax *still* isn't obvious but users are asking for easier access to +the feature; a new syntactic feature has been added to meet this need. + +The new feature is called "function decorators". The name comes from the idea +that :func:`classmethod`, :func:`staticmethod`, and friends are storing +additional information on a function object; they're *decorating* functions with +more details. + +The notation borrows from Java and uses the ``'@'`` character as an indicator. +Using the new syntax, the example above would be written:: + + class C: + + @classmethod + def meth (cls): + ... + + +The ``@classmethod`` is shorthand for the ``meth=classmethod(meth)`` assignment. +More generally, if you have the following:: + + @A + @B + @C + def f (): + ... + +It's equivalent to the following pre-decorator code:: + + def f(): ... 
+ f = A(B(C(f))) + +Decorators must come on the line before a function definition, one decorator per +line, and can't be on the same line as the def statement, meaning that ``@A def +f(): ...`` is illegal. You can only decorate function definitions, either at +the module level or inside a class; you can't decorate class definitions. + +A decorator is just a function that takes the function to be decorated as an +argument and returns either the same function or some new object. The return +value of the decorator need not be callable (though it typically is), unless +further decorators will be applied to the result. It's easy to write your own +decorators. The following simple example just sets an attribute on the function +object:: + + >>> def deco(func): + ... func.attr = 'decorated' + ... return func + ... + >>> @deco + ... def f(): pass + ... + >>> f + <function f at 0x402ef0d4> + >>> f.attr + 'decorated' + >>> + +As a slightly more realistic example, the following decorator checks that the +supplied argument is an integer:: + + def require_int (func): + def wrapper (arg): + assert isinstance(arg, int) + return func(arg) + + return wrapper + + @require_int + def p1 (arg): + print arg + + @require_int + def p2(arg): + print arg*2 + +An example in :pep:`318` contains a fancier version of this idea that lets you +both specify the required type and check the returned type. + +Decorator functions can take arguments. If arguments are supplied, your +decorator function is called with only those arguments and must return a new +decorator function; this function must take a single function and return a +function, as previously described. In other words, ``@A @B @C(args)`` becomes:: + + def f(): ... + _deco = C(args) + f = A(B(_deco(f))) + +Getting this right can be slightly brain-bending, but it's not too difficult. + +A small related change makes the :attr:`func_name` attribute of functions +writable. 
This attribute is used to display function names in tracebacks, so +decorators should change the name of any new function that's constructed and +returned. + + +.. seealso:: + + :pep:`318` - Decorators for Functions, Methods and Classes + Written by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people + wrote patches implementing function decorators, but the one that was actually + checked in was patch #979728, written by Mark Russell. + + http://www.python.org/moin/PythonDecoratorLibrary + This Wiki page contains several examples of decorators. + +.. % ====================================================================== + + +PEP 322: Reverse Iteration +========================== + +A new built-in function, :func:`reversed(seq)`, takes a sequence and returns an +iterator that loops over the elements of the sequence in reverse order. :: + + >>> for i in reversed(xrange(1,4)): + ... print i + ... + 3 + 2 + 1 + +Compared to extended slicing, such as ``range(1,4)[::-1]``, :func:`reversed` is +easier to read, runs faster, and uses substantially less memory. + +Note that :func:`reversed` only accepts sequences, not arbitrary iterators. If +you want to reverse an iterator, first convert it to a list with :func:`list`. +:: + + >>> input = open('/etc/passwd', 'r') + >>> for line in reversed(list(input)): + ... print line + ... + root:*:0:0:System Administrator:/var/root:/bin/tcsh + ... + + +.. seealso:: + + :pep:`322` - Reverse Iteration + Written and implemented by Raymond Hettinger. + +.. % ====================================================================== + + +PEP 324: New subprocess Module +============================== + +The standard library provides a number of ways to execute a subprocess, offering +different features and different levels of complexity. +:func:`os.system(command)` is easy to use, but slow (it runs a shell process +which executes the command) and dangerous (you have to be careful about escaping +the shell's metacharacters). 
The :mod:`popen2` module offers classes that can +capture standard output and standard error from the subprocess, but the naming +is confusing. The :mod:`subprocess` module cleans this up, providing a unified +interface that offers all the features you might need. + +Instead of :mod:`popen2`'s collection of classes, :mod:`subprocess` contains a +single class called :class:`Popen` whose constructor supports a number of +different keyword arguments. :: + + class Popen(args, bufsize=0, executable=None, + stdin=None, stdout=None, stderr=None, + preexec_fn=None, close_fds=False, shell=False, + cwd=None, env=None, universal_newlines=False, + startupinfo=None, creationflags=0): + +*args* is commonly a sequence of strings that will be the arguments to the +program executed as the subprocess. (If the *shell* argument is true, *args* +can be a string which will then be passed on to the shell for interpretation, +just as :func:`os.system` does.) + +*stdin*, *stdout*, and *stderr* specify what the subprocess's input, output, and +error streams will be. You can provide a file object or a file descriptor, or +you can use the constant ``subprocess.PIPE`` to create a pipe between the +subprocess and the parent. + +The constructor has a number of handy options: + +* *close_fds* requests that all file descriptors be closed before running the + subprocess. + +* *cwd* specifies the working directory in which the subprocess will be executed + (defaulting to whatever the parent's working directory is). + +* *env* is a dictionary specifying environment variables. + +* *preexec_fn* is a function that gets called before the child is started. + +* *universal_newlines* opens the child's input and output using Python's + universal newline feature. 
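The constructor options listed above can be combined in a single call. The following is a minimal sketch, not an excerpt from the module's documentation. It is written for a current Python 3 interpreter rather than the 2.4 release described here, and the helper name ``run_child`` and the ``DEMO_GREETING`` variable are hypothetical, chosen only for illustration.

```python
import os
import subprocess
import sys


def run_child(code, workdir, extra_env):
    """Run `code` in a child Python interpreter.

    Returns a (status, stdout_text) tuple. The helper name and its
    parameters are illustrative assumptions, not part of subprocess.
    """
    # Start from the parent's environment so the child can still find
    # its shared libraries, then layer the extra variables on top.
    env = dict(os.environ)
    env.update(extra_env)

    proc = subprocess.Popen(
        [sys.executable, "-c", code],   # args as a sequence: no shell involved
        stdout=subprocess.PIPE,         # capture the child's standard output
        cwd=workdir,                    # working directory for the child
        env=env,                        # environment variables for the child
        universal_newlines=True,        # text output with normalized newlines
    )
    # communicate() reads the pipe to the end and waits for the child to exit.
    out, _ = proc.communicate()
    return proc.returncode, out


status, output = run_child(
    "import os; print(os.environ['DEMO_GREETING'])",
    workdir=os.getcwd(),
    extra_env={"DEMO_GREETING": "hello"},
)
print(status, output.strip())
```

Passing the arguments as a list keeps the shell out of the picture, so nothing in ``code`` needs to be escaped.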
+ +Once you've created the :class:`Popen` instance, you can call its :meth:`wait` +method to pause until the subprocess has exited, :meth:`poll` to check if it's +exited without pausing, or :meth:`communicate(data)` to send the string *data* +to the subprocess's standard input. :meth:`communicate(data)` then reads any +data that the subprocess has sent to its standard output or standard error, +returning a tuple ``(stdout_data, stderr_data)``. + +:func:`call` is a shortcut that passes its arguments along to the :class:`Popen` +constructor, waits for the command to complete, and returns the status code of +the subprocess. It can serve as a safer analog to :func:`os.system`:: + + sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb']) + if sts == 0: + # Success + ... + else: + # dpkg returned an error + ... + +The command is invoked without use of the shell. If you really do want to use +the shell, you can add ``shell=True`` as a keyword argument and provide a string +instead of a sequence:: + + sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True) + +The PEP takes various examples of shell and Python code and shows how they'd be +translated into Python code that uses :mod:`subprocess`. Reading this section +of the PEP is highly recommended. + + +.. seealso:: + + :pep:`324` - subprocess - New process module + Written and implemented by Peter Åstrand, with assistance from Fredrik Lundh and + others. + +.. % ====================================================================== + + +PEP 327: Decimal Data Type +========================== + +Python has always supported floating-point (FP) numbers, based on the underlying +C :ctype:`double` type, as a data type. However, while most programming +languages provide a floating-point type, many people (even programmers) are +unaware that floating-point numbers don't represent certain decimal fractions +accurately. 
The new :class:`Decimal` type can represent these fractions +accurately, up to a user-specified precision limit. + + +Why is Decimal needed? +---------------------- + +The limitations arise from the representation used for floating-point numbers. +FP numbers are made up of three components: + +* The sign, which is positive or negative. + +* The mantissa, which is a single-digit binary number followed by a fractional + part. For example, ``1.01`` in base-2 notation is ``1 + 0/2 + 1/4``, or 1.25 in + decimal notation. + +* The exponent, which tells where the decimal point is located in the number + represented. + +For example, the number 1.25 has positive sign, a mantissa value of 1.01 (in +binary), and an exponent of 0 (the decimal point doesn't need to be shifted). +The number 5 has the same sign and mantissa, but the exponent is 2 because the +mantissa is multiplied by 4 (2 to the power of the exponent 2); 1.25 \* 4 equals +5. + +Modern systems usually provide floating-point support that conforms to a +standard called IEEE 754. C's :ctype:`double` type is usually implemented as a +64-bit IEEE 754 number, which uses 52 bits of space for the mantissa. This +means that numbers can only be specified to 52 bits of precision. If you're +trying to represent numbers whose expansion repeats endlessly, the expansion is +cut off after 52 bits. Unfortunately, most software needs to produce output in +base 10, and common fractions in base 10 are often repeating decimals in binary. +For example, 1.1 decimal is binary ``1.0001100110011 ...``; .1 = 1/16 + 1/32 + +1/256 plus an infinite number of additional terms. IEEE 754 has to chop off +that infinitely repeated decimal after 52 digits, so the representation is +slightly inaccurate. 
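The decomposition described above can be checked from Python itself. This is a minimal sketch, not part of the original text: ``split_float`` is a hypothetical helper built on :func:`math.frexp`, and ``float.as_integer_ratio`` requires a Python version newer than 2.4.

```python
import math

def split_float(x):
    """Split a nonzero float into (sign, mantissa, exponent), 1 <= mantissa < 2."""
    sign = -1 if x < 0 else 1
    # math.frexp() returns m and e with abs(x) == m * 2**e and 0.5 <= m < 1,
    # so rescale to the "1.xxx" form used in the text.
    m, e = math.frexp(abs(x))
    return sign, m * 2, e - 1

one_and_quarter = split_float(1.25)   # mantissa 1.25, exponent 0
five = split_float(5.0)               # same mantissa, exponent 2

# 1.1 is stored as a fraction whose denominator is a power of two,
# so it cannot be exactly 11/10.
numerator, denominator = (1.1).as_integer_ratio()
is_power_of_two = denominator & (denominator - 1) == 0
is_exactly_eleven_tenths = numerator * 10 == denominator * 11
```

Here ``one_and_quarter`` and ``five`` share the mantissa 1.25 and differ only in the exponent, matching the worked example in the text, while the last two values confirm that 1.1 is only approximated.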
+ +Sometimes you can see this inaccuracy when the number is printed:: + + >>> 1.1 + 1.1000000000000001 + +The inaccuracy isn't always visible when you print the number because the FP-to- +decimal-string conversion is provided by the C library, and most C libraries try +to produce sensible output. Even if it's not displayed, however, the inaccuracy +is still there and subsequent operations can magnify the error. + +For many applications this doesn't matter. If I'm plotting points and +displaying them on my monitor, the difference between 1.1 and 1.1000000000000001 +is too small to be visible. Reports often limit output to a certain number of +decimal places, and if you round the number to two or three or even eight +decimal places, the error is never apparent. However, for applications where it +does matter, it's a lot of work to implement your own custom arithmetic +routines. + +Hence, the :class:`Decimal` type was created. + + +The :class:`Decimal` type +------------------------- + +A new module, :mod:`decimal`, was added to Python's standard library. It +contains two classes, :class:`Decimal` and :class:`Context`. :class:`Decimal` +instances represent numbers, and :class:`Context` instances are used to wrap up +various settings such as the precision and default rounding mode. + +:class:`Decimal` instances are immutable, like regular Python integers and FP +numbers; once it's been created, you can't change the value an instance +represents. :class:`Decimal` instances can be created from integers or +strings:: + + >>> import decimal + >>> decimal.Decimal(1972) + Decimal("1972") + >>> decimal.Decimal("1.1") + Decimal("1.1") + +You can also provide tuples containing the sign, the mantissa represented as a +tuple of decimal digits, and the exponent:: + + >>> decimal.Decimal((1, (1, 4, 7, 5), -2)) + Decimal("-14.75") + +Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is +negative. 
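To make the sign convention concrete, here is a small sketch that builds the same digits with both sign values and converts one result back into a tuple. The :meth:`as_tuple` method is not mentioned in the text above; it is used here only to show the round trip.

```python
from decimal import Decimal

# Same digits and exponent; only the sign flag differs.
negative = Decimal((1, (1, 4, 7, 5), -2))   # sign 1 means negative
positive = Decimal((0, (1, 4, 7, 5), -2))   # sign 0 means positive

# as_tuple() returns the (sign, digits, exponent) triple again.
round_trip = tuple(positive.as_tuple())
```

A value of 1 in the first position therefore acts as a "negative" flag, not as a multiplier.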
+ +Converting from floating-point numbers poses a bit of a problem: should the FP +number representing 1.1 turn into the decimal number for exactly 1.1, or for 1.1 +plus whatever inaccuracies are introduced? The decision was to dodge the issue +and leave such a conversion out of the API. Instead, you should convert the +floating-point number into a string using the desired precision and pass the +string to the :class:`Decimal` constructor:: + + >>> f = 1.1 + >>> decimal.Decimal(str(f)) + Decimal("1.1") + >>> decimal.Decimal('%.12f' % f) + Decimal("1.100000000000") + +Once you have :class:`Decimal` instances, you can perform the usual mathematical +operations on them. One limitation: exponentiation requires an integer +exponent:: + + >>> a = decimal.Decimal('35.72') + >>> b = decimal.Decimal('1.73') + >>> a+b + Decimal("37.45") + >>> a-b + Decimal("33.99") + >>> a*b + Decimal("61.7956") + >>> a/b + Decimal("20.64739884393063583815028902") + >>> a ** 2 + Decimal("1275.9184") + >>> a**b + Traceback (most recent call last): + ... + decimal.InvalidOperation: x ** (non-integer) + +You can combine :class:`Decimal` instances with integers, but not with floating- +point numbers:: + + >>> a + 4 + Decimal("39.72") + >>> a + 4.5 + Traceback (most recent call last): + ... + TypeError: You can interact Decimal only with int, long or Decimal data types. + >>> + +:class:`Decimal` numbers can be used with the :mod:`math` and :mod:`cmath` +modules, but note that they'll be immediately converted to floating-point +numbers before the operation is performed, resulting in a possible loss of +precision and accuracy. You'll also get back a regular floating-point number +and not a :class:`Decimal`. 
:: + + >>> import math, cmath + >>> d = decimal.Decimal('123456789012.345') + >>> math.sqrt(d) + 351364.18288201344 + >>> cmath.sqrt(-d) + 351364.18288201344j + +:class:`Decimal` instances have a :meth:`sqrt` method that returns a +:class:`Decimal`, but if you need other things such as trigonometric functions +you'll have to implement them. :: + + >>> d.sqrt() + Decimal("351364.1828820134592177245001") + + +The :class:`Context` type +------------------------- + +Instances of the :class:`Context` class encapsulate several settings for +decimal operations: + +* :attr:`prec` is the precision, the number of significant decimal digits. + +* :attr:`rounding` specifies the rounding mode. The :mod:`decimal` module has + constants for the various possibilities: :const:`ROUND_DOWN`, + :const:`ROUND_CEILING`, :const:`ROUND_HALF_EVEN`, and various others. + +* :attr:`traps` is a dictionary specifying what happens on encountering certain + error conditions: either an exception is raised or a value is returned. Some + examples of error conditions are division by zero, loss of precision, and + overflow. + +There's a thread-local default context available by calling :func:`getcontext`; +you can change the properties of this context to alter the default precision, +rounding, or trap handling. The following example shows the effect of changing +the precision of the default context:: + + >>> decimal.getcontext().prec + 28 + >>> decimal.Decimal(1) / decimal.Decimal(7) + Decimal("0.1428571428571428571428571429") + >>> decimal.getcontext().prec = 9 + >>> decimal.Decimal(1) / decimal.Decimal(7) + Decimal("0.142857143") + +The default action for error conditions is selectable; the module can either +return a special value such as infinity or not-a-number, or exceptions can be +raised:: + + >>> decimal.Decimal(1) / decimal.Decimal(0) + Traceback (most recent call last): + ... 
+ decimal.DivisionByZero: x / 0 + >>> decimal.getcontext().traps[decimal.DivisionByZero] = False + >>> decimal.Decimal(1) / decimal.Decimal(0) + Decimal("Infinity") + >>> + +The :class:`Context` instance also has various methods for formatting numbers +such as :meth:`to_eng_string` and :meth:`to_sci_string`. + +For more information, see the documentation for the :mod:`decimal` module, which +includes a quick-start tutorial and a reference. + + +.. seealso:: + + :pep:`327` - Decimal Data Type + Written by Facundo Batista and implemented by Facundo Batista, Eric Price, + Raymond Hettinger, Aahz, and Tim Peters. + + http://research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html + A more detailed overview of the IEEE-754 representation. + + http://www.lahey.com/float.htm + The article uses Fortran code to illustrate many of the problems that floating- + point inaccuracy can cause. + + http://www2.hursley.ibm.com/decimal/ + A description of a decimal-based representation. This representation is being + proposed as a standard, and underlies the new Python decimal type. Much of this + material was written by Mike Cowlishaw, designer of the Rexx language. + +.. % ====================================================================== + + +PEP 328: Multi-line Imports +=========================== + +One language change is a small syntactic tweak aimed at making it easier to +import many names from a module. In a ``from module import names`` statement, +*names* is a sequence of names separated by commas. If the sequence is very +long, you can either write multiple imports from the same module, or you can use +backslashes to escape the line endings like this:: + + from SimpleXMLRPCServer import SimpleXMLRPCServer,\ + SimpleXMLRPCRequestHandler,\ + CGIXMLRPCRequestHandler,\ + resolve_dotted_attribute + +The syntactic change in Python 2.4 simply allows putting the names within +parentheses. 
Python ignores newlines within a parenthesized expression, so the +backslashes are no longer needed:: + + from SimpleXMLRPCServer import (SimpleXMLRPCServer, + SimpleXMLRPCRequestHandler, + CGIXMLRPCRequestHandler, + resolve_dotted_attribute) + +The PEP also proposes that all :keyword:`import` statements be absolute imports, +with a leading ``.`` character to indicate a relative import. This part of the +PEP was not implemented for Python 2.4, but was completed for Python 2.5. + + +.. seealso:: + + :pep:`328` - Imports: Multi-Line and Absolute/Relative + Written by Aahz. Multi-line imports were implemented by Dima Dorfman. + +.. % ====================================================================== + + +PEP 331: Locale-Independent Float/String Conversions +==================================================== + +The :mod:`locale` module lets Python software select various conversions and +display conventions that are localized to a particular country or language. +However, the module was careful to not change the numeric locale because various +functions in Python's implementation required that the numeric locale remain set +to the ``'C'`` locale. Often this was because the code was using the C +library's :cfunc:`atof` function. + +Not setting the numeric locale caused trouble for extensions that used +third-party C libraries, however, because they wouldn't have the correct locale set. +The motivating example was GTK+, whose user interface widgets weren't displaying +numbers in the current locale. + +The solution described in the PEP is to add three new functions to the Python +API that perform ASCII-only conversions, ignoring the locale setting: + +* :cfunc:`PyOS_ascii_strtod(str, ptr)` and :cfunc:`PyOS_ascii_atof(str)` + both convert a string to a C :ctype:`double`. + +* :cfunc:`PyOS_ascii_formatd(buffer, buf_len, format, d)` converts a + :ctype:`double` to an ASCII string. 
+ +The code for these functions came from the GLib library +(http://developer.gnome.org/arch/gtk/glib.html), whose developers kindly +relicensed the relevant functions and donated them to the Python Software +Foundation. The :mod:`locale` module can now change the numeric locale, +letting extensions such as GTK+ produce the correct results. + + +.. seealso:: + + :pep:`331` - Locale-Independent Float/String Conversions + Written by Christian R. Reis, and implemented by Gustavo Carneiro. + +.. % ====================================================================== + + +Other Language Changes +====================== + +Here are all of the changes that Python 2.4 makes to the core Python language. + +* Decorators for functions and methods were added (:pep:`318`). + +* Built-in :func:`set` and :func:`frozenset` types were added (:pep:`218`). + Other new built-ins include the :func:`reversed(seq)` function (:pep:`322`). + +* Generator expressions were added (:pep:`289`). + +* Certain numeric expressions no longer return values restricted to 32 or 64 + bits (:pep:`237`). + +* You can now put parentheses around the list of names in a ``from module import + names`` statement (:pep:`328`). + +* The :meth:`dict.update` method now accepts the same argument forms as the + :class:`dict` constructor. This includes any mapping, any iterable of key/value + pairs, and keyword arguments. (Contributed by Raymond Hettinger.) + +* The string methods :meth:`ljust`, :meth:`rjust`, and :meth:`center` now take + an optional argument for specifying a fill character other than a space. + (Contributed by Raymond Hettinger.) + +* Strings also gained an :meth:`rsplit` method that works like the :meth:`split` + method but splits from the end of the string. (Contributed by Sean + Reifschneider.) 
:: + + >>> 'www.python.org'.split('.', 1) + ['www', 'python.org'] + >>> 'www.python.org'.rsplit('.', 1) + ['www.python', 'org'] + +* Three keyword parameters, *cmp*, *key*, and *reverse*, were added to the + :meth:`sort` method of lists. These parameters make some common usages of + :meth:`sort` simpler. All of these parameters are optional. + + For the *cmp* parameter, the value should be a comparison function that takes + two parameters and returns -1, 0, or +1 depending on how the parameters compare. + This function will then be used to sort the list. Previously this was the only + parameter that could be provided to :meth:`sort`. + + *key* should be a single-parameter function that takes a list element and + returns a comparison key for the element. The list is then sorted using the + comparison keys. The following example sorts a list case-insensitively:: + + >>> L = ['A', 'b', 'c', 'D'] + >>> L.sort() # Case-sensitive sort + >>> L + ['A', 'D', 'b', 'c'] + >>> # Using 'key' parameter to sort list + >>> L.sort(key=lambda x: x.lower()) + >>> L + ['A', 'b', 'c', 'D'] + >>> # Old-fashioned way + >>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower())) + >>> L + ['A', 'b', 'c', 'D'] + + The last example, which uses the *cmp* parameter, is the old way to perform a + case-insensitive sort. It works but is slower than using a *key* parameter. + Using *key* calls the :meth:`lower` method once for each element in the list while + using *cmp* will call it twice for each comparison, so using *key* saves on + invocations of the :meth:`lower` method. + + For simple key functions and comparison functions, it is often possible to avoid + a :keyword:`lambda` expression by using an unbound method instead. For example, + the above case-insensitive sort is best written as:: + + >>> L.sort(key=str.lower) + >>> L + ['A', 'b', 'c', 'D'] + + Finally, the *reverse* parameter takes a Boolean value. If the value is true, + the list will be sorted into reverse order. 
Instead of ``L.sort() ; + L.reverse()``, you can now write ``L.sort(reverse=True)``. + + The results of sorting are now guaranteed to be stable. This means that two + entries with equal keys will be returned in the same order as they were input. + For example, you can sort a list of people by name, and then sort the list by + age, resulting in a list sorted by age where people with the same age are in + name-sorted order. + + (All changes to :meth:`sort` contributed by Raymond Hettinger.) + +* There is a new built-in function :func:`sorted(iterable)` that works like the + in-place :meth:`list.sort` method but can be used in expressions. The + differences are: + +* the input may be any iterable; + +* a newly formed copy is sorted, leaving the original intact; and + +* the expression returns the new sorted copy. + + :: + + >>> L = [9,7,8,3,2,4,1,6,5] + >>> [10+i for i in sorted(L)] # usable in a list comprehension + [11, 12, 13, 14, 15, 16, 17, 18, 19] + >>> L # original is left unchanged + [9, 7, 8, 3, 2, 4, 1, 6, 5] + >>> sorted('Monty Python') # any iterable may be an input + [' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y'] + + >>> # List the contents of a dict sorted by key values + >>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5) + >>> for k, v in sorted(colormap.iteritems()): + ... print k, v + ... + black 4 + blue 2 + green 3 + red 1 + yellow 5 + + (Contributed by Raymond Hettinger.) + +* Integer operations will no longer trigger an :exc:`OverflowWarning`. The + :exc:`OverflowWarning` warning will disappear in Python 2.5. + +* The interpreter gained a new switch, :option:`-m`, that takes a name, searches + for the corresponding module on ``sys.path``, and runs the module as a script. + For example, you can now run the Python profiler with ``python -m profile``. + (Contributed by Nick Coghlan.) 
+ +* The :func:`eval(expr, globals, locals)` and :func:`execfile(filename, globals, + locals)` functions and the :keyword:`exec` statement now accept any mapping type + for the *locals* parameter. Previously this had to be a regular Python + dictionary. (Contributed by Raymond Hettinger.) + +* The :func:`zip` built-in function and :func:`itertools.izip` now return an + empty list if called with no arguments. Previously they raised a + :exc:`TypeError` exception. This makes them more suitable for use with variable + length argument lists:: + + >>> def transpose(array): + ... return zip(*array) + ... + >>> transpose([(1,2,3), (4,5,6)]) + [(1, 4), (2, 5), (3, 6)] + >>> transpose([]) + [] + + (Contributed by Raymond Hettinger.) + +* Encountering a failure while importing a module no longer leaves a partially- + initialized module object in ``sys.modules``. The incomplete module object left + behind would fool further imports of the same module into succeeding, leading to + confusing errors. (Fixed by Tim Peters.) + +* :const:`None` is now a constant; code that binds a new value to the name + ``None`` is now a syntax error. (Contributed by Raymond Hettinger.) + +.. % ====================================================================== + + +Optimizations +------------- + +* The inner loops for list and tuple slicing were optimized and now run about + one-third faster. The inner loops for dictionaries were also optimized, + resulting in performance boosts for :meth:`keys`, :meth:`values`, :meth:`items`, + :meth:`iterkeys`, :meth:`itervalues`, and :meth:`iteritems`. (Contributed by + Raymond Hettinger.) + +* The machinery for growing and shrinking lists was optimized for speed and for + space efficiency. Appending and popping from lists now runs faster due to more + efficient code paths and less frequent use of the underlying system + :cfunc:`realloc`. List comprehensions also benefit. 
:meth:`list.extend` was + also optimized and no longer converts its argument into a temporary list before + extending the base list. (Contributed by Raymond Hettinger.) + +* :func:`list`, :func:`tuple`, :func:`map`, :func:`filter`, and :func:`zip` now + run several times faster with non-sequence arguments that supply a + :meth:`__len__` method. (Contributed by Raymond Hettinger.) + +* The methods :meth:`list.__getitem__`, :meth:`dict.__getitem__`, and + :meth:`dict.__contains__` are now implemented as :class:`method_descriptor` + objects rather than :class:`wrapper_descriptor` objects. This form of access + doubles their performance and makes them more suitable for use as arguments to + functionals: ``map(mydict.__getitem__, keylist)``. (Contributed by Raymond + Hettinger.) + +* Added a new opcode, ``LIST_APPEND``, that simplifies the generated bytecode + for list comprehensions and speeds them up by about a third. (Contributed by + Raymond Hettinger.) + +* The peephole bytecode optimizer has been improved to produce shorter, faster + bytecode; remarkably, the resulting bytecode is more readable. (Enhanced by + Raymond Hettinger.) + +* String concatenations in statements of the form ``s = s + "abc"`` and ``s += + "abc"`` are now performed more efficiently in certain circumstances. This + optimization won't be present in other Python implementations such as Jython, so + you shouldn't rely on it; using the :meth:`join` method of strings is still + recommended when you want to efficiently glue a large number of strings + together. (Contributed by Armin Rigo.) + +The net result of the 2.4 optimizations is that Python 2.4 runs the pystone +benchmark around 5% faster than Python 2.3 and 35% faster than Python 2.2. +(pystone is not a particularly good benchmark, but it's the most commonly used +measurement of Python's performance. Your own applications may show greater or +smaller benefits from Python 2.4.) + +.. 
% pystone is almost useless for comparing different versions of Python; +.. % instead, it excels at predicting relative Python performance on +.. % different machines. +.. % So, this section would be more informative if it used other tools +.. % such as pybench and parrotbench. For a more application oriented +.. % benchmark, try comparing the timings of test_decimal.py under 2.3 +.. % and 2.4. + +.. % ====================================================================== + + +New, Improved, and Deprecated Modules +===================================== + +As usual, Python's standard library received a number of enhancements and bug +fixes. Here's a partial list of the most notable changes, sorted alphabetically +by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more +complete list of changes, or look through the CVS logs for all the details. + +* The :mod:`asyncore` module's :func:`loop` function now has a *count* parameter + that lets you perform a limited number of passes through the polling loop. The + default is still to loop forever. + +* The :mod:`base64` module now has more complete RFC 3548 support for Base64, + Base32, and Base16 encoding and decoding, including optional case folding and + optional alternative alphabets. (Contributed by Barry Warsaw.) + +* The :mod:`bisect` module now has an underlying C implementation for improved + performance. (Contributed by Dmitry Vasiliev.) + +* The CJKCodecs collection of East Asian codecs, maintained by Hye-Shik Chang, + was integrated into 2.4. 
The new encodings are: + +* Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz + +* Chinese (ROC): big5, cp950 + +* Japanese: cp932, euc-jis-2004, euc-jp, euc-jisx0213, iso-2022-jp, + iso-2022-jp-1, iso-2022-jp-2, iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004, + shift-jis, shift-jisx0213, shift-jis-2004 + +* Korean: cp949, euc-kr, johab, iso-2022-kr + +* Some other new encodings were added: HP Roman8, ISO_8859-11, ISO_8859-16, + PTCP-154, and TIS-620. + +* The UTF-8 and UTF-16 codecs now cope better with receiving partial input. + Previously the :class:`StreamReader` class would try to read more data, making + it impossible to resume decoding from the stream. The :meth:`read` method will + now return as much data as it can and future calls will resume decoding where + previous ones left off. (Implemented by Walter Dörwald.) + +* There is a new :mod:`collections` module for various specialized collection + datatypes. Currently it contains just one type, :class:`deque`, a + double-ended queue that supports efficiently adding and removing elements from either + end:: + + >>> from collections import deque + >>> d = deque('ghi') # make a new deque with three items + >>> d.append('j') # add a new entry to the right side + >>> d.appendleft('f') # add a new entry to the left side + >>> d # show the representation of the deque + deque(['f', 'g', 'h', 'i', 'j']) + >>> d.pop() # return and remove the rightmost item + 'j' + >>> d.popleft() # return and remove the leftmost item + 'f' + >>> list(d) # list the contents of the deque + ['g', 'h', 'i'] + >>> 'h' in d # search the deque + True + + Several modules, such as the :mod:`Queue` and :mod:`threading` modules, now take + advantage of :class:`collections.deque` for improved performance. (Contributed + by Raymond Hettinger.) + +* The :mod:`ConfigParser` classes have been enhanced slightly. 
The :meth:`read` + method now returns a list of the files that were successfully parsed, and the + :meth:`set` method raises :exc:`TypeError` if passed a *value* argument that + isn't a string. (Contributed by John Belmonte and David Goodger.) + +* The :mod:`curses` module now supports the ncurses extension + :func:`use_default_colors`. On platforms where the terminal supports + transparency, this makes it possible to use a transparent background. + (Contributed by Jörg Lehmann.) + +* The :mod:`difflib` module now includes an :class:`HtmlDiff` class that creates + an HTML table showing a side-by-side comparison of two versions of a text. + (Contributed by Dan Gass.) + +* The :mod:`email` package was updated to version 3.0, which dropped various + deprecated APIs and removed support for Python versions earlier than 2.3. The + 3.0 version of the package uses a new incremental parser for MIME messages, + available in the :mod:`email.FeedParser` module. The new parser doesn't require + reading the entire message into memory, and doesn't throw exceptions if a + message is malformed; instead it records any problems in the :attr:`defects` + attribute of the message. (Developed by Anthony Baxter, Barry Warsaw, Thomas + Wouters, and others.) + +* The :mod:`heapq` module has been converted to C. The resulting tenfold + improvement in speed makes the module suitable for handling high volumes of + data. In addition, the module has two new functions :func:`nlargest` and + :func:`nsmallest` that use heaps to find the N largest or smallest values in a + dataset without the expense of a full sort. (Contributed by Raymond Hettinger.) + +* The :mod:`httplib` module now contains constants for HTTP status codes defined + in various HTTP-related RFC documents. Constants have names such as + :const:`OK`, :const:`CREATED`, :const:`CONTINUE`, and + :const:`MOVED_PERMANENTLY`; use pydoc to get a full list. (Contributed by + Andrew Eland.) 
+ +* The :mod:`imaplib` module now supports IMAP's THREAD command (contributed by + Yves Dionne) and new :meth:`deleteacl` and :meth:`myrights` methods (contributed + by Arnaud Mazin). + +* The :mod:`itertools` module gained a :func:`groupby(iterable[, *func*])` + function. *iterable* is something that can be iterated over to return a stream + of elements, and the optional *func* parameter is a function that takes an + element and returns a key value; if omitted, the key is simply the element + itself. :func:`groupby` then groups the elements into subsequences which have + matching values of the key, and returns a series of 2-tuples containing the key + value and an iterator over the subsequence. + + Here's an example to make this clearer. The *func* function simply returns + whether a number is even or odd, so the result of :func:`groupby` is to return + consecutive runs of odd or even numbers. :: + + >>> import itertools + >>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14] + >>> for key_val, it in itertools.groupby(L, lambda x: x % 2): + ... print key_val, list(it) + ... + 0 [2, 4, 6] + 1 [7] + 0 [8] + 1 [9, 11] + 0 [12, 14] + >>> + + :func:`groupby` is typically used with sorted input. The logic for + :func:`groupby` is similar to the Unix ``uniq`` filter, which makes it handy for + eliminating, counting, or identifying duplicate elements:: + + >>> word = 'abracadabra' + >>> letters = sorted(word) # Turn string into a sorted list of letters + >>> letters + ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r'] + >>> for k, g in itertools.groupby(letters): + ... print k, list(g) + ... + a ['a', 'a', 'a', 'a', 'a'] + b ['b', 'b'] + c ['c'] + d ['d'] + r ['r', 'r'] + >>> # List unique letters + >>> [k for k, g in itertools.groupby(letters)] + ['a', 'b', 'c', 'd', 'r'] + >>> # Count letter occurrences + >>> [(k, len(list(g))) for k, g in itertools.groupby(letters)] + [('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)] + + (Contributed by Hye-Shik Chang.) 
+ +* :mod:`itertools` also gained a function named :func:`tee(iterator, N)` that + returns *N* independent iterators that replicate *iterator*. If *N* is omitted, + the default is 2. :: + + >>> L = [1,2,3] + >>> i1, i2 = itertools.tee(L) + >>> i1,i2 + (<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>) + >>> list(i1) # Run the first iterator to exhaustion + [1, 2, 3] + >>> list(i2) # Run the second iterator to exhaustion + [1, 2, 3] + + Note that :func:`tee` has to keep copies of the values returned by the + iterator; in the worst case, it may need to keep all of them. This should + therefore be used carefully if the leading iterator can run far ahead of the + trailing iterator in a long stream of inputs. If the separation is large, then + you might as well use :func:`list` instead. When the iterators track closely + with one another, :func:`tee` is ideal. Possible applications include + bookmarking, windowing, or lookahead iterators. (Contributed by Raymond + Hettinger.) + +* A number of functions were added to the :mod:`locale` module, such as + :func:`bind_textdomain_codeset` to specify a particular encoding and a family of + :func:`l\*gettext` functions that return messages in the chosen encoding. + (Contributed by Gustavo Niemeyer.) + +* Some keyword arguments were added to the :mod:`logging` package's + :func:`basicConfig` function to simplify log configuration. The default + behavior is to log messages to standard error, but various keyword arguments can + be specified to log to a particular file, change the logging format, or set the + logging level. For example:: + + import logging + logging.basicConfig(filename='/var/log/application.log', + level=0, # Log all messages + format='%(levelname)s:%(process)d:%(thread)d:%(message)s') + + Other additions to the :mod:`logging` package include a :meth:`log(level, msg)` + convenience method, as well as a :class:`TimedRotatingFileHandler` class that + rotates its log files at a timed interval. 
The module already had + :class:`RotatingFileHandler`, which rotated logs once the file exceeded a + certain size. Both classes derive from a new :class:`BaseRotatingHandler` class + that can be used to implement other rotating handlers. + + (Changes implemented by Vinay Sajip.) + +* The :mod:`marshal` module now shares interned strings on unpacking a data + structure. This may shrink the size of certain pickle strings, but the primary + effect is to make :file:`.pyc` files significantly smaller. (Contributed by + Martin von Löwis.) + +* The :mod:`nntplib` module's :class:`NNTP` class gained :meth:`description` and + :meth:`descriptions` methods to retrieve newsgroup descriptions for a single + group or for a range of groups. (Contributed by Jürgen A. Erhard.) + +* Two new functions were added to the :mod:`operator` module, + :func:`attrgetter(attr)` and :func:`itemgetter(index)`. Both functions return + callables that take a single argument and return the corresponding attribute or + item; these callables make excellent data extractors when used with :func:`map` + or :func:`sorted`. For example:: + + >>> import operator + >>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)] + >>> map(operator.itemgetter(0), L) + ['c', 'd', 'a', 'b'] + >>> map(operator.itemgetter(1), L) + [2, 1, 4, 3] + >>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item + [('d', 1), ('c', 2), ('b', 3), ('a', 4)] + + (Contributed by Raymond Hettinger.) + +* The :mod:`optparse` module was updated in various ways. The module now passes + its messages through :func:`gettext.gettext`, making it possible to + internationalize Optik's help and error messages. Help messages for options can + now include the string ``'%default'``, which will be replaced by the option's + default value. (Contributed by Greg Ward.) + +* The long-term plan is to deprecate the :mod:`rfc822` module in some future + Python release in favor of the :mod:`email` package. 
To this end, the + :func:`email.Utils.formatdate` function has been changed to make it usable as a + replacement for :func:`rfc822.formatdate`. You may want to write new e-mail + processing code with this in mind. (Change implemented by Anthony Baxter.) + +* A new :func:`urandom(n)` function was added to the :mod:`os` module, returning + a string containing *n* bytes of random data. This function provides access to + platform-specific sources of randomness such as :file:`/dev/urandom` on Linux or + the Windows CryptoAPI. (Contributed by Trevor Perrin.) + +* Another new function: :func:`os.path.lexists(path)` returns true if the file + specified by *path* exists, whether or not it's a symbolic link. This differs + from the existing :func:`os.path.exists(path)` function, which returns false if + *path* is a symlink that points to a destination that doesn't exist. + (Contributed by Beni Cherniavsky.) + +* A new :func:`getsid` function was added to the :mod:`posix` module that + underlies the :mod:`os` module. (Contributed by J. Raynor.) + +* The :mod:`poplib` module now supports POP over SSL. (Contributed by Hector + Urtubia.) + +* The :mod:`profile` module can now profile C extension functions. (Contributed + by Nick Bastin.) + +* The :mod:`random` module has a new method called :meth:`getrandbits(N)` that + returns a long integer *N* bits in length. The existing :meth:`randrange` + method now uses :meth:`getrandbits` where appropriate, making generation of + arbitrarily large random numbers more efficient. (Contributed by Raymond + Hettinger.) + +* The regular expression language accepted by the :mod:`re` module was extended + with simple conditional expressions, written as ``(?(group)A|B)``. *group* is + either a numeric group ID or a group name defined with ``(?P<group>...)`` + earlier in the expression. 
If the specified group matched, the regular + expression pattern *A* will be tested against the string; if the group didn't + match, the pattern *B* will be used instead. (Contributed by Gustavo Niemeyer.) + +* The :mod:`re` module is also no longer recursive, thanks to a massive amount + of work by Gustavo Niemeyer. In a recursive regular expression engine, certain + patterns result in a large amount of C stack space being consumed, and it was + possible to overflow the stack. For example, if you matched a 30000-byte string + of ``a`` characters against the expression ``(a|b)+``, one stack frame was + consumed per character. Python 2.3 tried to check for stack overflow and raise + a :exc:`RuntimeError` exception, but certain patterns could sidestep the + checking and if you were unlucky Python could segfault. Python 2.4's regular + expression engine can match this pattern without problems. + +* The :mod:`signal` module now performs tighter error-checking on the parameters + to the :func:`signal.signal` function. For example, you can't set a handler on + the :const:`SIGKILL` signal; previous versions of Python would quietly accept + this, but 2.4 will raise a :exc:`RuntimeError` exception. + +* Two new functions were added to the :mod:`socket` module. :func:`socketpair` + returns a pair of connected sockets and :func:`getservbyport(port)` looks up the + service name for a given port number. (Contributed by Dave Cole and Barry + Warsaw.) + +* The :func:`sys.exitfunc` function has been deprecated. Code should be using + the existing :mod:`atexit` module, which correctly handles calling multiple exit + functions. Eventually :func:`sys.exitfunc` will become a purely internal + interface, accessed only by :mod:`atexit`. + +* The :mod:`tarfile` module now generates GNU-format tar files by default. + (Contributed by Lars Gustaebel.) + +* The :mod:`threading` module now has an elegantly simple way to support + thread-local data. 
The module contains a :class:`local` class whose attribute + values are local to different threads. :: + + import threading + + data = threading.local() + data.number = 42 + data.url = ('www.python.org', 80) + + Other threads can assign and retrieve their own values for the :attr:`number` + and :attr:`url` attributes. You can subclass :class:`local` to initialize + attributes or to add methods. (Contributed by Jim Fulton.) + +* The :mod:`timeit` module now automatically disables periodic garbage + collection during the timing loop. This change makes consecutive timings more + comparable. (Contributed by Raymond Hettinger.) + +* The :mod:`weakref` module now supports a wider variety of objects including + Python functions, class instances, sets, frozensets, deques, arrays, files, + sockets, and regular expression pattern objects. (Contributed by Raymond + Hettinger.) + +* The :mod:`xmlrpclib` module now supports a multi-call extension for + transmitting multiple XML-RPC calls in a single HTTP operation. (Contributed by + Brian Quinlan.) + +* The :mod:`mpz`, :mod:`rotor`, and :mod:`xreadlines` modules have been + removed. + +.. % ====================================================================== +.. % whole new modules get described in subsections here +.. % ===================== + + +cookielib +--------- + +The :mod:`cookielib` library supports client-side handling for HTTP cookies, +mirroring the :mod:`Cookie` module's server-side cookie support. Cookies are +stored in cookie jars; the library transparently stores cookies offered by the +web server in the cookie jar, and fetches the cookie from the jar when +connecting to the server. As in web browsers, policy objects control whether +cookies are accepted or not. 
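As a rough illustration of the jar and policy objects described above, here is a minimal sketch. It assumes the ``CookieJar`` and ``DefaultCookiePolicy`` classes of :mod:`cookielib` together with the :mod:`urllib2` ``HTTPCookieProcessor`` handler; the host name ``www.example.com`` is purely hypothetical, and the fallback import is only there so the snippet also runs on later Python versions where the modules were renamed. No network request is made.

```python
try:
    import cookielib                      # module name in Python 2.4
    import urllib2
except ImportError:
    # Later Python versions ship the same modules under different names.
    import http.cookiejar as cookielib
    import urllib.request as urllib2

# Policy object: only accept cookies from one (hypothetical) host.
policy = cookielib.DefaultCookiePolicy(allowed_domains=['www.example.com'])

# The jar holds whatever cookies the policy lets through.
jar = cookielib.CookieJar(policy)

# An opener whose requests read from and write to the jar.
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# No request has been made yet, so the jar is still empty.
cookie_count = len(jar)
```

Calling ``opener.open(url)`` would then store any accepted cookies in ``jar`` and send them back on later requests made through the same opener.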
+ +In order to store cookies across sessions, two implementations of cookie jars +are provided: one that stores cookies in the Netscape format so applications can +use the Mozilla or Lynx cookie files, and one that stores cookies in the same +format as the Perl libwww library. + +:mod:`urllib2` has been changed to interact with :mod:`cookielib`: +:class:`HTTPCookieProcessor` manages a cookie jar that is used when accessing +URLs. + +This module was contributed by John J. Lee. + +.. % ================== + + +doctest +------- + +The :mod:`doctest` module underwent considerable refactoring thanks to Edward +Loper and Tim Peters. Testing can still be as simple as running +:func:`doctest.testmod`, but the refactorings allow customizing the module's +operation in various ways. + +The new :class:`DocTestFinder` class extracts the tests from a given object's +docstrings:: + + def f (x, y): + """>>> f(2,2) + 4 + >>> f(3,2) + 6 + """ + return x*y + + finder = doctest.DocTestFinder() + + # Get list of DocTest instances + tests = finder.find(f) + +The new :class:`DocTestRunner` class then runs individual tests and can produce +a summary of the results:: + + runner = doctest.DocTestRunner() + for t in tests: + failed, tried = runner.run(t) # run() returns (failures, tries) + + runner.summarize(verbose=1) + +The above example produces the following output:: + + 1 items passed all tests: + 2 tests in f + 2 tests in 1 items. + 2 passed and 0 failed. + Test passed. + +:class:`DocTestRunner` uses an instance of the :class:`OutputChecker` class to +compare the expected output with the actual output. This class takes a number +of different flags that customize its behaviour; ambitious users can also write +a completely new subclass of :class:`OutputChecker`. + +The default output checker provides a number of handy features.
For example, +with the :const:`doctest.ELLIPSIS` option flag, an ellipsis (``...``) in the +expected output matches any substring, making it easier to accommodate outputs +that vary in minor ways:: + + def o (n): + """>>> o(1) + <__main__.C instance at 0x...> + >>> + """ + +Another special string, ``<BLANKLINE>``, matches a blank line:: + + def p (n): + """>>> p(1) + <BLANKLINE> + >>> + """ + +Another new capability is producing a diff-style display of the output by +specifying the :const:`doctest.REPORT_UDIFF` (unified diffs), +:const:`doctest.REPORT_CDIFF` (context diffs), or :const:`doctest.REPORT_NDIFF` +(delta-style) option flags. For example:: + + def g (n): + """>>> g(4) + here + is + a + lengthy + >>>""" + L = 'here is a rather lengthy list of words'.split() + for word in L[:n]: + print word + +Running the above function's tests with :const:`doctest.REPORT_UDIFF` specified, +you get the following output:: + + ********************************************************************** + File "t.py", line 15, in g + Failed example: + g(4) + Differences (unified diff with -expected +actual): + @@ -2,3 +2,3 @@ + is + a + -lengthy + +rather + ********************************************************************** + +.. % ====================================================================== + + +Build and C API Changes +======================= + +Some of the changes to Python's build process and to the C API are: + +* Three new convenience macros were added for common return values from + extension functions: :cmacro:`Py_RETURN_NONE`, :cmacro:`Py_RETURN_TRUE`, and + :cmacro:`Py_RETURN_FALSE`. (Contributed by Brett Cannon.) + +* Another new macro, :cmacro:`Py_CLEAR(obj)`, decreases the reference count of + *obj* and sets *obj* to the null pointer. (Contributed by Jim Fulton.) + +* A new function, :cfunc:`PyTuple_Pack(N, obj1, obj2, ..., objN)`, constructs + tuples from a variable-length argument list of Python objects. (Contributed by + Raymond Hettinger.)
+ +* A new function, :cfunc:`PyDict_Contains(d, k)`, implements fast dictionary + lookups without masking exceptions raised during the look-up process. + (Contributed by Raymond Hettinger.) + +* The :cmacro:`Py_IS_NAN(X)` macro returns 1 if its float or double argument + *X* is a NaN. (Contributed by Tim Peters.) + +* C code can avoid unnecessary locking by using the new + :cfunc:`PyEval_ThreadsInitialized` function to tell if any thread operations + have been performed. If this function returns false, no lock operations are + needed. (Contributed by Nick Coghlan.) + +* A new function, :cfunc:`PyArg_VaParseTupleAndKeywords`, is the same as + :cfunc:`PyArg_ParseTupleAndKeywords` but takes a :ctype:`va_list` instead of a + number of arguments. (Contributed by Greg Chapman.) + +* A new method flag, :const:`METH_COEXIST`, allows a function defined in slots + to co-exist with a :ctype:`PyCFunction` having the same name. This can halve + the access time for a method such as :meth:`set.__contains__`. (Contributed by + Raymond Hettinger.) + +* Python can now be built with additional profiling for the interpreter itself, + intended as an aid to people developing the Python core. Providing + :option:`--enable-profiling` to the :program:`configure` script will let you + profile the interpreter with :program:`gprof`, and providing the + :option:`--with-tsc` switch enables profiling using the Pentium's Time-Stamp- + Counter register. Note that the :option:`--with-tsc` switch is slightly + misnamed, because the profiling feature also works on the PowerPC platform, + though that processor architecture doesn't call that register "the TSC + register". (Contributed by Jeremy Hylton.) + +* The :ctype:`tracebackobject` type has been renamed to + :ctype:`PyTracebackObject`. + +.. % ====================================================================== + + +Port-Specific Changes +--------------------- + +* The Windows port now builds under MSVC++ 7.1 as well as version 6.
+ (Contributed by Martin von Löwis.) + +.. % ====================================================================== + + +Porting to Python 2.4 +===================== + +This section lists previously described changes that may require changes to your +code: + +* Left shifts and hexadecimal/octal constants that are too large no longer + trigger a :exc:`FutureWarning` and return a value limited to 32 or 64 bits; + instead they return a long integer. + +* Integer operations will no longer trigger an :exc:`OverflowWarning`. The + :exc:`OverflowWarning` warning will disappear in Python 2.5. + +* The :func:`zip` built-in function and :func:`itertools.izip` now return an + empty list instead of raising a :exc:`TypeError` exception if called with no + arguments. + +* You can no longer compare the :class:`date` and :class:`datetime` instances + provided by the :mod:`datetime` module. Two instances of different classes + will now always be unequal, and relative comparisons (``<``, ``>``) will raise + a :exc:`TypeError`. + +* :func:`dircache.listdir` now passes exceptions to the caller instead of + returning empty lists. + +* :func:`LexicalHandler.startDTD` used to receive the public and system IDs in + the wrong order. This has been corrected; applications relying on the wrong + order need to be fixed. + +* :func:`fcntl.ioctl` now warns if the *mutate* argument is omitted and + relevant. + +* The :mod:`tarfile` module now generates GNU-format tar files by default. + +* Encountering a failure while importing a module no longer leaves a partially- + initialized module object in ``sys.modules``. + +* :const:`None` is now a constant; code that binds a new value to the name + ``None`` is now a syntax error. + +* The :func:`signal.signal` function now raises a :exc:`RuntimeError` exception + for certain illegal values; previously these errors would pass silently. For + example, you can no longer set a handler on the :const:`SIGKILL` signal. + +..
% ====================================================================== + + +.. _acks: + +Acknowledgements +================ + +The author would like to thank the following people for offering suggestions, +corrections and assistance with various drafts of this article: Koray Can, Hye- +Shik Chang, Michael Dyck, Raymond Hettinger, Brian Hurt, Hamish Lawson, Fredrik +Lundh, Sean Reifschneider, Sadruddin Rejeb. + diff --git a/Doc/whatsnew/2.5.rst b/Doc/whatsnew/2.5.rst new file mode 100644 index 0000000..f0429ec --- /dev/null +++ b/Doc/whatsnew/2.5.rst @@ -0,0 +1,2286 @@ +**************************** + What's New in Python 2.5 +**************************** + +:Author: A.M. Kuchling + +.. |release| replace:: 1.01 + +.. % $Id: whatsnew25.tex 56611 2007-07-29 08:26:10Z georg.brandl $ +.. % Fix XXX comments + +This article explains the new features in Python 2.5. The final release of +Python 2.5 is scheduled for August 2006; :pep:`356` describes the planned +release schedule. + +The changes in Python 2.5 are an interesting mix of language and library +improvements. The library enhancements will be more important to Python's user +community, I think, because several widely-useful packages were added. New +modules include ElementTree for XML processing (section :ref:`module-etree`), +the SQLite database module (section :ref:`module-sqlite`), and the :mod:`ctypes` +module for calling C functions (section :ref:`module-ctypes`). + +The language changes are of middling significance. Some pleasant new features +were added, but most of them aren't features that you'll use every day. +Conditional expressions were finally added to the language using a novel syntax; +see section :ref:`pep-308`. The new ':keyword:`with`' statement will make +writing cleanup code easier (section :ref:`pep-343`). Values can now be passed +into generators (section :ref:`pep-342`). Imports are now visible as either +absolute or relative (section :ref:`pep-328`). 
Some corner cases of exception +handling are handled better (section :ref:`pep-341`). All these improvements +are worthwhile, but they're improvements to one specific language feature or +another; none of them are broad modifications to Python's semantics. + +As well as the language and library additions, other improvements and bugfixes +were made throughout the source tree. A search through the SVN change logs +finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and +2.5. (Both figures are likely to be underestimates.) + +This article doesn't try to be a complete specification of the new features; +instead changes are briefly introduced using helpful examples. For full +details, you should always refer to the documentation for Python 2.5 at +http://docs.python.org. If you want to understand the complete implementation +and design rationale, refer to the PEP for a particular new feature. + +Comments, suggestions, and error reports for this document are welcome; please +e-mail them to the author or open a bug in the Python bug tracker. + +.. % ====================================================================== + + +.. _pep-308: + +PEP 308: Conditional Expressions +================================ + +For a long time, people have been requesting a way to write conditional +expressions, which are expressions that return value A or value B depending on +whether a Boolean value is true or false. A conditional expression lets you +write a single assignment statement that has the same effect as the following:: + + if condition: + x = true_value + else: + x = false_value + +There have been endless tedious discussions of syntax on both python-dev and +comp.lang.python. A vote was even held that found the majority of voters wanted +conditional expressions in some form, but there was no syntax that was preferred +by a clear majority. Candidates included C's ``cond ? true_v : false_v``, ``if +cond then true_v else false_v``, and 16 other variations. 
+ +Guido van Rossum eventually chose a surprising syntax:: + + x = true_value if condition else false_value + +Evaluation is still lazy as in existing Boolean expressions, so the order of +evaluation jumps around a bit. The *condition* expression in the middle is +evaluated first, and the *true_value* expression is evaluated only if the +condition was true. Similarly, the *false_value* expression is only evaluated +when the condition is false. + +This syntax may seem strange and backwards; why does the condition go in the +*middle* of the expression, and not in the front as in C's ``c ? x : y``? The +decision was checked by applying the new syntax to the modules in the standard +library and seeing how the resulting code read. In many cases where a +conditional expression is used, one value seems to be the 'common case' and one +value is an 'exceptional case', used only on rarer occasions when the condition +isn't met. The conditional syntax makes this pattern a bit more obvious:: + + contents = ((doc + '\n') if doc else '') + +I read the above statement as meaning "here *contents* is usually assigned a +value of ``doc+'\n'``; sometimes *doc* is empty, in which special case an empty +string is returned." I doubt I will use conditional expressions very often +where there isn't a clear common and uncommon case. + +There was some discussion of whether the language should require surrounding +conditional expressions with parentheses. The decision was made to *not* +require parentheses in the Python language's grammar, but as a matter of style I +think you should always use them. Consider these two statements:: + + # First version -- no parens + level = 1 if logging else 0 + + # Second version -- with parens + level = (1 if logging else 0) + +In the first version, I think a reader's eye might group the statement into +'level = 1', 'if logging', 'else 0', and think that the condition decides +whether the assignment to *level* is performed. 
The second version reads +better, in my opinion, because it makes it clear that the assignment is always +performed and the choice is being made between two values. + +Another reason for including the brackets: a few odd combinations of list +comprehensions and lambdas could look like incorrect conditional expressions. +See :pep:`308` for some examples. If you put parentheses around your +conditional expressions, you won't run into this case. + + +.. seealso:: + + :pep:`308` - Conditional Expressions + PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas + Wouters. + +.. % ====================================================================== + + +.. _pep-309: + +PEP 309: Partial Function Application +===================================== + +The :mod:`functools` module is intended to contain tools for functional-style +programming. + +One useful tool in this module is the :func:`partial` function. For programs +written in a functional style, you'll sometimes want to construct variants of +existing functions that have some of the parameters filled in. Consider a +Python function ``f(a, b, c)``; you could create a new function ``g(b, c)`` that +was equivalent to ``f(1, b, c)``. This is called "partial function +application". + +:func:`partial` takes the arguments ``(function, arg1, arg2, ... kwarg1=value1, +kwarg2=value2)``. The resulting object is callable, so you can just call it to +invoke *function* with the filled-in arguments. + +Here's a small but realistic example:: + + import functools + + def log (message, subsystem): + "Write the contents of 'message' to the specified subsystem." + print '%s: %s' % (subsystem, message) + ... + + server_log = functools.partial(log, subsystem='server') + server_log('Unable to open socket') + +Here's another example, from a program that uses PyGTK. Here a context- +sensitive pop-up menu is being constructed dynamically. 
The callback provided +for the menu option is a partially applied version of the :meth:`open_item` +method, where the first argument has been provided. :: + + ... + class Application: + def open_item(self, path): + ... + def init (self): + open_func = functools.partial(self.open_item, item_path) + popup_menu.append( ("Open", open_func, 1) ) + +Another function in the :mod:`functools` module is the +:func:`update_wrapper(wrapper, wrapped)` function that helps you write well- +behaved decorators. :func:`update_wrapper` copies the name, module, and +docstring attribute to a wrapper function so that tracebacks inside the wrapped +function are easier to understand. For example, you might write:: + + def my_decorator(f): + def wrapper(*args, **kwds): + print 'Calling decorated function' + return f(*args, **kwds) + functools.update_wrapper(wrapper, f) + return wrapper + +:func:`wraps` is a decorator that can be used inside your own decorators to copy +the wrapped function's information. An alternate version of the previous +example would be:: + + def my_decorator(f): + @functools.wraps(f) + def wrapper(*args, **kwds): + print 'Calling decorated function' + return f(*args, **kwds) + return wrapper + + +.. seealso:: + + :pep:`309` - Partial Function Application + PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick + Coghlan, with adaptations by Raymond Hettinger. + +.. % ====================================================================== + + +.. _pep-314: + +PEP 314: Metadata for Python Software Packages v1.1 +=================================================== + +Some simple dependency support was added to Distutils. The :func:`setup` +function now has ``requires``, ``provides``, and ``obsoletes`` keyword +parameters. When you build a source distribution using the ``sdist`` command, +the dependency information will be recorded in the :file:`PKG-INFO` file. 
+ +Another new keyword parameter is ``download_url``, which should be set to a URL +for the package's source code. This means it's now possible to look up an entry +in the package index, determine the dependencies for a package, and download the +required packages. :: + + from distutils.core import setup + + VERSION = '1.0' + setup(name='PyPackage', + version=VERSION, + requires=['numarray', 'zlib (>=1.1.4)'], + obsoletes=['OldPackage'], + download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz' + % VERSION), + ) + +Another enhancement to the Python package index at +http://cheeseshop.python.org is storing source and binary archives for a +package. The new :command:`upload` Distutils command will upload a package to +the repository. + +Before a package can be uploaded, you must be able to build a distribution using +the :command:`sdist` Distutils command. Once that works, you can run ``python +setup.py upload`` to add your package to the PyPI archive. Optionally you can +GPG-sign the package by supplying the :option:`--sign` and :option:`--identity` +options. + +Package uploading was implemented by Martin von Löwis and Richard Jones. + + +.. seealso:: + + :pep:`314` - Metadata for Python Software Packages v1.1 + PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake; + implemented by Richard Jones and Fred Drake. + +.. % ====================================================================== + + +.. _pep-328: + +PEP 328: Absolute and Relative Imports +====================================== + +The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now +be used to enclose the names imported from a module using the ``from ... import +...`` statement, making it easier to import many different names. + +The more complicated part has been implemented in Python 2.5: importing a module +can be specified to use absolute or package-relative imports. The plan is to +move toward making absolute imports the default in future versions of Python.
+ +Let's say you have a package directory like this:: + + pkg/ + pkg/__init__.py + pkg/main.py + pkg/string.py + +This defines a package named :mod:`pkg` containing the :mod:`pkg.main` and +:mod:`pkg.string` submodules. + +Consider the code in the :file:`main.py` module. What happens if it executes +the statement ``import string``? In Python 2.4 and earlier, it will first look +in the package's directory to perform a relative import, find +:file:`pkg/string.py`, import the contents of that file as the +:mod:`pkg.string` module, and bind that module to the name ``string`` in the +:mod:`pkg.main` module's namespace. + +That's fine if :mod:`pkg.string` was what you wanted. But what if you wanted +Python's standard :mod:`string` module? There's no clean way to ignore +:mod:`pkg.string` and look for the standard module; generally you had to look at +the contents of ``sys.modules``, which is slightly unclean. Holger Krekel's +:mod:`py.std` package provides a tidier way to perform imports from the standard +library, ``import py ; py.std.string.join()``, but that package isn't available +on all Python installations. + +Reading code which relies on relative imports is also less clear, because a +reader may be confused about which module, :mod:`string` or :mod:`pkg.string`, +is intended to be used. Python users soon learned not to duplicate the names of +standard library modules in the names of their packages' submodules, but you +can't protect against having your submodule's name being used for a new module +added in a future version of Python. + +In Python 2.5, you can switch :keyword:`import`'s behaviour to absolute imports +using a ``from __future__ import absolute_import`` directive. This absolute- +import behaviour will become the default in a future version (probably Python +2.7). Once absolute imports are the default, ``import string`` will always +find the standard library's version.
It's suggested that users should begin +using absolute imports as much as possible, so it's preferable to begin writing +``from pkg import string`` in your code. + +Relative imports are still possible by adding a leading period to the module +name when using the ``from ... import`` form:: + + # Import names from pkg.string + from .string import name1, name2 + # Import pkg.string + from . import string + +This imports the :mod:`string` module relative to the current package, so in +:mod:`pkg.main` this will import *name1* and *name2* from :mod:`pkg.string`. +Additional leading periods perform the relative import starting from the parent +of the current package. For example, code in the :mod:`A.B.C` module can do:: + + from . import D # Imports A.B.D + from .. import E # Imports A.E + from ..F import G # Imports A.F.G + +Leading periods cannot be used with the ``import modname`` form of the import +statement, only the ``from ... import`` form. + + +.. seealso:: + + :pep:`328` - Imports: Multi-Line and Absolute/Relative + PEP written by Aahz; implemented by Thomas Wouters. + + http://codespeak.net/py/current/doc/index.html + The py library by Holger Krekel, which contains the :mod:`py.std` package. + +.. % ====================================================================== + + +.. _pep-338: + +PEP 338: Executing Modules as Scripts +===================================== + +The :option:`-m` switch added in Python 2.4 to execute a module as a script +gained a few more abilities. Instead of being implemented in C code inside the +Python interpreter, the switch now uses an implementation in a new module, +:mod:`runpy`. + +The :mod:`runpy` module implements a more sophisticated import mechanism so that +it's now possible to run modules in a package such as :mod:`pychecker.checker`. +The module also supports alternative import mechanisms such as the +:mod:`zipimport` module. 
This means you can add a .zip archive's path to +``sys.path`` and then use the :option:`-m` switch to execute code from the +archive. + + +.. seealso:: + + :pep:`338` - Executing modules as scripts + PEP written and implemented by Nick Coghlan. + +.. % ====================================================================== + + +.. _pep-341: + +PEP 341: Unified try/except/finally +=================================== + +Until Python 2.5, the :keyword:`try` statement came in two flavours. You could +use a :keyword:`finally` block to ensure that code is always executed, or one or +more :keyword:`except` blocks to catch specific exceptions. You couldn't +combine both :keyword:`except` blocks and a :keyword:`finally` block, because +generating the right bytecode for the combined version was complicated and it +wasn't clear what the semantics of the combined statement should be. + +Guido van Rossum spent some time working with Java, which does support the +equivalent of combining :keyword:`except` blocks and a :keyword:`finally` block, +and this clarified what the statement should mean. In Python 2.5, you can now +write:: + + try: + block-1 ... + except Exception1: + handler-1 ... + except Exception2: + handler-2 ... + else: + else-block + finally: + final-block + +The code in *block-1* is executed. If the code raises an exception, the various +:keyword:`except` blocks are tested: if the exception is of class +:class:`Exception1`, *handler-1* is executed; otherwise if it's of class +:class:`Exception2`, *handler-2* is executed, and so forth. If no exception is +raised, the *else-block* is executed. + +No matter what happened previously, the *final-block* is executed once the code +block is complete and any raised exceptions handled. Even if there's an error in +an exception handler or the *else-block* and a new exception is raised, the code +in the *final-block* is still run. + + +.. 
seealso:: + + :pep:`341` - Unifying try-except and try-finally + PEP written by Georg Brandl; implementation by Thomas Lee. + +.. % ====================================================================== + + +.. _pep-342: + +PEP 342: New Generator Features +=============================== + +Python 2.5 adds a simple way to pass values *into* a generator. As introduced in +Python 2.3, generators only produce output; once a generator's code was invoked +to create an iterator, there was no way to pass any new information into the +function when its execution is resumed. Sometimes the ability to pass in some +information would be useful. Hackish solutions to this include making the +generator's code look at a global variable and then changing the global +variable's value, or passing in some mutable object that callers then modify. + +To refresh your memory of basic generators, here's a simple example:: + + def counter (maximum): + i = 0 + while i < maximum: + yield i + i += 1 + +When you call ``counter(10)``, the result is an iterator that returns the values +from 0 up to 9. On encountering the :keyword:`yield` statement, the iterator +returns the provided value and suspends the function's execution, preserving the +local variables. Execution resumes on the following call to the iterator's +:meth:`next` method, picking up after the :keyword:`yield` statement. + +In Python 2.3, :keyword:`yield` was a statement; it didn't return any value. In +2.5, :keyword:`yield` is now an expression, returning a value that can be +assigned to a variable or otherwise operated on:: + + val = (yield i) + +I recommend that you always put parentheses around a :keyword:`yield` expression +when you're doing something with the returned value, as in the above example. +The parentheses aren't always necessary, but it's easier to always add them +instead of having to remember when they're needed. 
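To make the parenthesized form concrete, here is a small sketch of a generator that uses the value of a :keyword:`yield` expression inside a larger operation. The name ``running_total`` is hypothetical and chosen only for illustration; values are passed in with the :meth:`send` method that the following paragraphs describe, and ``send(None)`` is used to start the generator so that the snippet does not depend on how :meth:`next` is spelled in a given Python version.

```python
def running_total():
    """Yield the sum of every value sent in so far."""
    total = 0
    while True:
        # The yield expression takes part in an addition, so it is
        # parenthesized; a plain next() call makes it evaluate to None.
        total = total + ((yield total) or 0)

acc = running_total()
first = acc.send(None)      # start the generator; runs up to the first yield
after_five = acc.send(5)    # resume with 5, receive the updated total
after_seven = acc.send(7)   # resume with 7, receive the updated total
```

Treating a missing value as 0 with ``or 0`` is a design choice for this sketch only; a generator that must distinguish :const:`None` from a real value should test for it explicitly.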
+ +(:pep:`342` explains the exact rules, which are that a :keyword:`yield`\ +-expression must always be parenthesized except when it occurs at the top-level +expression on the right-hand side of an assignment. This means you can write +``val = yield i`` but have to use parentheses when there's an operation, as in +``val = (yield i) + 12``.) + +Values are sent into a generator by calling its :meth:`send(value)` method. The +generator's code is then resumed and the :keyword:`yield` expression returns the +specified *value*. If the regular :meth:`next` method is called, the +:keyword:`yield` returns :const:`None`. + +Here's the previous example, modified to allow changing the value of the +internal counter. :: + + def counter (maximum): + i = 0 + while i < maximum: + val = (yield i) + # If value provided, change counter + if val is not None: + i = val + else: + i += 1 + +And here's an example of changing the counter:: + + >>> it = counter(10) + >>> print it.next() + 0 + >>> print it.next() + 1 + >>> print it.send(8) + 8 + >>> print it.next() + 9 + >>> print it.next() + Traceback (most recent call last): + File "t.py", line 15, in ? + print it.next() + StopIteration + +:keyword:`yield` will usually return :const:`None`, so you should always check +for this case. Don't just use its value in expressions unless you're sure that +the :meth:`send` method will be the only method used to resume your generator +function. + +In addition to :meth:`send`, there are two other new methods on generators: + +* :meth:`throw(type, value=None, traceback=None)` is used to raise an exception + inside the generator; the exception is raised by the :keyword:`yield` expression + where the generator's execution is paused. + +* :meth:`close` raises a new :exc:`GeneratorExit` exception inside the generator + to terminate the iteration. On receiving this exception, the generator's code + must either raise :exc:`GeneratorExit` or :exc:`StopIteration`. 
Catching the + :exc:`GeneratorExit` exception and returning a value is illegal and will trigger + a :exc:`RuntimeError`; if the function raises some other exception, that + exception is propagated to the caller. :meth:`close` will also be called by + Python's garbage collector when the generator is garbage-collected. + + If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest + using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`. + +The cumulative effect of these changes is to turn generators from one-way +producers of information into both producers and consumers. + +Generators also become *coroutines*, a more generalized form of subroutines. +Subroutines are entered at one point and exited at another point (the top of the +function, and a :keyword:`return` statement), but coroutines can be entered, +exited, and resumed at many different points (the :keyword:`yield` statements). +We'll have to figure out patterns for using coroutines effectively in Python. + +The addition of the :meth:`close` method has one side effect that isn't obvious. +:meth:`close` is called when a generator is garbage-collected, so this means the +generator's code gets one last chance to run before the generator is destroyed. +This last chance means that ``try...finally`` statements in generators can now +be guaranteed to work; the :keyword:`finally` clause will now always get a +chance to run. The syntactic restriction that you couldn't mix :keyword:`yield` +statements with a ``try...finally`` suite has therefore been removed. This +seems like a minor bit of language trivia, but using generators and +``try...finally`` is actually necessary in order to implement the +:keyword:`with` statement described by PEP 343. I'll look at this new statement +in the following section. + +Another even more esoteric effect of this change: previously, the +:attr:`gi_frame` attribute of a generator was always a frame object. 
It's now +possible for :attr:`gi_frame` to be ``None`` once the generator has been +exhausted. + + +.. seealso:: + + :pep:`342` - Coroutines via Enhanced Generators + PEP written by Guido van Rossum and Phillip J. Eby; implemented by Phillip J. + Eby. Includes examples of some fancier uses of generators as coroutines. + + Earlier versions of these features were proposed in :pep:`288` by Raymond + Hettinger and :pep:`325` by Samuele Pedroni. + + http://en.wikipedia.org/wiki/Coroutine + The Wikipedia entry for coroutines. + + http://www.sidhe.org/~dan/blog/archives/000178.html + An explanation of coroutines from a Perl point of view, written by Dan Sugalski. + +.. % ====================================================================== + + +.. _pep-343: + +PEP 343: The 'with' statement +============================= + +The ':keyword:`with`' statement clarifies code that previously would use +``try...finally`` blocks to ensure that clean-up code is executed. In this +section, I'll discuss the statement as it will commonly be used. In the next +section, I'll examine the implementation details and show how to write objects +for use with this statement. + +The ':keyword:`with`' statement is a new control-flow structure whose basic +structure is:: + + with expression [as variable]: + with-block + +The expression is evaluated, and it should result in an object that supports the +context management protocol (that is, has :meth:`__enter__` and :meth:`__exit__` +methods). + +The object's :meth:`__enter__` is called before *with-block* is executed and +therefore can run set-up code. It also may return a value that is bound to the +name *variable*, if given. (Note carefully that *variable* is *not* assigned +the result of *expression*.) + +After execution of the *with-block* is finished, the object's :meth:`__exit__` +method is called, even if the block raised an exception, and can therefore run +clean-up code. 
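As a rough illustration of the order of calls just described, here is a minimal sketch. The ``Tracked`` class and the names ``manager`` and ``bound`` are hypothetical and exist only for this example; they are not part of Python or of PEP 343. The sketch records each step in a list and shows that the name after ``as`` receives the return value of :meth:`__enter__`, not the manager object itself.

```python
# On Python 2.5 itself, "from __future__ import with_statement" must be
# the first statement of the module for this to compile.

class Tracked(object):
    """Hypothetical context manager that records the order of calls."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        # The value returned here is what "as variable" binds.
        return 'value from __enter__'

    def __exit__(self, exc_type, exc_value, tb):
        # Called whether or not the block raised an exception.
        self.events.append('exit')


manager = Tracked()
try:
    with manager as bound:
        manager.events.append('block')
        raise ValueError('failure inside the block')
except ValueError:
    manager.events.append('caught')
```

After this runs, ``manager.events`` lists the steps in the order enter, block, exit, caught, so the clean-up step ran before the exception reached the surrounding ``try`` statement.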
+ +To enable the statement in Python 2.5, you need to add the following directive +to your module:: + + from __future__ import with_statement + +The statement will always be enabled in Python 2.6. + +Some standard Python objects now support the context management protocol and can +be used with the ':keyword:`with`' statement. File objects are one example:: + + with open('/etc/passwd', 'r') as f: + for line in f: + print line + ... more processing code ... + +After this statement has executed, the file object in *f* will have been +automatically closed, even if the :keyword:`for` loop raised an exception part- +way through the block. + +.. note:: + + In this case, *f* is the same object created by :func:`open`, because + :meth:`file.__enter__` returns *self*. + +The :mod:`threading` module's locks and condition variables also support the +':keyword:`with`' statement:: + + lock = threading.Lock() + with lock: + # Critical section of code + ... + +The lock is acquired before the block is executed and always released once the +block is complete. + +The new :func:`localcontext` function in the :mod:`decimal` module makes it easy +to save and restore the current decimal context, which encapsulates the desired +precision and rounding characteristics for computations:: + + from decimal import Decimal, Context, localcontext + + # Displays with default precision of 28 digits + v = Decimal('578') + print v.sqrt() + + with localcontext(Context(prec=16)): + # All code in this block uses a precision of 16 digits. + # The original context is restored on exiting the block. + print v.sqrt() + + +.. _context-managers: + +Writing Context Managers +------------------------ + +Under the hood, the ':keyword:`with`' statement is fairly complicated. Most +people will only use ':keyword:`with`' in company with existing objects and +don't need to know these details, so you can skip the rest of this section if +you like. 
Authors of new objects will need to understand the details of the +underlying implementation and should keep reading. + +A high-level explanation of the context management protocol is: + +* The expression is evaluated and should result in an object called a "context + manager". The context manager must have :meth:`__enter__` and :meth:`__exit__` + methods. + +* The context manager's :meth:`__enter__` method is called. The value returned + is assigned to *VAR*. If no ``'as VAR'`` clause is present, the value is simply + discarded. + +* The code in *BLOCK* is executed. + +* If *BLOCK* raises an exception, the :meth:`__exit__(type, value, traceback)` + is called with the exception details, the same values returned by + :func:`sys.exc_info`. The method's return value controls whether the exception + is re-raised: any false value re-raises the exception, and ``True`` will result + in suppressing it. You'll only rarely want to suppress the exception, because + if you do the author of the code containing the ':keyword:`with`' statement will + never realize anything went wrong. + +* If *BLOCK* didn't raise an exception, the :meth:`__exit__` method is still + called, but *type*, *value*, and *traceback* are all ``None``. + +Let's think through an example. I won't present detailed code but will only +sketch the methods necessary for a database that supports transactions. + +(For people unfamiliar with database terminology: a set of changes to the +database are grouped into a transaction. Transactions can be either committed, +meaning that all the changes are written into the database, or rolled back, +meaning that the changes are all discarded and the database is unchanged. See +any database textbook for more information.) + +Let's assume there's an object representing a database connection. 
Our goal will +be to let the user write code like this:: + + db_connection = DatabaseConnection() + with db_connection as cursor: + cursor.execute('insert into ...') + cursor.execute('delete from ...') + # ... more operations ... + +The transaction should be committed if the code in the block runs flawlessly or +rolled back if there's an exception. Here's the basic interface for +:class:`DatabaseConnection` that I'll assume:: + + class DatabaseConnection: + # Database interface + def cursor (self): + "Returns a cursor object and starts a new transaction" + def commit (self): + "Commits current transaction" + def rollback (self): + "Rolls back current transaction" + +The :meth:`__enter__` method is pretty easy, having only to start a new +transaction. For this application the resulting cursor object would be a useful +result, so the method will return it. The user can then add ``as cursor`` to +their ':keyword:`with`' statement to bind the cursor to a variable name. :: + + class DatabaseConnection: + ... + def __enter__ (self): + # Code to start a new transaction + cursor = self.cursor() + return cursor + +The :meth:`__exit__` method is the most complicated because it's where most of +the work has to be done. The method has to check if an exception occurred. If +there was no exception, the transaction is committed. The transaction is rolled +back if there was an exception. + +In the code below, execution will just fall off the end of the function, +returning the default value of ``None``. ``None`` is false, so the exception +will be re-raised automatically. If you wished, you could be more explicit and +add a :keyword:`return` statement at the marked location. :: + + class DatabaseConnection: + ... + def __exit__ (self, type, value, tb): + if tb is None: + # No exception, so commit + self.commit() + else: + # Exception occurred, so rollback. + self.rollback() + # return False + + +.. 
_module-contextlib: + +The contextlib module +--------------------- + +The new :mod:`contextlib` module provides some functions and a decorator that +are useful for writing objects for use with the ':keyword:`with`' statement. + +The decorator is called :func:`contextmanager`, and lets you write a single +generator function instead of defining a new class. The generator should yield +exactly one value. The code up to the :keyword:`yield` will be executed as the +:meth:`__enter__` method, and the value yielded will be the method's return +value that will get bound to the variable in the ':keyword:`with`' statement's +:keyword:`as` clause, if any. The code after the :keyword:`yield` will be +executed in the :meth:`__exit__` method. Any exception raised in the block will +be raised by the :keyword:`yield` statement. + +Our database example from the previous section could be written using this +decorator as:: + + from contextlib import contextmanager + + @contextmanager + def db_transaction (connection): + cursor = connection.cursor() + try: + yield cursor + except: + connection.rollback() + raise + else: + connection.commit() + + db = DatabaseConnection() + with db_transaction(db) as cursor: + ... + +The :mod:`contextlib` module also has a :func:`nested(mgr1, mgr2, ...)` function +that combines a number of context managers so you don't need to write nested +':keyword:`with`' statements. In this example, the single ':keyword:`with`' +statement both starts a database transaction and acquires a thread lock:: + + lock = threading.Lock() + with nested (db_transaction(db), lock) as (cursor, locked): + ... + +Finally, the :func:`closing(object)` function returns *object* so that it can be +bound to a variable, and calls ``object.close`` at the end of the block. :: + + import urllib, sys + from contextlib import closing + + with closing(urllib.urlopen('http://www.yahoo.com')) as f: + for line in f: + sys.stdout.write(line) + + +.. 
seealso:: + + :pep:`343` - The "with" statement + PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland, + Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a + ':keyword:`with`' statement, which can be helpful in learning how the statement + works. + + The documentation for the :mod:`contextlib` module. + +.. % ====================================================================== + + +.. _pep-352: + +PEP 352: Exceptions as New-Style Classes +======================================== + +Exception classes can now be new-style classes, not just classic classes, and +the built-in :exc:`Exception` class and all the standard built-in exceptions +(:exc:`NameError`, :exc:`ValueError`, etc.) are now new-style classes. + +The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the +inheritance relationships are:: + + BaseException # New in Python 2.5 + |- KeyboardInterrupt + |- SystemExit + |- Exception + |- (all other current built-in exceptions) + +This rearrangement was done because people often want to catch all exceptions +that indicate program errors. :exc:`KeyboardInterrupt` and :exc:`SystemExit` +aren't errors, though, and usually represent an explicit action such as the user +hitting Control-C or code calling :func:`sys.exit`. A bare ``except:`` will +catch all exceptions, so you commonly need to list :exc:`KeyboardInterrupt` and +:exc:`SystemExit` in order to re-raise them. The usual pattern is:: + + try: + ... + except (KeyboardInterrupt, SystemExit): + raise + except: + # Log error... + # Continue running program... + +In Python 2.5, you can now write ``except Exception`` to achieve the same +result, catching all the exceptions that usually indicate errors but leaving +:exc:`KeyboardInterrupt` and :exc:`SystemExit` alone. As in previous versions, +a bare ``except:`` still catches all exceptions. 
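The difference between ``except Exception`` and a bare ``except:`` can be sketched as follows. The helper ``run_and_log`` and the two small task functions are hypothetical names invented for this illustration. The handler reads the exception from :func:`sys.exc_info` rather than binding it in the ``except`` clause, only so that the same text runs unchanged on 2.5 and on later versions.

```python
import sys


def run_and_log(task, log):
    """Call task(); record ordinary errors in log instead of propagating them."""
    try:
        return task()
    except Exception:
        # Only errors derived from Exception are handled here.
        log.append(sys.exc_info()[1])
        return None


def fails():
    raise ValueError('bad input')


def interrupted():
    raise KeyboardInterrupt()


errors = []
run_and_log(fails, errors)            # the ValueError is caught and logged

try:
    run_and_log(interrupted, errors)
    interrupt_escaped = False
except KeyboardInterrupt:
    # KeyboardInterrupt derives from BaseException, not Exception,
    # so the handler above does not intercept it.
    interrupt_escaped = True
```

Here ``errors`` ends up holding just the :exc:`ValueError`, while the :exc:`KeyboardInterrupt` passes through ``run_and_log`` to the caller.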
+ +The goal for Python 3.0 is to require any class raised as an exception to derive +from :exc:`BaseException` or some descendant of :exc:`BaseException`, and future +releases in the Python 2.x series may begin to enforce this constraint. +Therefore, I suggest you begin making all your exception classes derive from +:exc:`Exception` now. It's been suggested that the bare ``except:`` form should +be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this +or not. + +Raising of strings as exceptions, as in the statement ``raise "Error +occurred"``, is deprecated in Python 2.5 and will trigger a warning. The aim is +to be able to remove the string-exception feature in a few releases. + + +.. seealso:: + + :pep:`352` - Required Superclass for Exceptions + PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon. + +.. % ====================================================================== + + +.. _pep-353: + +PEP 353: Using ssize_t as the index type +======================================== + +A wide-ranging change to Python's C API, using a new :ctype:`Py_ssize_t` type +definition instead of :ctype:`int`, will permit the interpreter to handle more +data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit +platforms. + +Various pieces of the Python interpreter used C's :ctype:`int` type to store +sizes or counts; for example, the number of items in a list or tuple was stored +in an :ctype:`int`. The C compilers for most 64-bit platforms still define +:ctype:`int` as a 32-bit type, so that meant that lists could only hold up to +``2**31 - 1`` = 2147483647 items. (There are actually a few different +programming models that 64-bit C compilers can use -- see +http://www.unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the +most commonly available model leaves :ctype:`int` as 32 bits.) 
+ +A limit of 2147483647 items doesn't really matter on a 32-bit platform because +you'll run out of memory before hitting the length limit. Each list item +requires space for a pointer, which is 4 bytes, plus space for a +:ctype:`PyObject` representing the item. 2147483647\*4 is already more bytes +than a 32-bit address space can contain. + +It's possible to address that much memory on a 64-bit platform, however. The +pointers for a list that size would only require 16 GiB of space, so it's not +unreasonable that Python programmers might construct lists that large. +Therefore, the Python interpreter had to be changed to use some type other than +:ctype:`int`, and this will be a 64-bit type on 64-bit platforms. The change +will cause incompatibilities on 64-bit machines, so it was deemed worth making +the transition now, while the number of 64-bit users is still relatively small. +(In 5 or 10 years, we may *all* be on 64-bit machines, and the transition would +be more painful then.) + +This change most strongly affects authors of C extension modules. Python +strings and container types such as lists and tuples now use +:ctype:`Py_ssize_t` to store their size. Functions such as +:cfunc:`PyList_Size` now return :ctype:`Py_ssize_t`. Code in extension modules +may therefore need to have some variables changed to :ctype:`Py_ssize_t`. + +The :cfunc:`PyArg_ParseTuple` and :cfunc:`Py_BuildValue` functions have a new +conversion code, ``n``, for :ctype:`Py_ssize_t`. :cfunc:`PyArg_ParseTuple`'s +``s#`` and ``t#`` still output :ctype:`int` by default, but you can define the +macro :cmacro:`PY_SSIZE_T_CLEAN` before including :file:`Python.h` to make +them return :ctype:`Py_ssize_t`. + +:pep:`353` has a section on conversion guidelines that extension authors should +read to learn about supporting 64-bit platforms. + + +.. seealso:: + + :pep:`353` - Using ssize_t as the index type + PEP written and implemented by Martin von Löwis. + +.. 
% ====================================================================== + + +.. _pep-357: + +PEP 357: The '__index__' method +=============================== + +The NumPy developers had a problem that could only be solved by adding a new +special method, :meth:`__index__`. When using slice notation, as in +``[start:stop:step]``, the values of the *start*, *stop*, and *step* indexes +must all be either integers or long integers. NumPy defines a variety of +specialized integer types corresponding to unsigned and signed integers of 8, +16, 32, and 64 bits, but there was no way to signal that these types could be +used as slice indexes. + +Slicing can't just use the existing :meth:`__int__` method because that method +is also used to implement coercion to integers. If slicing used +:meth:`__int__`, floating-point numbers would also become legal slice indexes +and that's clearly an undesirable behaviour. + +Instead, a new special method called :meth:`__index__` was added. It takes no +arguments and returns an integer giving the slice index to use. For example:: + + class C: + def __index__ (self): + return self.value + +The return value must be either a Python integer or long integer. The +interpreter will check that the type returned is correct, and raises a +:exc:`TypeError` if this requirement isn't met. + +A corresponding :attr:`nb_index` slot was added to the C-level +:ctype:`PyNumberMethods` structure to let C extensions implement this protocol. +:cfunc:`PyNumber_Index(obj)` can be used in extension code to call the +:meth:`__index__` function and retrieve its result. + + +.. seealso:: + + :pep:`357` - Allowing Any Object to be Used for Slicing + PEP written and implemented by Travis Oliphant. + +.. % ====================================================================== + + +.. _other-lang: + +Other Language Changes +====================== + +Here are all of the changes that Python 2.5 makes to the core Python language. 
+ +* The :class:`dict` type has a new hook for letting subclasses provide a default + value when a key isn't contained in the dictionary. When a key isn't found, the + dictionary's :meth:`__missing__(key)` method will be called. This hook is used + to implement the new :class:`defaultdict` class in the :mod:`collections` + module. The following example defines a dictionary that returns zero for any + missing key:: + + class zerodict (dict): + def __missing__ (self, key): + return 0 + + d = zerodict({1:1, 2:2}) + print d[1], d[2] # Prints 1, 2 + print d[3], d[4] # Prints 0, 0 + +* Both 8-bit and Unicode strings have new :meth:`partition(sep)` and + :meth:`rpartition(sep)` methods that simplify a common use case. + + The :meth:`find(S)` method is often used to get an index which is then used to + slice the string and obtain the pieces that are before and after the separator. + :meth:`partition(sep)` condenses this pattern into a single method call that + returns a 3-tuple containing the substring before the separator, the separator + itself, and the substring after the separator. If the separator isn't found, + the first element of the tuple is the entire string and the other two elements + are empty. :meth:`rpartition(sep)` also returns a 3-tuple but starts searching + from the end of the string; the ``r`` stands for 'reverse'. + + Some examples:: + + >>> ('http://www.python.org').partition('://') + ('http', '://', 'www.python.org') + >>> ('file:/usr/share/doc/index.html').partition('://') + ('file:/usr/share/doc/index.html', '', '') + >>> (u'Subject: a quick question').partition(':') + (u'Subject', u':', u' a quick question') + >>> 'www.python.org'.rpartition('.') + ('www.python', '.', 'org') + >>> 'www.python.org'.rpartition(':') + ('', '', 'www.python.org') + + (Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.) + +* The :meth:`startswith` and :meth:`endswith` methods of string types now accept + tuples of strings to check for. 
:: + + def is_image_file (filename): + return filename.endswith(('.gif', '.jpg', '.tiff')) + + (Implemented by Georg Brandl following a suggestion by Tom Lynn.) + + .. % RFE #1491485 + +* The :func:`min` and :func:`max` built-in functions gained a ``key`` keyword + parameter analogous to the ``key`` argument for :meth:`sort`. This parameter + supplies a function that takes a single argument and is called for every value + in the list; :func:`min`/:func:`max` will return the element with the + smallest/largest return value from this function. For example, to find the + longest string in a list, you can do:: + + L = ['medium', 'longest', 'short'] + # Prints 'longest' + print max(L, key=len) + # Prints 'short', because lexicographically 'short' has the largest value + print max(L) + + (Contributed by Steven Bethard and Raymond Hettinger.) + +* Two new built-in functions, :func:`any` and :func:`all`, evaluate whether an + iterator contains any true or false values. :func:`any` returns :const:`True` + if any value returned by the iterator is true; otherwise it will return + :const:`False`. :func:`all` returns :const:`True` only if all of the values + returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and + implemented by Raymond Hettinger.) + +* The result of a class's :meth:`__hash__` method can now be either a long + integer or a regular integer. If a long integer is returned, the hash of that + value is taken. In earlier versions the hash value was required to be a + regular integer, but in 2.5 the :func:`id` built-in was changed to always + return non-negative numbers, and users often seem to use ``id(self)`` in + :meth:`__hash__` methods (though this is discouraged). + + .. % Bug #1536021 + +* ASCII is now the default encoding for modules. It's now a syntax error if a + module contains string literals with 8-bit characters but doesn't have an + encoding declaration. In Python 2.4 this triggered a warning, not a syntax + error. 
See :pep:`263` for how to declare a module's encoding; for example, you + might add a line like this near the top of the source file:: + + # -*- coding: latin1 -*- + +* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to + compare a Unicode string and an 8-bit string that can't be converted to Unicode + using the default ASCII encoding. The result of the comparison is false:: + + >>> chr(128) == unichr(128) # Can't convert chr(128) to Unicode + __main__:1: UnicodeWarning: Unicode equal comparison failed + to convert both arguments to Unicode - interpreting them + as being unequal + False + >>> chr(127) == unichr(127) # chr(127) can be converted + True + + Previously this would raise a :class:`UnicodeDecodeError` exception, but in 2.5 + this could result in puzzling problems when accessing a dictionary. If you + looked up ``unichr(128)`` and ``chr(128)`` was being used as a key, you'd get a + :class:`UnicodeDecodeError` exception. Other changes in 2.5 resulted in this + exception being raised instead of suppressed by the code in :file:`dictobject.c` + that implements dictionaries. + + Raising an exception for such a comparison is strictly correct, but the change + might have broken code, so instead :class:`UnicodeWarning` was introduced. + + (Implemented by Marc-André Lemburg.) + +* One error that Python programmers sometimes make is forgetting to include an + :file:`__init__.py` module in a package directory. Debugging this mistake can be + confusing, and usually requires running Python with the :option:`-v` switch to + log all the paths searched. In Python 2.5, a new :exc:`ImportWarning` warning is + triggered when an import would have picked up a directory as a package but no + :file:`__init__.py` was found. This warning is silently ignored by default; + provide the :option:`-Wd` option when running the Python executable to display + the warning message. (Implemented by Thomas Wouters.) 
+ +* The list of base classes in a class definition can now be empty. As an + example, this is now legal:: + + class C(): + pass + + (Implemented by Brett Cannon.) + +.. % ====================================================================== + + +.. _interactive: + +Interactive Interpreter Changes +------------------------------- + +In the interactive interpreter, ``quit`` and ``exit`` have long been strings so +that new users get a somewhat helpful message when they try to quit:: + + >>> quit + 'Use Ctrl-D (i.e. EOF) to exit.' + +In Python 2.5, ``quit`` and ``exit`` are now objects that still produce string +representations of themselves, but are also callable. Newbies who try ``quit()`` +or ``exit()`` will now exit the interpreter as they expect. (Implemented by +Georg Brandl.) + +The Python executable now accepts the standard long options :option:`--help` +and :option:`--version`; on Windows, it also accepts the :option:`/?` option +for displaying a help message. (Implemented by Georg Brandl.) + +.. % ====================================================================== + + +.. _opts: + +Optimizations +------------- + +Several of the optimizations were developed at the NeedForSpeed sprint, an event +held in Reykjavik, Iceland, from May 21--28 2006. The sprint focused on speed +enhancements to the CPython implementation and was funded by EWT LLC with local +support from CCP Games. Those optimizations added at this sprint are specially +marked in the following list. + +* When they were introduced in Python 2.4, the built-in :class:`set` and + :class:`frozenset` types were built on top of Python's dictionary type. In 2.5 + the internal data structure has been customized for implementing sets, and as a + result sets will use a third less memory and are somewhat faster. (Implemented + by Raymond Hettinger.) + +* The speed of some Unicode operations, such as finding substrings, string + splitting, and character map encoding and decoding, has been improved. 
+ (Substring search and splitting improvements were added by Fredrik Lundh and + Andrew Dalke at the NeedForSpeed sprint. Character maps were improved by Walter + Dörwald and Martin von Löwis.) + + .. % Patch 1313939, 1359618 + +* The :func:`long(str, base)` function is now faster on long digit strings + because fewer intermediate results are calculated. The peak is for strings of + around 800--1000 digits where the function is 6 times faster. (Contributed by + Alan McIntyre and committed at the NeedForSpeed sprint.) + + .. % Patch 1442927 + +* It's now illegal to mix iterating over a file with ``for line in file`` and + calling the file object's :meth:`read`/:meth:`readline`/:meth:`readlines` + methods. Iteration uses an internal buffer and the :meth:`read\*` methods + don't use that buffer. Instead they would return the data following the + buffer, causing the data to appear out of order. Mixing iteration and these + methods will now trigger a :exc:`ValueError` from the :meth:`read\*` method. + (Implemented by Thomas Wouters.) + + .. % Patch 1397960 + +* The :mod:`struct` module now compiles structure format strings into an + internal representation and caches this representation, yielding a 20% speedup. + (Contributed by Bob Ippolito at the NeedForSpeed sprint.) + +* The :mod:`re` module got a 1 or 2% speedup by switching to Python's allocator + functions instead of the system's :cfunc:`malloc` and :cfunc:`free`. + (Contributed by Jack Diederich at the NeedForSpeed sprint.) + +* The code generator's peephole optimizer now performs simple constant folding + in expressions. If you write something like ``a = 2+3``, the code generator + will do the arithmetic and produce code corresponding to ``a = 5``. (Proposed + and implemented by Raymond Hettinger.) + +* Function calls are now faster because code objects now keep the most recently + finished frame (a "zombie frame") in an internal field of the code object, + reusing it the next time the code object is invoked. 
(Original patch by Michael + Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed + sprint.) Frame objects are also slightly smaller, which may improve cache + locality and reduce memory usage a bit. (Contributed by Neal Norwitz.) + + .. % Patch 876206 + .. % Patch 1337051 + +* Python's built-in exceptions are now new-style classes, a change that speeds + up instantiation considerably. Exception handling in Python 2.5 is therefore + about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and + Sean Reifschneider at the NeedForSpeed sprint.) + +* Importing now caches the paths tried, recording whether they exist or not so + that the interpreter makes fewer :cfunc:`open` and :cfunc:`stat` calls on + startup. (Contributed by Martin von Löwis and Georg Brandl.) + + .. % Patch 921466 + +.. % ====================================================================== + + +.. _modules: + +New, Improved, and Removed Modules +================================== + +The standard library received many enhancements and bug fixes in Python 2.5. +Here's a partial list of the most notable changes, sorted alphabetically by +module name. Consult the :file:`Misc/NEWS` file in the source tree for a more +complete list of changes, or look through the SVN logs for all the details. + +* The :mod:`audioop` module now supports the a-LAW encoding, and the code for + u-LAW encoding has been improved. (Contributed by Lars Immisch.) + +* The :mod:`codecs` module gained support for incremental codecs. The + :func:`codecs.lookup` function now returns a :class:`CodecInfo` instance instead + of a tuple. :class:`CodecInfo` instances behave like a 4-tuple to preserve + backward compatibility but also have the attributes :attr:`encode`, + :attr:`decode`, :attr:`incrementalencoder`, :attr:`incrementaldecoder`, + :attr:`streamwriter`, and :attr:`streamreader`. 
Incremental codecs can receive + input and produce output in multiple chunks; the output is the same as if the + entire input was fed to the non-incremental codec. See the :mod:`codecs` module + documentation for details. (Designed and implemented by Walter Dörwald.) + + .. % Patch 1436130 + +* The :mod:`collections` module gained a new type, :class:`defaultdict`, that + subclasses the standard :class:`dict` type. The new type mostly behaves like a + dictionary but constructs a default value when a key isn't present, + automatically adding it to the dictionary for the requested key value. + + The first argument to :class:`defaultdict`'s constructor is a factory function + that gets called whenever a key is requested but not found. This factory + function receives no arguments, so you can use built-in type constructors such + as :func:`list` or :func:`int`. For example, you can make an index of words + based on their initial letter like this:: + + from collections import defaultdict + + words = """Nel mezzo del cammin di nostra vita + mi ritrovai per una selva oscura + che la diritta via era smarrita""".lower().split() + + index = defaultdict(list) + + for w in words: + init_letter = w[0] + index[init_letter].append(w) + + Printing ``index`` results in the following output:: + + defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'], + 'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'], + 'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'], + 'p': ['per'], 's': ['selva', 'smarrita'], + 'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']}) + + (Contributed by Guido van Rossum.) + +* The :class:`deque` double-ended queue type supplied by the :mod:`collections` + module now has a :meth:`remove(value)` method that removes the first occurrence + of *value* in the queue, raising :exc:`ValueError` if the value isn't found. + (Contributed by Raymond Hettinger.) + +* New module: The :mod:`contextlib` module contains helper functions for use + with the new ':keyword:`with`' statement.
See section :ref:`module-contextlib` + for more about this module. + +* New module: The :mod:`cProfile` module is a C implementation of the existing + :mod:`profile` module that has much lower overhead. The module's interface is + the same as :mod:`profile`: you run ``cProfile.run('main()')`` to profile a + function, can save profile data to a file, etc. It's not yet known if the + Hotshot profiler, which is also written in C but doesn't match the + :mod:`profile` module's interface, will continue to be maintained in future + versions of Python. (Contributed by Armin Rigo.) + + Also, the :mod:`pstats` module for analyzing the data measured by the profiler + now supports directing the output to any file object by supplying a *stream* + argument to the :class:`Stats` constructor. (Contributed by Skip Montanaro.) + +* The :mod:`csv` module, which parses files in comma-separated value format, + received several enhancements and a number of bugfixes. You can now set the + maximum size in bytes of a field by calling the + :meth:`csv.field_size_limit(new_limit)` function; omitting the *new_limit* + argument will return the currently-set limit. The :class:`reader` class now has + a :attr:`line_num` attribute that counts the number of physical lines read from + the source; records can span multiple physical lines, so :attr:`line_num` is not + the same as the number of records read. + + The CSV parser is now stricter about multi-line quoted fields. Previously, if a + line ended within a quoted field without a terminating newline character, a + newline would be inserted into the returned field. This behavior caused problems + when reading files that contained carriage return characters within fields, so + the code was changed to return the field without inserting newlines. As a + consequence, if newlines embedded within fields are important, the input should + be split into lines in a manner that preserves the newline characters. 
+ + (Contributed by Skip Montanaro and Andrew McNamara.) + +* The :class:`datetime` class in the :mod:`datetime` module now has a + :meth:`strptime(string, format)` method for parsing date strings, contributed + by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and + :func:`time.strftime`:: + + from datetime import datetime + + ts = datetime.strptime('10:13:15 2006-03-07', + '%H:%M:%S %Y-%m-%d') + +* The :meth:`SequenceMatcher.get_matching_blocks` method in the :mod:`difflib` + module now guarantees to return a minimal list of blocks describing matching + subsequences. Previously, the algorithm would occasionally break a block of + matching elements into two list entries. (Enhancement by Tim Peters.) + +* The :mod:`doctest` module gained a ``SKIP`` option that keeps an example from + being executed at all. This is intended for code snippets that are usage + examples intended for the reader and aren't actually test cases. + + An *encoding* parameter was added to the :func:`testfile` function and the + :class:`DocFileSuite` class to specify the file's encoding. This makes it + easier to use non-ASCII characters in tests contained within a docstring. + (Contributed by Bjorn Tillenius.) + + .. % Patch 1080727 + +* The :mod:`email` package has been updated to version 4.0. (Contributed by + Barry Warsaw.) + + .. % XXX need to provide some more detail here + +* The :mod:`fileinput` module was made more flexible. Unicode filenames are now + supported, and a *mode* parameter that defaults to ``"r"`` was added to the + :func:`input` function to allow opening files in binary or universal-newline + mode. Another new parameter, *openhook*, lets you use a function other than + :func:`open` to open the input files. Once you're iterating over the set of + files, the :class:`FileInput` object's new :meth:`fileno` returns the file + descriptor for the currently opened file. (Contributed by Georg Brandl.) 
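As a rough sketch of the :mod:`fileinput` additions described above, the following passes a custom *openhook* and records the value of :meth:`fileno` while iterating. The ``open_latin1`` helper and the temporary file names are hypothetical, and :func:`io.open` is a later addition than 2.5, used here only so the sketch also runs on current interpreters:

```python
import fileinput
import io
import os
import shutil
import tempfile


def open_latin1(filename, mode):
    # Hypothetical hook: any callable accepting (filename, mode) and
    # returning an open file object can be used as an openhook.
    return io.open(filename, mode, encoding="latin-1")


# Create two small Latin-1 input files so the sketch is self-contained.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in (("a.txt", u"caf\xe9\n"), ("b.txt", u"na\xefve\n")):
    path = os.path.join(tmpdir, name)
    with io.open(path, "w", encoding="latin-1") as f:
        f.write(text)
    paths.append(path)

lines = []
descriptors = []
reader = fileinput.input(paths, openhook=open_latin1)
for line in reader:
    lines.append(line.rstrip(u"\n"))
    # fileno() reports the descriptor of the file currently being read.
    descriptors.append(reader.fileno())
reader.close()

shutil.rmtree(tmpdir)
```

The hook sees each filename in turn, so decoding or decompression can be chosen per file instead of once for the whole run.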
+ +* In the :mod:`gc` module, the new :func:`get_count` function returns a 3-tuple + containing the current collection counts for the three GC generations. This is + accounting information for the garbage collector; when these counts reach a + specified threshold, a garbage collection sweep will be made. The existing + :func:`gc.collect` function now takes an optional *generation* argument of 0, 1, + or 2 to specify which generation to collect. (Contributed by Barry Warsaw.) + +* The :func:`nsmallest` and :func:`nlargest` functions in the :mod:`heapq` + module now support a ``key`` keyword parameter similar to the one provided by + the :func:`min`/:func:`max` functions and the :meth:`sort` methods. For + example:: + + >>> import heapq + >>> L = ["short", 'medium', 'longest', 'longer still'] + >>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically + ['longer still', 'longest'] + >>> heapq.nsmallest(2, L, key=len) # Return two shortest elements + ['short', 'medium'] + + (Contributed by Raymond Hettinger.) + +* The :func:`itertools.islice` function now accepts ``None`` for the start and + step arguments. This makes it more compatible with the attributes of slice + objects, so that you can now write the following:: + + s = slice(5) # Create slice object + itertools.islice(iterable, s.start, s.stop, s.step) + + (Contributed by Raymond Hettinger.) + +* The :func:`format` function in the :mod:`locale` module has been modified and + two new functions were added, :func:`format_string` and :func:`currency`. + + The :func:`format` function's *val* parameter could previously be a string as + long as no more than one %char specifier appeared; now the parameter must be + exactly one %char specifier with no surrounding text. An optional *monetary* + parameter was also added which, if ``True``, will use the locale's rules for + formatting currency in placing a separator between groups of three digits. 
+ + To format strings with multiple %char specifiers, use the new + :func:`format_string` function that works like :func:`format` but also supports + mixing %char specifiers with arbitrary text. + + A new :func:`currency` function was also added that formats a number according + to the current locale's settings. + + (Contributed by Georg Brandl.) + + .. % Patch 1180296 + +* The :mod:`mailbox` module underwent a massive rewrite to add the capability to + modify mailboxes in addition to reading them. A new set of classes that include + :class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and + have an :meth:`add(message)` method to add messages, :meth:`remove(key)` to + remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox. + The following example converts a maildir-format mailbox into an mbox-format + one:: + + import mailbox + + # 'factory=None' uses email.Message.Message as the class representing + # individual messages. + src = mailbox.Maildir('maildir', factory=None) + dest = mailbox.mbox('/tmp/mbox') + + for msg in src: + dest.add(msg) + + (Contributed by Gregory K. Johnson. Funding was provided by Google's 2005 + Summer of Code.) + +* New module: the :mod:`msilib` module allows creating Microsoft Installer + :file:`.msi` files and CAB files. Some support for reading the :file:`.msi` + database is also included. (Contributed by Martin von Löwis.) + +* The :mod:`nis` module now supports accessing domains other than the system + default domain by supplying a *domain* argument to the :func:`nis.match` and + :func:`nis.maps` functions. (Contributed by Ben Bell.) + +* The :mod:`operator` module's :func:`itemgetter` and :func:`attrgetter` + functions now support multiple fields. A call such as + ``operator.attrgetter('a', 'b')`` will return a function that retrieves the + :attr:`a` and :attr:`b` attributes. 
Combining this new feature with the + :meth:`sort` method's ``key`` parameter lets you easily sort lists using + multiple fields. (Contributed by Raymond Hettinger.) + +* The :mod:`optparse` module was updated to version 1.5.1 of the Optik library. + The :class:`OptionParser` class gained an :attr:`epilog` attribute, a string + that will be printed after the help message, and a :meth:`destroy` method to + break reference cycles created by the object. (Contributed by Greg Ward.) + +* The :mod:`os` module underwent several changes. The :attr:`stat_float_times` + variable now defaults to true, meaning that :func:`os.stat` will now return time + values as floats. (This doesn't necessarily mean that :func:`os.stat` will + return times that are precise to fractions of a second; not all systems support + such precision.) + + Constants named :attr:`os.SEEK_SET`, :attr:`os.SEEK_CUR`, and + :attr:`os.SEEK_END` have been added; these are the parameters to the + :func:`os.lseek` function. Two new constants for locking are + :attr:`os.O_SHLOCK` and :attr:`os.O_EXLOCK`. + + Two new functions, :func:`wait3` and :func:`wait4`, were added. They're similar + to the :func:`waitpid` function, which waits for a child process to exit and returns + a tuple of the process ID and its exit status, but :func:`wait3` and + :func:`wait4` return additional information. :func:`wait3` doesn't take a + process ID as input, so it waits for any child process to exit and returns a + 3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the + :func:`resource.getrusage` function. :func:`wait4(pid)` does take a process ID. + (Contributed by Chad J. Schroeder.) + + On FreeBSD, the :func:`os.stat` function now returns times with nanosecond + resolution, and the returned object now has :attr:`st_gen` and + :attr:`st_birthtime`. The :attr:`st_flags` member is also available, if the + platform supports it. (Contributed by Antti Louko and Diego Pettenò.) + + ..
% (Patch 1180695, 1212117) + +* The Python debugger provided by the :mod:`pdb` module can now store lists of + commands to execute when a breakpoint is reached and execution stops. Once + breakpoint #1 has been created, enter ``commands 1`` and enter a series of + commands to be executed, finishing the list with ``end``. The command list can + include commands that resume execution, such as ``continue`` or ``next``. + (Contributed by Grégoire Dooms.) + + .. % Patch 790710 + +* The :mod:`pickle` and :mod:`cPickle` modules no longer accept a return value + of ``None`` from the :meth:`__reduce__` method; the method must return a tuple + of arguments instead. The ability to return ``None`` was deprecated in Python + 2.4, so this completes the removal of the feature. + +* The :mod:`pkgutil` module, containing various utility functions for finding + packages, was enhanced to support PEP 302's import hooks and now also works for + packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.) + +* The pybench benchmark suite by Marc-André Lemburg is now included in the + :file:`Tools/pybench` directory. The pybench suite is an improvement on the + commonly used :file:`pystone.py` program because pybench provides a more + detailed measurement of the interpreter's speed. It times particular operations + such as function calls, tuple slicing, method lookups, and numeric operations, + instead of performing many different operations and reducing the result to a + single number as :file:`pystone.py` does. + +* The :mod:`pyexpat` module now uses version 2.0 of the Expat parser. + (Contributed by Trent Mick.) + +* The :class:`Queue` class provided by the :mod:`Queue` module gained two new + methods. :meth:`join` blocks until all items in the queue have been retrieved + and all processing work on the items has been completed. Worker threads call + the other new method, :meth:`task_done`, to signal that processing for an item + has been completed.
(Contributed by Raymond Hettinger.) + +* The old :mod:`regex` and :mod:`regsub` modules, which have been deprecated + ever since Python 2.0, have finally been deleted. Other deleted modules: + :mod:`statcache`, :mod:`tzparse`, :mod:`whrandom`. + +* Also deleted: the :file:`lib-old` directory, which includes ancient modules + such as :mod:`dircmp` and :mod:`ni`, was removed. :file:`lib-old` wasn't on the + default ``sys.path``, so unless your programs explicitly added the directory to + ``sys.path``, this removal shouldn't affect your code. + +* The :mod:`rlcompleter` module is no longer dependent on importing the + :mod:`readline` module and therefore now works on non-Unix platforms. (Patch + from Robert Kiendl.) + + .. % Patch #1472854 + +* The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now have a + :attr:`rpc_paths` attribute that constrains XML-RPC operations to a limited set + of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``. Setting + :attr:`rpc_paths` to ``None`` or an empty tuple disables this path checking. + + .. % Bug #1473048 + +* The :mod:`socket` module now supports :const:`AF_NETLINK` sockets on Linux, + thanks to a patch from Philippe Biondi. Netlink sockets are a Linux-specific + mechanism for communications between a user-space process and kernel code; an + introductory article about them is at http://www.linuxjournal.com/article/7356. + In Python code, netlink addresses are represented as a tuple of 2 integers, + ``(pid, group_mask)``. + + Two new methods on socket objects, :meth:`recv_into(buffer)` and + :meth:`recvfrom_into(buffer)`, store the received data in an object that + supports the buffer protocol instead of returning the data as a string. This + means you can put the data directly into an array or a memory-mapped file. + + Socket objects also gained :meth:`getfamily`, :meth:`gettype`, and + :meth:`getproto` accessor methods to retrieve the family, type, and protocol + values for the socket. 
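A minimal sketch of :meth:`recv_into`, as described in the :mod:`socket` entry above. It assumes a platform that provides :func:`socket.socketpair` and an interpreter with the :class:`bytearray` type, which postdates 2.5 (on 2.5 itself a writable :mod:`array` object would play the same role). The names ``sender``, ``receiver`` and ``buf`` are illustrative only:

```python
import socket

# A connected pair of sockets stands in for a real network peer.
sender, receiver = socket.socketpair()
try:
    sender.sendall(b"hello")

    # Preallocate a writable buffer; recv_into() fills it in place and
    # returns the number of bytes written instead of a new string.
    buf = bytearray(16)
    nbytes = receiver.recv_into(buf)
    received = bytes(buf[:nbytes])
finally:
    sender.close()
    receiver.close()
```

Because the buffer is supplied by the caller, the same storage can be reused for every read, which avoids allocating a fresh string per call.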
+ +* New module: the :mod:`spwd` module provides functions for accessing the shadow + password database on systems that support shadow passwords. + +* The :mod:`struct` module is now faster because it compiles format strings into + :class:`Struct` objects with :meth:`pack` and :meth:`unpack` methods. This is + similar to how the :mod:`re` module lets you create compiled regular expression + objects. You can still use the module-level :func:`pack` and :func:`unpack` + functions; they'll create :class:`Struct` objects and cache them. Or you can + use :class:`Struct` instances directly:: + + import struct + + s = struct.Struct('ih3s') + + data = s.pack(1972, 187, 'abc') + year, number, name = s.unpack(data) + + You can also pack and unpack data to and from buffer objects directly using the + :meth:`pack_into(buffer, offset, v1, v2, ...)` and :meth:`unpack_from(buffer, + offset)` methods. This lets you store data directly into an array or a + memory-mapped file. + + (:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed + sprint. Support for buffer objects was added by Martin Blais, also at the + NeedForSpeed sprint.) + +* The Python developers switched from CVS to Subversion during the 2.5 + development process. Information about the exact build version is available as + the ``sys.subversion`` variable, a 3-tuple of ``(interpreter-name, branch-name, + revision-range)``. For example, at the time of writing my copy of 2.5 was + reporting ``('CPython', 'trunk', '45313:45315')``. + + This information is also available to C extensions via the + :cfunc:`Py_GetBuildInfo` function that returns a string of build information + like this: ``"trunk:45355:45356M, Apr 13 2006, 07:42:19"``. (Contributed by + Barry Warsaw.) + +* Another new function, :func:`sys._current_frames`, returns the current stack + frames for all running threads as a dictionary mapping thread identifiers to the + topmost stack frame currently active in that thread at the time the function is + called.
(Contributed by Tim Peters.) + +* The :class:`TarFile` class in the :mod:`tarfile` module now has an + :meth:`extractall` method that extracts all members from the archive into the + current working directory. It's also possible to set a different directory as + the extraction target, and to unpack only a subset of the archive's members. + + The compression used for a tarfile opened in stream mode can now be autodetected + using the mode ``'r|*'``. (Contributed by Lars Gustäbel.) + + .. % patch 918101 + +* The :mod:`threading` module now lets you set the stack size used when new + threads are created. The :func:`stack_size([*size*])` function returns the + currently configured stack size, and supplying the optional *size* parameter + sets a new value. Not all platforms support changing the stack size, but + Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.) + + .. % Patch 1454481 + +* The :mod:`unicodedata` module has been updated to use version 4.1.0 of the + Unicode character database. Version 3.2.0 is required by some specifications, + so it's still available as :attr:`unicodedata.ucd_3_2_0`. + +* New module: the :mod:`uuid` module generates universally unique identifiers + (UUIDs) according to :rfc:`4122`. The RFC defines several different UUID + versions that are generated from a starting string, from system properties, or + purely randomly. This module contains a :class:`UUID` class and functions + named :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, and :func:`uuid5` to + generate different versions of UUID. (Version 2 UUIDs are not specified in + :rfc:`4122` and are not supported by this module.) 
:: + + >>> import uuid + >>> # make a UUID based on the host ID and current time + >>> uuid.uuid1() + UUID('a8098c1a-f86e-11da-bd1a-00112444be1e') + + >>> # make a UUID using an MD5 hash of a namespace UUID and a name + >>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org') + UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e') + + >>> # make a random UUID + >>> uuid.uuid4() + UUID('16fd2706-8baf-433b-82eb-8c7fada847da') + + >>> # make a UUID using a SHA-1 hash of a namespace UUID and a name + >>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org') + UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d') + + (Contributed by Ka-Ping Yee.) + +* The :mod:`weakref` module's :class:`WeakKeyDictionary` and + :class:`WeakValueDictionary` types gained new methods for iterating over the + weak references contained in the dictionary. :meth:`iterkeyrefs` and + :meth:`keyrefs` methods were added to :class:`WeakKeyDictionary`, and + :meth:`itervaluerefs` and :meth:`valuerefs` were added to + :class:`WeakValueDictionary`. (Contributed by Fred L. Drake, Jr.) + +* The :mod:`webbrowser` module received a number of enhancements. It's now + usable as a script with ``python -m webbrowser``, taking a URL as the argument; + there are a number of switches to control the behaviour (:option:`-n` for a new + browser window, :option:`-t` for a new tab). New module-level functions, + :func:`open_new` and :func:`open_new_tab`, were added to support this. The + module's :func:`open` function supports an additional feature, an *autoraise* + parameter that signals whether to raise the open window when possible. A number + of additional browsers were added to the supported list such as Firefox, Opera, + Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.) + + .. % Patch #754022 + +* The :mod:`xmlrpclib` module now supports returning :class:`datetime` objects + for the XML-RPC date type. Supply ``use_datetime=True`` to the :func:`loads` + function or the :class:`Unmarshaller` class to enable this feature. 
(Contributed + by Skip Montanaro.) + + .. % Patch 1120353 + +* The :mod:`zipfile` module now supports the ZIP64 version of the format, + meaning that a .zip archive can now be larger than 4 GiB and can contain + individual files larger than 4 GiB. (Contributed by Ronald Oussoren.) + + .. % Patch 1446489 + +* The :mod:`zlib` module's :class:`Compress` and :class:`Decompress` objects now + support a :meth:`copy` method that makes a copy of the object's internal state + and returns a new :class:`Compress` or :class:`Decompress` object. + (Contributed by Chris AtLee.) + + .. % Patch 1435422 + +.. % ====================================================================== + + +.. _module-ctypes: + +The ctypes package +------------------ + +The :mod:`ctypes` package, written by Thomas Heller, has been added to the +standard library. :mod:`ctypes` lets you call arbitrary functions in shared +libraries or DLLs. Long-time users may remember the :mod:`dl` module, which +provides functions for loading shared libraries and calling functions in them. +The :mod:`ctypes` package is much fancier. + +To load a shared library or DLL, you must create an instance of the +:class:`CDLL` class and provide the name or path of the shared library or DLL. +Once that's done, you can call arbitrary functions by accessing them as +attributes of the :class:`CDLL` object. :: + + import ctypes + + libc = ctypes.CDLL('libc.so.6') + result = libc.printf("Line of output\n") + +Type constructors for the various C types are provided: :func:`c_int`, +:func:`c_float`, :func:`c_double`, :func:`c_char_p` (equivalent to :ctype:`char +\*`), and so forth. Unlike Python's types, the C versions are all mutable; you +can assign to their :attr:`value` attribute to change the wrapped value. Python +integers and strings will be automatically converted to the corresponding C +types, but for other types you must call the correct type constructor. 
(And I +mean *must*; getting it wrong will often result in the interpreter crashing +with a segmentation fault.) + +You shouldn't use :func:`c_char_p` with a Python string when the C function will +be modifying the memory area, because Python strings are supposed to be +immutable; breaking this rule will cause puzzling bugs. When you need a +modifiable memory area, use :func:`create_string_buffer`:: + + s = "this is a string" + buf = ctypes.create_string_buffer(s) + libc.strfry(buf) + +C functions are assumed to return integers, but you can set the :attr:`restype` +attribute of the function object to change this:: + + >>> libc.atof('2.71828') + -1783957616 + >>> libc.atof.restype = ctypes.c_double + >>> libc.atof('2.71828') + 2.71828 + +:mod:`ctypes` also provides a wrapper for Python's C API as the +``ctypes.pythonapi`` object. This object does *not* release the global +interpreter lock before calling a function, because the lock must be held when +calling into the interpreter's code. There's a :class:`py_object()` type +constructor that will create a :ctype:`PyObject \*` pointer. A simple usage:: + + import ctypes + + d = {} + ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d), + ctypes.py_object("abc"), ctypes.py_object(1)) + # d is now {'abc': 1}. + +Don't forget to use :class:`py_object()`; if it's omitted you end up with a +segmentation fault. + +:mod:`ctypes` has been around for a while, but people still write and +distribute hand-coded extension modules because you can't rely on +:mod:`ctypes` being present. Perhaps developers will begin to write Python +wrappers atop a library accessed through :mod:`ctypes` instead of extension +modules, now that :mod:`ctypes` is included with core Python. + + +.. seealso:: + + http://starship.python.net/crew/theller/ctypes/ + The ctypes web page, with a tutorial, reference, and FAQ. + + The documentation for the :mod:`ctypes` module. + +..
% ====================================================================== + + +.. _module-etree: + +The ElementTree package +----------------------- + +A subset of Fredrik Lundh's ElementTree library for processing XML has been +added to the standard library as :mod:`xml.etree`. The available modules are +:mod:`ElementTree`, :mod:`ElementPath`, and :mod:`ElementInclude` from +ElementTree 1.2.6. The :mod:`cElementTree` accelerator module is also +included. + +The rest of this section will provide a brief overview of using ElementTree. +Full documentation for ElementTree is available at +http://effbot.org/zone/element-index.htm. + +ElementTree represents an XML document as a tree of element nodes. The text +content of the document is stored as the :attr:`.text` and :attr:`.tail` +attributes of the element nodes. (This is one of the major differences between ElementTree and +the Document Object Model; in the DOM there are many different types of node, +including :class:`TextNode`.) + +The most commonly used parsing function is :func:`parse`, which takes either a +string (assumed to contain a filename) or a file-like object and returns an +:class:`ElementTree` instance:: + + import urllib + from xml.etree import ElementTree as ET + + tree = ET.parse('ex-1.xml') + + feed = urllib.urlopen( + 'http://planet.python.org/rss10.xml') + tree = ET.parse(feed) + +Once you have an :class:`ElementTree` instance, you can call its :meth:`getroot` +method to get the root :class:`Element` node. + +There's also an :func:`XML` function that takes a string literal and returns an +:class:`Element` node (not an :class:`ElementTree`). This function provides a +tidy way to incorporate XML fragments, approaching the convenience of an XML +literal:: + + svg = ET.XML("""<svg width="10px" version="1.0"> + </svg>""") + svg.set('height', '320px') + svg.append(elem1) + +Each XML element supports some dictionary-like and some list-like access +methods.
Dictionary-like operations are used to access attribute values, and +list-like operations are used to access child nodes. + ++-------------------------------+--------------------------------------------+ +| Operation | Result | ++===============================+============================================+ +| ``elem[n]`` | Returns n'th child element. | ++-------------------------------+--------------------------------------------+ +| ``elem[m:n]`` | Returns list of m'th through n'th child | +| | elements. | ++-------------------------------+--------------------------------------------+ +| ``len(elem)`` | Returns number of child elements. | ++-------------------------------+--------------------------------------------+ +| ``list(elem)`` | Returns list of child elements. | ++-------------------------------+--------------------------------------------+ +| ``elem.append(elem2)`` | Adds *elem2* as a child. | ++-------------------------------+--------------------------------------------+ +| ``elem.insert(index, elem2)`` | Inserts *elem2* at the specified location. | ++-------------------------------+--------------------------------------------+ +| ``del elem[n]`` | Deletes n'th child element. | ++-------------------------------+--------------------------------------------+ +| ``elem.keys()`` | Returns list of attribute names. | ++-------------------------------+--------------------------------------------+ +| ``elem.get(name)`` | Returns value of attribute *name*. | ++-------------------------------+--------------------------------------------+ +| ``elem.set(name, value)`` | Sets new value for attribute *name*. | ++-------------------------------+--------------------------------------------+ +| ``elem.attrib`` | Retrieves the dictionary containing | +| | attributes. | ++-------------------------------+--------------------------------------------+ +| ``del elem.attrib[name]`` | Deletes attribute *name*. 
| ++-------------------------------+--------------------------------------------+ + +Comments and processing instructions are also represented as :class:`Element` +nodes. To check if a node is a comment or a processing instruction:: + + if elem.tag is ET.Comment: + ... + elif elem.tag is ET.ProcessingInstruction: + ... + +To generate XML output, you should call the :meth:`ElementTree.write` method. +Like :func:`parse`, it can take either a string or a file-like object:: + + # Encoding is US-ASCII + tree.write('output.xml') + + # Encoding is UTF-8 + f = open('output.xml', 'w') + tree.write(f, encoding='utf-8') + +(Caution: the default encoding used for output is ASCII. For general XML work, +where an element's name may contain arbitrary Unicode characters, ASCII isn't a +very useful encoding because it will raise an exception if an element's name +contains any characters with values greater than 127. Therefore, it's best to +specify a different encoding such as UTF-8 that can handle any Unicode +character.) + +This section is only a partial description of the ElementTree interfaces. Please +read the package's official documentation for more details. + + +.. seealso:: + + http://effbot.org/zone/element-index.htm + Official documentation for ElementTree. + +.. % ====================================================================== + + +.. _module-hashlib: + +The hashlib package +------------------- + +A new :mod:`hashlib` module, written by Gregory P. Smith, has been added to +replace the :mod:`md5` and :mod:`sha` modules. :mod:`hashlib` adds support for +additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When +available, the module uses OpenSSL for fast platform-optimized implementations +of algorithms. + +The old :mod:`md5` and :mod:`sha` modules still exist as wrappers around hashlib +to preserve backwards compatibility. The new module's interface is very close +to that of the old modules, but not identical.
The most significant difference +is that the constructor functions for creating new hashing objects are named +differently. :: + + # Old versions + h = md5.md5() + h = md5.new() + + # New version + h = hashlib.md5() + + # Old versions + h = sha.sha() + h = sha.new() + + # New version + h = hashlib.sha1() + + # Hashes that weren't previously available + h = hashlib.sha224() + h = hashlib.sha256() + h = hashlib.sha384() + h = hashlib.sha512() + + # Alternative form + h = hashlib.new('md5') # Provide algorithm as a string + +Once a hash object has been created, its methods are the same as before: +:meth:`update(string)` hashes the specified string into the current digest +state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary +string or a string of hex digits, and :meth:`copy` returns a new hashing object +with the same digest state. + + +.. seealso:: + + The documentation for the :mod:`hashlib` module. + +.. % ====================================================================== + + +.. _module-sqlite: + +The sqlite3 package +------------------- + +The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded +database, has been added to the standard library under the package name +:mod:`sqlite3`. + +SQLite is a C library that provides a lightweight disk-based database that +doesn't require a separate server process and allows accessing the database +using a nonstandard variant of the SQL query language. Some applications can use +SQLite for internal data storage. It's also possible to prototype an +application using SQLite and then port the code to a larger database such as +PostgreSQL or Oracle. + +pysqlite was written by Gerhard Häring and provides a SQL interface compliant +with the DB-API 2.0 specification described by :pep:`249`. + +If you're compiling the Python source yourself, note that the source tree +doesn't include the SQLite code, only the wrapper module.
You'll need to have +the SQLite libraries and headers installed before compiling Python, and the +build process will compile the module when the necessary headers are available. + +To use the module, you must first create a :class:`Connection` object that +represents the database. Here the data will be stored in the +:file:`/tmp/example` file:: + + conn = sqlite3.connect('/tmp/example') + +You can also supply the special name ``:memory:`` to create a database in RAM. + +Once you have a :class:`Connection`, you can create a :class:`Cursor` object +and call its :meth:`execute` method to perform SQL commands:: + + c = conn.cursor() + + # Create table + c.execute('''create table stocks + (date text, trans text, symbol text, + qty real, price real)''') + + # Insert a row of data + c.execute("""insert into stocks + values ('2006-01-05','BUY','RHAT',100,35.14)""") + +Usually your SQL operations will need to use values from Python variables. You +shouldn't assemble your query using Python's string operations because doing so +is insecure; it makes your program vulnerable to an SQL injection attack. + +Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder +wherever you want to use a value, and then provide a tuple of values as the +second argument to the cursor's :meth:`execute` method. (Other database modules +may use a different placeholder, such as ``%s`` or ``:1``.) For example:: + + # Never do this -- insecure! + symbol = 'IBM' + c.execute("... 
where symbol = '%s'" % symbol) + + # Do this instead + t = (symbol,) + c.execute('select * from stocks where symbol=?', t) + + # Larger example + for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00), + ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00), + ('2006-04-06', 'SELL', 'IBM', 500, 53.00), + ): + c.execute('insert into stocks values (?,?,?,?,?)', t) + +To retrieve data after executing a SELECT statement, you can either treat the +cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a +single matching row, or call :meth:`fetchall` to get a list of the matching +rows. + +This example uses the iterator form:: + + >>> c = conn.cursor() + >>> c.execute('select * from stocks order by price') + >>> for row in c: + ... print row + ... + (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001) + (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0) + (u'2006-04-06', u'SELL', u'IBM', 500, 53.0) + (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0) + >>> + +For more information about the SQL dialect supported by SQLite, see +http://www.sqlite.org. + + +.. seealso:: + + http://www.pysqlite.org + The pysqlite web page. + + http://www.sqlite.org + The SQLite web page; the documentation describes the syntax and the available + data types for the supported SQL dialect. + + The documentation for the :mod:`sqlite3` module. + + :pep:`249` - Database API Specification 2.0 + PEP written by Marc-André Lemburg. + +.. % ====================================================================== + + +.. _module-wsgiref: + +The wsgiref package +------------------- + +The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface +between web servers and Python web applications and is described in :pep:`333`. +The :mod:`wsgiref` package is a reference implementation of the WSGI +specification. + +.. % XXX should this be in a PEP 333 section instead? 
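As a rough illustration of the interface that :pep:`333` describes, a WSGI application is a callable that takes the request environment and a ``start_response`` callable and returns an iterable of body chunks. The sketch below is not part of the original text: the names ``hello_app`` and ``call_once`` are hypothetical, and it is written for a current Python, where the response body must be bytes. It uses ``wsgiref.util.setup_testing_defaults`` to build a fake environment so the application can be exercised without starting a server.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical minimal WSGI application (illustrative name)."""
    body = b"Hello, WSGI!\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    # Report the status line and headers, then return the body chunks.
    start_response("200 OK", headers)
    return [body]


def call_once(app):
    """Drive a WSGI app a single time without a server (illustrative helper)."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the keys a server would supply
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_once(hello_app))
```

A callable shaped like ``hello_app`` is what a WSGI server expects to be handed as its application object.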
+ +The package includes a basic HTTP server that will run a WSGI application; this +server is useful for debugging but isn't intended for production use. Setting +up a server takes only a few lines of code:: + + from wsgiref import simple_server + + wsgi_app = ... + + host = '' + port = 8000 + httpd = simple_server.make_server(host, port, wsgi_app) + httpd.serve_forever() + +.. % XXX discuss structure of WSGI applications? +.. % XXX provide an example using Django or some other framework? + + +.. seealso:: + + http://www.wsgi.org + A central web site for WSGI-related resources. + + :pep:`333` - Python Web Server Gateway Interface v1.0 + PEP written by Phillip J. Eby. + +.. % ====================================================================== + + +.. _build-api: + +Build and C API Changes +======================= + +Changes to Python's build process and to the C API include: + +* The Python source tree was converted from CVS to Subversion, in a complex + migration procedure that was supervised and flawlessly carried out by Martin von + Löwis. The procedure was developed as :pep:`347`. + +* Coverity, a company that markets a source code analysis tool called Prevent, + provided the results of their examination of the Python source code. The + analysis found about 60 bugs that were quickly fixed. Many of the bugs were + refcounting problems, often occurring in error-handling code. See + http://scan.coverity.com for the statistics. + +* The largest change to the C API came from :pep:`353`, which modifies the + interpreter to use a :ctype:`Py_ssize_t` type definition instead of + :ctype:`int`. See the earlier section :ref:`pep-353` for a discussion of this + change. + +* The design of the bytecode compiler has changed a great deal, no longer + generating bytecode by traversing the parse tree. Instead the parse tree is + converted to an abstract syntax tree (or AST), and it is the abstract syntax + tree that's traversed to produce the bytecode. 
+ + It's possible for Python code to obtain AST objects by using the + :func:`compile` built-in and specifying ``_ast.PyCF_ONLY_AST`` as the value of + the *flags* parameter:: + + from _ast import PyCF_ONLY_AST + ast = compile("""a=0 + for i in range(10): + a += i + """, "<string>", 'exec', PyCF_ONLY_AST) + + assignment = ast.body[0] + for_loop = ast.body[1] + + No official documentation has been written for the AST code yet, but :pep:`339` + discusses the design. To start learning about the code, read the definition of + the various AST nodes in :file:`Parser/Python.asdl`. A Python script reads this + file and generates a set of C structure definitions in + :file:`Include/Python-ast.h`. The :cfunc:`PyParser_ASTFromString` and + :cfunc:`PyParser_ASTFromFile`, defined in :file:`Include/pythonrun.h`, take + Python source as input and return the root of an AST representing the contents. + This AST can then be turned into a code object by :cfunc:`PyAST_Compile`. For + more information, read the source code, and then ask questions on python-dev. + + The AST code was developed under Jeremy Hylton's management, and implemented by + (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John + Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil + Schemenauer, plus the participants in a number of AST sprints at conferences + such as PyCon. + + .. % List of names taken from Jeremy's python-dev post at + .. % http://mail.python.org/pipermail/python-dev/2005-October/057500.html + +* Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005, + was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never + freed arenas. With this patch, Python will free arenas when they're empty. The + net effect is that on some platforms, when you allocate many objects, Python's + memory usage may actually drop when you delete them and the memory may be + returned to the operating system. 
(Implemented by Evan Jones, and reworked by + Tim Peters.) + + Note that this change means extension modules must be more careful when + allocating memory. Python's API has many different functions for allocating + memory that are grouped into families. For example, :cfunc:`PyMem_Malloc`, + :cfunc:`PyMem_Realloc`, and :cfunc:`PyMem_Free` are one family that allocates + raw memory, while :cfunc:`PyObject_Malloc`, :cfunc:`PyObject_Realloc`, and + :cfunc:`PyObject_Free` are another family that's supposed to be used for + creating Python objects. + + Previously these different families all reduced to the platform's + :cfunc:`malloc` and :cfunc:`free` functions. This meant it didn't matter if + you got things wrong and allocated memory with the :cfunc:`PyMem` function but + freed it with the :cfunc:`PyObject` function. With 2.5's changes to obmalloc, + these families now do different things and mismatches will probably result in a + segfault. You should carefully test your C extension modules with Python 2.5. + +* The built-in set types now have an official C API. Call :cfunc:`PySet_New` + and :cfunc:`PyFrozenSet_New` to create a new set, :cfunc:`PySet_Add` and + :cfunc:`PySet_Discard` to add and remove elements, and :cfunc:`PySet_Contains` + and :cfunc:`PySet_Size` to examine the set's state. (Contributed by Raymond + Hettinger.) + +* C code can now obtain information about the exact revision of the Python + interpreter by calling the :cfunc:`Py_GetBuildInfo` function that returns a + string of build information like this: ``"trunk:45355:45356M, Apr 13 2006, + 07:42:19"``. (Contributed by Barry Warsaw.) + +* Two new macros can be used to indicate C functions that are local to the + current file so that a faster calling convention can be used. + :cfunc:`Py_LOCAL(type)` declares the function as returning a value of the + specified *type* and uses a fast-calling qualifier. + :cfunc:`Py_LOCAL_INLINE(type)` does the same thing and also requests the + function be inlined. 
If :cfunc:`PY_LOCAL_AGGRESSIVE` is defined before + :file:`python.h` is included, a set of more aggressive optimizations are enabled + for the module; you should benchmark the results to find out if these + optimizations actually make the code faster. (Contributed by Fredrik Lundh at + the NeedForSpeed sprint.) + +* :cfunc:`PyErr_NewException(name, base, dict)` can now accept a tuple of base + classes as its *base* argument. (Contributed by Georg Brandl.) + +* The :cfunc:`PyErr_Warn` function for issuing warnings is now deprecated in + favour of :cfunc:`PyErr_WarnEx(category, message, stacklevel)` which lets you + specify the number of stack frames separating this function and the caller. A + *stacklevel* of 1 is the function calling :cfunc:`PyErr_WarnEx`, 2 is the + function above that, and so forth. (Added by Neal Norwitz.) + +* The CPython interpreter is still written in C, but the code can now be + compiled with a C++ compiler without errors. (Implemented by Anthony Baxter, + Martin von Löwis, Skip Montanaro.) + +* The :cfunc:`PyRange_New` function was removed. It was never documented, never + used in the core code, and had dangerously lax error checking. In the unlikely + case that your extensions were using it, you can replace it by something like + the following:: + + range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll", + start, stop, step); + +.. % ====================================================================== + + +.. _ports: + +Port-Specific Changes +--------------------- + +* MacOS X (10.3 and higher): dynamic loading of modules now uses the + :cfunc:`dlopen` function instead of MacOS-specific functions. + +* MacOS X: a :option:`--enable-universalsdk` switch was added to the + :program:`configure` script that compiles the interpreter as a universal binary + able to run on both PowerPC and Intel processors. (Contributed by Ronald + Oussoren.) 
+ +* Windows: :file:`.dll` is no longer supported as a filename extension for + extension modules. :file:`.pyd` is now the only filename extension that will be + searched for. + +.. % ====================================================================== + + +.. _porting: + +Porting to Python 2.5 +===================== + +This section lists previously described changes that may require changes to your +code: + +* ASCII is now the default encoding for modules. It's now a syntax error if a + module contains string literals with 8-bit characters but doesn't have an + encoding declaration. In Python 2.4 this triggered a warning, not a syntax + error. + +* Previously, the :attr:`gi_frame` attribute of a generator was always a frame + object. Because of the :pep:`342` changes described in section :ref:`pep-342`, + it's now possible for :attr:`gi_frame` to be ``None``. + +* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to + compare a Unicode string and an 8-bit string that can't be converted to Unicode + using the default ASCII encoding. Previously such comparisons would raise a + :class:`UnicodeDecodeError` exception. + +* Library: the :mod:`csv` module is now stricter about multi-line quoted fields. + If your files contain newlines embedded within fields, the input should be split + into lines in a manner which preserves the newline characters. + +* Library: the :mod:`locale` module's :func:`format` function would + previously accept any string as long as no more than one %char specifier + appeared. In Python 2.5, the argument must be exactly one %char specifier with + no surrounding text. + +* Library: The :mod:`pickle` and :mod:`cPickle` modules no longer accept a + return value of ``None`` from the :meth:`__reduce__` method; the method must + return a tuple of arguments instead. The modules also no longer accept the + deprecated *bin* keyword parameter. 
+ +* Library: The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now + have a :attr:`rpc_paths` attribute that constrains XML-RPC operations to a + limited set of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``. + Setting :attr:`rpc_paths` to ``None`` or an empty tuple disables this path + checking. + +* C API: Many functions now use :ctype:`Py_ssize_t` instead of :ctype:`int` to + allow processing more data on 64-bit machines. Extension code may need to make + the same change to avoid warnings and to support 64-bit machines. See the + earlier section :ref:`pep-353` for a discussion of this change. + +* C API: The obmalloc changes mean that you must be careful to not mix usage + of the :cfunc:`PyMem_\*` and :cfunc:`PyObject_\*` families of functions. Memory + allocated with one family's :cfunc:`\*_Malloc` must be freed with the + corresponding family's :cfunc:`\*_Free` function. + +.. % ====================================================================== + + +.. _acks: + +Acknowledgements +================ + +The author would like to thank the following people for offering suggestions, +corrections and assistance with various drafts of this article: Georg Brandl, +Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W. Grosse- +Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh, Andrew +McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor, Mike +Rovner, Scott Weikart, Barry Warsaw, Thomas Wouters. + diff --git a/Doc/whatsnew/2.6.rst b/Doc/whatsnew/2.6.rst new file mode 100644 index 0000000..b0e731a --- /dev/null +++ b/Doc/whatsnew/2.6.rst @@ -0,0 +1,236 @@ +**************************** + What's New in Python 2.6 +**************************** + +:Author: A.M. Kuchling +:Release: |release| +:Date: |today| + +.. % $Id: whatsnew26.tex 55963 2007-06-13 18:07:49Z guido.van.rossum $ +.. % Rules for maintenance: +.. % +.. % * Anyone can add text to this document. 
Do not spend very much time +.. % on the wording of your changes, because your text will probably +.. % get rewritten to some degree. +.. % +.. % * The maintainer will go through Misc/NEWS periodically and add +.. % changes; it's therefore more important to add your changes to +.. % Misc/NEWS than to this file. +.. % +.. % * This is not a complete list of every single change; completeness +.. % is the purpose of Misc/NEWS. Some changes I consider too small +.. % or esoteric to include. If such a change is added to the text, +.. % I'll just remove it. (This is another reason you shouldn't spend +.. % too much time on writing your addition.) +.. % +.. % * If you want to draw your new text to the attention of the +.. % maintainer, add 'XXX' to the beginning of the paragraph or +.. % section. +.. % +.. % * It's OK to just add a fragmentary note about a change. For +.. % example: "XXX Describe the transmogrify() function added to the +.. % socket module." The maintainer will research the change and +.. % write the necessary text. +.. % +.. % * You can comment out your additions if you like, but it's not +.. % necessary (especially when a final release is some months away). +.. % +.. % * Credit the author of a patch or bugfix. Just the name is +.. % sufficient; the e-mail address isn't necessary. +.. % +.. % * It's helpful to add the bug/patch number as a comment: +.. % +.. % % Patch 12345 +.. % XXX Describe the transmogrify() function added to the socket +.. % module. +.. % (Contributed by P.Y. Developer.) +.. % +.. % This saves the maintainer the effort of going through the SVN log +.. % when researching a change. + +This article explains the new features in Python 2.6. No release date for +Python 2.6 has been set; it will probably be released in mid 2008. + +This article doesn't attempt to provide a complete specification of the new +features, but instead provides a convenient overview. For full details, you +should refer to the documentation for Python 2.6. 
If you want to understand the +complete implementation and design rationale, refer to the PEP for a particular +new feature. + +.. % Compare with previous release in 2 - 3 sentences here. +.. % add hyperlink when the documentation becomes available online. + +.. % ====================================================================== +.. % Large, PEP-level features and changes should be described here. +.. % Should there be a new section here for 3k migration? +.. % Or perhaps a more general section describing module changes/deprecation? +.. % sets module deprecated +.. % ====================================================================== + + +Other Language Changes +====================== + +Here are all of the changes that Python 2.6 makes to the core Python language. + +* An obscure change: when you use the :func:`locals` function inside a + :keyword:`class` statement, the resulting dictionary no longer includes free + variables. (Free variables, in this case, are variables referred to in the + :keyword:`class` statement that aren't attributes of the class.) + +.. % ====================================================================== + + +Optimizations +------------- + +* Internally, a bit is now set in type objects to indicate some of the standard + built-in types. This speeds up checking if an object is a subclass of one of + these types. (Contributed by Neal Norwitz.) + +The net result of the 2.6 optimizations is that Python 2.6 runs the pystone +benchmark around XX% faster than Python 2.5. + +.. % ====================================================================== + + +New, Improved, and Deprecated Modules +===================================== + +As usual, Python's standard library received a number of enhancements and bug +fixes. Here's a partial list of the most notable changes, sorted alphabetically +by module name. 
Consult the :file:`Misc/NEWS` file in the source tree for a more +complete list of changes, or look through the Subversion logs for all the details. + +* A new data type in the :mod:`collections` module: :class:`NamedTuple(typename, + fieldnames)` is a factory function that creates subclasses of the standard tuple + whose fields are accessible by name as well as index. For example:: + + var_type = collections.NamedTuple('variable', + 'id name type size') + var = var_type(1, 'frequency', 'int', 4) + + print var[0], var.id # Equivalent + print var[2], var.type # Equivalent + + (Contributed by Raymond Hettinger.) + +* A new method in the :mod:`curses` module: for a window, :meth:`chgat` changes + the display attributes for a certain number of characters on a single line. :: + + # Boldface text starting at y=0,x=21 + # and affecting the rest of the line. + stdscr.chgat(0,21, curses.A_BOLD) + + (Contributed by Fabian Kreutz.) + +* The :func:`glob.glob` function can now return Unicode filenames if + a Unicode path was used and Unicode filenames are matched within the directory. + + .. % Patch #1001604 + +* The :mod:`gopherlib` module has been removed. + +* A new function in the :mod:`heapq` module: ``merge(iter1, iter2, ...)`` + takes any number of iterables that return data *in sorted order*, and returns + a new iterator that returns the contents of all the iterators, also in sorted + order. For example:: + + heapq.merge([1, 3, 5, 9], [2, 8, 16]) -> + [1, 2, 3, 5, 8, 9, 16] + + (Contributed by Raymond Hettinger.) + +* A new function in the :mod:`itertools` module: ``izip_longest(iter1, iter2, + ...[, fillvalue])`` makes tuples from each of the elements; if some of the + iterables are shorter than others, the missing values are set to *fillvalue*. + For example:: + + itertools.izip_longest([1,2,3], [1,2,3,4,5]) -> + [(1, 1), (2, 2), (3, 3), (None, 4), (None, 5)] + + (Contributed by Raymond Hettinger.) + +* The :mod:`macfs` module has been removed. 
This in turn required the + :func:`macostools.touched` function to be removed because it depended on the + :mod:`macfs` module. + + .. % Patch #1490190 + +* New functions in the :mod:`posix` module: :func:`chflags` and :func:`lchflags` + are wrappers for the corresponding system calls (where they're available). + Constants for the flag values are defined in the :mod:`stat` module; some + possible values include :const:`UF_IMMUTABLE` to signal the file may not be + changed and :const:`UF_APPEND` to indicate that data can only be appended to the + file. (Contributed by M. Levinson.) + +* The :mod:`rgbimg` module has been removed. + +* The :mod:`smtplib` module now supports SMTP over SSL thanks to the addition + of the :class:`SMTP_SSL` class. This class supports an interface identical to + the existing :class:`SMTP` class. (Contributed by Monty Taylor.) + +* The :mod:`test.test_support` module now contains a :func:`EnvironmentVarGuard` + context manager that supports temporarily changing environment variables and + automatically restores them to their old values. (Contributed by Brett Cannon.) + +.. % ====================================================================== +.. % whole new modules get described in \subsections here + +.. % ====================================================================== + + +Build and C API Changes +======================= + +Changes to Python's build process and to the C API include: + +* Detailed changes are listed here. + +.. % ====================================================================== + + +Port-Specific Changes +--------------------- + +Platform-specific changes go here. + +.. % ====================================================================== + + +.. _section-other: + +Other Changes and Fixes +======================= + +As usual, there were a bunch of other improvements and bugfixes scattered +throughout the source tree. 
A search through the change logs finds there were +XXX patches applied and YYY bugs fixed between Python 2.5 and 2.6. Both figures +are likely to be underestimates. + +Some of the more notable changes are: + +* Details go here. + +.. % ====================================================================== + + +Porting to Python 2.6 +===================== + +This section lists previously described changes that may require changes to your +code: + +* Everything is all in the details! + +.. % ====================================================================== + + +.. _acks: + +Acknowledgements +================ + +The author would like to thank the following people for offering suggestions, +corrections and assistance with various drafts of this article: . + diff --git a/Doc/whatsnew/3.0.rst b/Doc/whatsnew/3.0.rst new file mode 100644 index 0000000..ac82317 --- /dev/null +++ b/Doc/whatsnew/3.0.rst @@ -0,0 +1,161 @@ +**************************** + What's New in Python 3.0 +**************************** + +:Author: A.M. Kuchling + +.. |release| replace:: 0.0 + +.. % $Id: whatsnew26.tex 55506 2007-05-22 07:43:29Z neal.norwitz $ +.. % Rules for maintenance: +.. % +.. % * Anyone can add text to this document. Do not spend very much time +.. % on the wording of your changes, because your text will probably +.. % get rewritten to some degree. +.. % +.. % * The maintainer will go through Misc/NEWS periodically and add +.. % changes; it's therefore more important to add your changes to +.. % Misc/NEWS than to this file. +.. % +.. % * This is not a complete list of every single change; completeness +.. % is the purpose of Misc/NEWS. Some changes I consider too small +.. % or esoteric to include. If such a change is added to the text, +.. % I'll just remove it. (This is another reason you shouldn't spend +.. % too much time on writing your addition.) +.. % +.. % * If you want to draw your new text to the attention of the +.. 
% maintainer, add 'XXX' to the beginning of the paragraph or +.. % section. +.. % +.. % * It's OK to just add a fragmentary note about a change. For +.. % example: "XXX Describe the transmogrify() function added to the +.. % socket module." The maintainer will research the change and +.. % write the necessary text. +.. % +.. % * You can comment out your additions if you like, but it's not +.. % necessary (especially when a final release is some months away). +.. % +.. % * Credit the author of a patch or bugfix. Just the name is +.. % sufficient; the e-mail address isn't necessary. +.. % +.. % * It's helpful to add the bug/patch number as a comment: +.. % +.. % % Patch 12345 +.. % XXX Describe the transmogrify() function added to the socket +.. % module. +.. % (Contributed by P.Y. Developer.) +.. % +.. % This saves the maintainer the effort of going through the SVN log +.. % when researching a change. + +This article explains the new features in Python 3.0. No release date for +Python 3.0 has been set; it will probably be released in mid 2008. + +This article doesn't attempt to provide a complete specification of the new +features, but instead provides a convenient overview. For full details, you +should refer to the documentation for Python 3.0. If you want to understand the +complete implementation and design rationale, refer to the PEP for a particular +new feature. + +.. % Compare with previous release in 2 - 3 sentences here. +.. % add hyperlink when the documentation becomes available online. + +.. % ====================================================================== +.. % Large, PEP-level features and changes should be described here. +.. % Should there be a new section here for 3k migration? +.. % Or perhaps a more general section describing module changes/deprecation? +.. % sets module deprecated +.. 
% ====================================================================== + + +Other Language Changes +====================== + +Here are all of the changes that Python 3.0 makes to the core Python language. + +* Detailed changes are listed here. + +.. % ====================================================================== + + +Optimizations +------------- + +* Detailed changes are listed here. + +The net result of the 3.0 optimizations is that Python 3.0 runs the pystone +benchmark around XX% slower than Python 2.6. + +.. % ====================================================================== + + +New, Improved, and Deprecated Modules +===================================== + +As usual, Python's standard library received a number of enhancements and bug +fixes. Here's a partial list of the most notable changes, sorted alphabetically +by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more +complete list of changes, or look through the Subversion logs for all the details. + +* Detailed changes are listed here. + +.. % ====================================================================== +.. % whole new modules get described in \subsections here + +.. % ====================================================================== + + +Build and C API Changes +======================= + +Changes to Python's build process and to the C API include: + +* Detailed changes are listed here. + +.. % ====================================================================== + + +Port-Specific Changes +--------------------- + +Platform-specific changes go here. + +.. % ====================================================================== + + +.. _section-other: + +Other Changes and Fixes +======================= + +As usual, there were a bunch of other improvements and bugfixes scattered +throughout the source tree. A search through the change logs finds there were +XXX patches applied and YYY bugs fixed between Python 2.6 and 3.0. 
Both figures +are likely to be underestimates. + +Some of the more notable changes are: + +* Details go here. + +.. % ====================================================================== + + +Porting to Python 3.0 +===================== + +This section lists previously described changes that may require changes to your +code: + +* Everything is all in the details! + +.. % ====================================================================== + + +.. _acks: + +Acknowledgements +================ + +The author would like to thank the following people for offering suggestions, +corrections and assistance with various drafts of this article: . + |