author | Georg Brandl <georg@python.org> | 2007-08-15 14:27:07 (GMT) |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2007-08-15 14:27:07 (GMT) |
commit | 739c01d47b9118d04e5722333f0e6b4d0c8bdd9e (patch) | |
tree | f82b450d291927fc1758b96d981aa0610947b529 /Doc/whatsnew/whatsnew25.tex | |
parent | 2d1649094402ef393ea2b128ba2c08c3937e6b93 (diff) | |
Delete the LaTeX doc tree.
Diffstat (limited to 'Doc/whatsnew/whatsnew25.tex')
-rw-r--r-- | Doc/whatsnew/whatsnew25.tex | 2539 |
1 files changed, 0 insertions, 2539 deletions
diff --git a/Doc/whatsnew/whatsnew25.tex b/Doc/whatsnew/whatsnew25.tex deleted file mode 100644 index b6bac49..0000000 --- a/Doc/whatsnew/whatsnew25.tex +++ /dev/null @@ -1,2539 +0,0 @@ -\documentclass{howto} -\usepackage{distutils} -% $Id$ - -% Fix XXX comments - -\title{What's New in Python 2.5} -\release{1.01} -\author{A.M. Kuchling} -\authoraddress{\email{amk@amk.ca}} - -\begin{document} -\maketitle -\tableofcontents - -This article explains the new features in Python 2.5. The final -release of Python 2.5 is scheduled for August 2006; -\pep{356} describes the planned release schedule. - -The changes in Python 2.5 are an interesting mix of language and -library improvements. The library enhancements will be more important -to Python's user community, I think, because several widely-useful -packages were added. New modules include ElementTree for XML -processing (section~\ref{module-etree}), the SQLite database module -(section~\ref{module-sqlite}), and the \module{ctypes} module for -calling C functions (section~\ref{module-ctypes}). - -The language changes are of middling significance. Some pleasant new -features were added, but most of them aren't features that you'll use -every day. Conditional expressions were finally added to the language -using a novel syntax; see section~\ref{pep-308}. The new -'\keyword{with}' statement will make writing cleanup code easier -(section~\ref{pep-343}). Values can now be passed into generators -(section~\ref{pep-342}). Imports are now visible as either absolute -or relative (section~\ref{pep-328}). Some corner cases of exception -handling are handled better (section~\ref{pep-341}). All these -improvements are worthwhile, but they're improvements to one specific -language feature or another; none of them are broad modifications to -Python's semantics. - -As well as the language and library additions, other improvements and -bugfixes were made throughout the source tree. 
A search through the -SVN change logs finds there were 353 patches applied and 458 bugs -fixed between Python 2.4 and 2.5. (Both figures are likely to be -underestimates.) - -This article doesn't try to be a complete specification of the new -features; instead changes are briefly introduced using helpful -examples. For full details, you should always refer to the -documentation for Python 2.5 at \url{http://docs.python.org}. -If you want to understand the complete implementation and design -rationale, refer to the PEP for a particular new feature. - -Comments, suggestions, and error reports for this document are -welcome; please e-mail them to the author or open a bug in the Python -bug tracker. - -%====================================================================== -\section{PEP 308: Conditional Expressions\label{pep-308}} - -For a long time, people have been requesting a way to write -conditional expressions, which are expressions that return value A or -value B depending on whether a Boolean value is true or false. A -conditional expression lets you write a single assignment statement -that has the same effect as the following: - -\begin{verbatim} -if condition: - x = true_value -else: - x = false_value -\end{verbatim} - -There have been endless tedious discussions of syntax on both -python-dev and comp.lang.python. A vote was even held that found the -majority of voters wanted conditional expressions in some form, -but there was no syntax that was preferred by a clear majority. -Candidates included C's \code{cond ? true_v : false_v}, -\code{if cond then true_v else false_v}, and 16 other variations. - -Guido van~Rossum eventually chose a surprising syntax: - -\begin{verbatim} -x = true_value if condition else false_value -\end{verbatim} - -Evaluation is still lazy as in existing Boolean expressions, so the -order of evaluation jumps around a bit. 
The \var{condition} -expression in the middle is evaluated first, and the \var{true_value} -expression is evaluated only if the condition was true. Similarly, -the \var{false_value} expression is only evaluated when the condition -is false. - -This syntax may seem strange and backwards; why does the condition go -in the \emph{middle} of the expression, and not in the front as in C's -\code{c ? x : y}? The decision was checked by applying the new syntax -to the modules in the standard library and seeing how the resulting -code read. In many cases where a conditional expression is used, one -value seems to be the 'common case' and one value is an 'exceptional -case', used only on rarer occasions when the condition isn't met. The -conditional syntax makes this pattern a bit more obvious: - -\begin{verbatim} -contents = ((doc + '\n') if doc else '') -\end{verbatim} - -I read the above statement as meaning ``here \var{contents} is -usually assigned a value of \code{doc+'\e n'}; sometimes -\var{doc} is empty, in which special case an empty string is returned.'' -I doubt I will use conditional expressions very often where there -isn't a clear common and uncommon case. - -There was some discussion of whether the language should require -surrounding conditional expressions with parentheses. The decision -was made to \emph{not} require parentheses in the Python language's -grammar, but as a matter of style I think you should always use them. -Consider these two statements: - -\begin{verbatim} -# First version -- no parens -level = 1 if logging else 0 - -# Second version -- with parens -level = (1 if logging else 0) -\end{verbatim} - -In the first version, I think a reader's eye might group the statement -into 'level = 1', 'if logging', 'else 0', and think that the condition -decides whether the assignment to \var{level} is performed. 
The -second version reads better, in my opinion, because it makes it clear -that the assignment is always performed and the choice is being made -between two values. - -Another reason for including the brackets: a few odd combinations of -list comprehensions and lambdas could look like incorrect conditional -expressions. See \pep{308} for some examples. If you put parentheses -around your conditional expressions, you won't run into this case. - - -\begin{seealso} - -\seepep{308}{Conditional Expressions}{PEP written by -Guido van~Rossum and Raymond D. Hettinger; implemented by Thomas -Wouters.} - -\end{seealso} - - -%====================================================================== -\section{PEP 309: Partial Function Application\label{pep-309}} - -The \module{functools} module is intended to contain tools for -functional-style programming. - -One useful tool in this module is the \function{partial()} function. -For programs written in a functional style, you'll sometimes want to -construct variants of existing functions that have some of the -parameters filled in. Consider a Python function \code{f(a, b, c)}; -you could create a new function \code{g(b, c)} that was equivalent to -\code{f(1, b, c)}. This is called ``partial function application''. - -\function{partial} takes the arguments -\code{(\var{function}, \var{arg1}, \var{arg2}, ... -\var{kwarg1}=\var{value1}, \var{kwarg2}=\var{value2})}. The resulting -object is callable, so you can just call it to invoke \var{function} -with the filled-in arguments. - -Here's a small but realistic example: - -\begin{verbatim} -import functools - -def log (message, subsystem): - "Write the contents of 'message' to the specified subsystem." - print '%s: %s' % (subsystem, message) - ... - -server_log = functools.partial(log, subsystem='server') -server_log('Unable to open socket') -\end{verbatim} - -Here's another example, from a program that uses PyGTK. 
Here a -context-sensitive pop-up menu is being constructed dynamically. The -callback provided for the menu option is a partially applied version -of the \method{open_item()} method, where the first argument has been -provided. - -\begin{verbatim} -... -class Application: - def open_item(self, path): - ... - def init (self): - open_func = functools.partial(self.open_item, item_path) - popup_menu.append( ("Open", open_func, 1) ) -\end{verbatim} - - -Another function in the \module{functools} module is the -\function{update_wrapper(\var{wrapper}, \var{wrapped})} function that -helps you write well-behaved decorators. \function{update_wrapper()} -copies the name, module, and docstring attribute to a wrapper function -so that tracebacks inside the wrapped function are easier to -understand. For example, you might write: - -\begin{verbatim} -def my_decorator(f): - def wrapper(*args, **kwds): - print 'Calling decorated function' - return f(*args, **kwds) - functools.update_wrapper(wrapper, f) - return wrapper -\end{verbatim} - -\function{wraps()} is a decorator that can be used inside your own -decorators to copy the wrapped function's information. An alternate -version of the previous example would be: - -\begin{verbatim} -def my_decorator(f): - @functools.wraps(f) - def wrapper(*args, **kwds): - print 'Calling decorated function' - return f(*args, **kwds) - return wrapper -\end{verbatim} - -\begin{seealso} - -\seepep{309}{Partial Function Application}{PEP proposed and written by -Peter Harris; implemented by Hye-Shik Chang and Nick Coghlan, with -adaptations by Raymond Hettinger.} - -\end{seealso} - - -%====================================================================== -\section{PEP 314: Metadata for Python Software Packages v1.1\label{pep-314}} - -Some simple dependency support was added to Distutils. The -\function{setup()} function now has \code{requires}, \code{provides}, -and \code{obsoletes} keyword parameters. 
When you build a source -distribution using the \code{sdist} command, the dependency -information will be recorded in the \file{PKG-INFO} file. - -Another new keyword parameter is \code{download_url}, which should be -set to a URL for the package's source code. This means it's now -possible to look up an entry in the package index, determine the -dependencies for a package, and download the required packages. - -\begin{verbatim} -VERSION = '1.0' -setup(name='PyPackage', - version=VERSION, - requires=['numarray', 'zlib (>=1.1.4)'], - obsoletes=['OldPackage'], - download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz' - % VERSION), - ) -\end{verbatim} - -Another new enhancement to the Python package index at -\url{http://cheeseshop.python.org} is storing source and binary -archives for a package. The new \command{upload} Distutils command -will upload a package to the repository. - -Before a package can be uploaded, you must be able to build a -distribution using the \command{sdist} Distutils command. Once that -works, you can run \code{python setup.py upload} to add your package -to the PyPI archive. Optionally you can GPG-sign the package by -supplying the \longprogramopt{sign} and -\longprogramopt{identity} options. - -Package uploading was implemented by Martin von~L\"owis and Richard Jones. - -\begin{seealso} - -\seepep{314}{Metadata for Python Software Packages v1.1}{PEP proposed -and written by A.M. Kuchling, Richard Jones, and Fred Drake; -implemented by Richard Jones and Fred Drake.} - -\end{seealso} - - -%====================================================================== -\section{PEP 328: Absolute and Relative Imports\label{pep-328}} - -The simpler part of PEP 328 was implemented in Python 2.4: parentheses -could now be used to enclose the names imported from a module using -the \code{from ... import ...} statement, making it easier to import -many different names.
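As a minimal sketch of the parenthesized form just described, the following imports several names from the standard \module{os.path} module across two lines. The choice of module, names, and the sample path is mine and only for illustration.

```python
# Parentheses let the list of imported names span several lines
# without backslash continuations (accepted since Python 2.4).
from os.path import (basename, dirname,
                     join, splitext)

# Build a sample path and take it apart again using the imported names
path = join('pkg', 'main.py')
stem, extension = splitext(basename(path))
parent = dirname(path)
```

Without the parentheses, the same import would need either one long line or a trailing backslash at the end of the first line.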
- -The more complicated part has been implemented in Python 2.5: -importing a module can be specified to use absolute or -package-relative imports. The plan is to move toward making absolute -imports the default in future versions of Python. - -Let's say you have a package directory like this: -\begin{verbatim} -pkg/ -pkg/__init__.py -pkg/main.py -pkg/string.py -\end{verbatim} - -This defines a package named \module{pkg} containing the -\module{pkg.main} and \module{pkg.string} submodules. - -Consider the code in the \file{main.py} module. What happens if it -executes the statement \code{import string}? In Python 2.4 and -earlier, it will first look in the package's directory to perform a -relative import, finds \file{pkg/string.py}, imports the contents of -that file as the \module{pkg.string} module, and that module is bound -to the name \samp{string} in the \module{pkg.main} module's namespace. - -That's fine if \module{pkg.string} was what you wanted. But what if -you wanted Python's standard \module{string} module? There's no clean -way to ignore \module{pkg.string} and look for the standard module; -generally you had to look at the contents of \code{sys.modules}, which -is slightly unclean. -Holger Krekel's \module{py.std} package provides a tidier way to perform -imports from the standard library, \code{import py ; py.std.string.join()}, -but that package isn't available on all Python installations. - -Reading code which relies on relative imports is also less clear, -because a reader may be confused about which module, \module{string} -or \module{pkg.string}, is intended to be used. Python users soon -learned not to duplicate the names of standard library modules in the -names of their packages' submodules, but you can't protect against -having your submodule's name being used for a new module added in a -future version of Python. 
- -In Python 2.5, you can switch \keyword{import}'s behaviour to -absolute imports using a \code{from __future__ import absolute_import} -directive. This absolute-import behaviour will become the default in -a future version (probably Python 2.7). Once absolute imports -are the default, \code{import string} will -always find the standard library's version. -It's suggested that users should begin using absolute imports as much -as possible, so it's preferable to begin writing \code{from pkg import -string} in your code. - -Relative imports are still possible by adding a leading period -to the module name when using the \code{from ... import} form: - -\begin{verbatim} -# Import names from pkg.string -from .string import name1, name2 -# Import pkg.string -from . import string -\end{verbatim} - -This imports the \module{string} module relative to the current -package, so in \module{pkg.main} this will import \var{name1} and -\var{name2} from \module{pkg.string}. Additional leading periods -perform the relative import starting from the parent of the current -package. For example, code in the \module{A.B.C} module can do: - -\begin{verbatim} -from . import D # Imports A.B.D -from .. import E # Imports A.E -from ..F import G # Imports A.F.G -\end{verbatim} - -Leading periods cannot be used with the \code{import \var{modname}} -form of the import statement, only the \code{from ... import} form. - -\begin{seealso} - -\seepep{328}{Imports: Multi-Line and Absolute/Relative} -{PEP written by Aahz; implemented by Thomas Wouters.} - -\seeurl{http://codespeak.net/py/current/doc/index.html} -{The py library by Holger Krekel, which contains the \module{py.std} package.} - -\end{seealso} - - -%====================================================================== -\section{PEP 338: Executing Modules as Scripts\label{pep-338}} - -The \programopt{-m} switch added in Python 2.4 to execute a module as -a script gained a few more abilities. 
Instead of being implemented in -C code inside the Python interpreter, the switch now uses an -implementation in a new module, \module{runpy}. - -The \module{runpy} module implements a more sophisticated import -mechanism so that it's now possible to run modules in a package such -as \module{pychecker.checker}. The module also supports alternative -import mechanisms such as the \module{zipimport} module. This means -you can add a .zip archive's path to \code{sys.path} and then use the -\programopt{-m} switch to execute code from the archive. - - -\begin{seealso} - -\seepep{338}{Executing modules as scripts}{PEP written and -implemented by Nick Coghlan.} - -\end{seealso} - - -%====================================================================== -\section{PEP 341: Unified try/except/finally\label{pep-341}} - -Until Python 2.5, the \keyword{try} statement came in two -flavours. You could use a \keyword{finally} block to ensure that code -is always executed, or one or more \keyword{except} blocks to catch -specific exceptions. You couldn't combine both \keyword{except} blocks and a -\keyword{finally} block, because generating the right bytecode for the -combined version was complicated and it wasn't clear what the -semantics of the combined statement should be. - -Guido van~Rossum spent some time working with Java, which does support the -equivalent of combining \keyword{except} blocks and a -\keyword{finally} block, and this clarified what the statement should -mean. In Python 2.5, you can now write: - -\begin{verbatim} -try: - block-1 ... -except Exception1: - handler-1 ... -except Exception2: - handler-2 ... -else: - else-block -finally: - final-block -\end{verbatim} - -The code in \var{block-1} is executed. If the code raises an -exception, the various \keyword{except} blocks are tested: if the -exception is of class \class{Exception1}, \var{handler-1} is executed; -otherwise if it's of class \class{Exception2}, \var{handler-2} is -executed, and so forth. 
If no exception is raised, the -\var{else-block} is executed. - -No matter what happened previously, the \var{final-block} is executed -once the code block is complete and any raised exceptions handled. -Even if there's an error in an exception handler or the -\var{else-block} and a new exception is raised, the -code in the \var{final-block} is still run. - -\begin{seealso} - -\seepep{341}{Unifying try-except and try-finally}{PEP written by Georg Brandl; -implementation by Thomas Lee.} - -\end{seealso} - - -%====================================================================== -\section{PEP 342: New Generator Features\label{pep-342}} - -Python 2.5 adds a simple way to pass values \emph{into} a generator. -As introduced in Python 2.3, generators only produce output; once a -generator's code was invoked to create an iterator, there was no way to -pass any new information into the function when its execution is -resumed. Sometimes the ability to pass in some information would be -useful. Hackish solutions to this include making the generator's code -look at a global variable and then changing the global variable's -value, or passing in some mutable object that callers then modify. - -To refresh your memory of basic generators, here's a simple example: - -\begin{verbatim} -def counter (maximum): - i = 0 - while i < maximum: - yield i - i += 1 -\end{verbatim} - -When you call \code{counter(10)}, the result is an iterator that -returns the values from 0 up to 9. On encountering the -\keyword{yield} statement, the iterator returns the provided value and -suspends the function's execution, preserving the local variables. -Execution resumes on the following call to the iterator's -\method{next()} method, picking up after the \keyword{yield} statement. - -In Python 2.3, \keyword{yield} was a statement; it didn't return any -value. 
In 2.5, \keyword{yield} is now an expression, returning a -value that can be assigned to a variable or otherwise operated on: - -\begin{verbatim} -val = (yield i) -\end{verbatim} - -I recommend that you always put parentheses around a \keyword{yield} -expression when you're doing something with the returned value, as in -the above example. The parentheses aren't always necessary, but it's -easier to always add them instead of having to remember when they're -needed. - -(\pep{342} explains the exact rules, which are that a -\keyword{yield}-expression must always be parenthesized except when it -is the top-level expression on the right-hand side of an -assignment. This means you can write \code{val = yield i} but have to -use parentheses when there's an operation, as in \code{val = (yield i) -+ 12}.) - -Values are sent into a generator by calling its -\method{send(\var{value})} method. The generator's code is then -resumed and the \keyword{yield} expression returns the specified -\var{value}. If the regular \method{next()} method is called, the -\keyword{yield} returns \constant{None}. - -Here's the previous example, modified to allow changing the value of -the internal counter. - -\begin{verbatim} -def counter (maximum): - i = 0 - while i < maximum: - val = (yield i) - # If value provided, change counter - if val is not None: - i = val - else: - i += 1 -\end{verbatim} - -And here's an example of changing the counter: - -\begin{verbatim} ->>> it = counter(10) ->>> print it.next() -0 ->>> print it.next() -1 ->>> print it.send(8) -8 ->>> print it.next() -9 ->>> print it.next() -Traceback (most recent call last): - File "t.py", line 15, in ? - print it.next() -StopIteration -\end{verbatim} - -\keyword{yield} will usually return \constant{None}, so you -should always check for this case. Don't just use its value in -expressions unless you're sure that the \method{send()} method -will be the only method used to resume your generator function.
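The session above can also be checked as a script. The sketch below repeats the \function{counter()} generator and assumes a current Python 3 interpreter rather than 2.5: there the \method{next()} method is spelled \code{next(it)}, while \method{send()} is unchanged. The variable names are mine.

```python
def counter(maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If a value was sent in, jump the counter to it
        if val is not None:
            i = val
        else:
            i += 1

it = counter(10)
first = next(it)        # runs to the first yield
second = next(it)       # yield returned None, so i was incremented
jumped = it.send(8)     # yield returned 8, so i was set to 8
last = next(it)         # yield returned None again, so i became 9

# The loop condition now fails; resuming the generator raises StopIteration
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
```

Note that \method{send()} both delivers a value and advances the generator to its next \keyword{yield}, which is why the call itself returns 8.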
- -In addition to \method{send()}, there are two other new methods on -generators: - -\begin{itemize} - - \item \method{throw(\var{type}, \var{value}=None, - \var{traceback}=None)} is used to raise an exception inside the - generator; the exception is raised by the \keyword{yield} expression - where the generator's execution is paused. - - \item \method{close()} raises a new \exception{GeneratorExit} - exception inside the generator to terminate the iteration. On - receiving this exception, the generator's code must either raise - \exception{GeneratorExit} or \exception{StopIteration}. Catching - the \exception{GeneratorExit} exception and returning a value is - illegal and will trigger a \exception{RuntimeError}; if the function - raises some other exception, that exception is propagated to the - caller. \method{close()} will also be called by Python's garbage - collector when the generator is garbage-collected. - - If you need to run cleanup code when a \exception{GeneratorExit} occurs, - I suggest using a \code{try: ... finally:} suite instead of - catching \exception{GeneratorExit}. - -\end{itemize} - -The cumulative effect of these changes is to turn generators from -one-way producers of information into both producers and consumers. - -Generators also become \emph{coroutines}, a more generalized form of -subroutines. Subroutines are entered at one point and exited at -another point (the top of the function, and a \keyword{return} -statement), but coroutines can be entered, exited, and resumed at -many different points (the \keyword{yield} statements). We'll have to -figure out patterns for using coroutines effectively in Python. - -The addition of the \method{close()} method has one side effect that -isn't obvious. \method{close()} is called when a generator is -garbage-collected, so this means the generator's code gets one last -chance to run before the generator is destroyed. 
This last chance -means that \code{try...finally} statements in generators can now be -guaranteed to work; the \keyword{finally} clause will now always get a -chance to run. The syntactic restriction that you couldn't mix -\keyword{yield} statements with a \code{try...finally} suite has -therefore been removed. This seems like a minor bit of language -trivia, but using generators and \code{try...finally} is actually -necessary in order to implement the \keyword{with} statement -described by PEP 343. I'll look at this new statement in the following -section. - -Another even more esoteric effect of this change: previously, the -\member{gi_frame} attribute of a generator was always a frame object. -It's now possible for \member{gi_frame} to be \code{None} -once the generator has been exhausted. - -\begin{seealso} - -\seepep{342}{Coroutines via Enhanced Generators}{PEP written by -Guido van~Rossum and Phillip J. Eby; -implemented by Phillip J. Eby. Includes examples of -some fancier uses of generators as coroutines. - -Earlier versions of these features were proposed in -\pep{288} by Raymond Hettinger and \pep{325} by Samuele Pedroni. -} - -\seeurl{http://en.wikipedia.org/wiki/Coroutine}{The Wikipedia entry for -coroutines.} - -\seeurl{http://www.sidhe.org/\~{}dan/blog/archives/000178.html}{An -explanation of coroutines from a Perl point of view, written by Dan -Sugalski.} - -\end{seealso} - - -%====================================================================== -\section{PEP 343: The 'with' statement\label{pep-343}} - -The '\keyword{with}' statement clarifies code that previously would -use \code{try...finally} blocks to ensure that clean-up code is -executed. In this section, I'll discuss the statement as it will -commonly be used. In the next section, I'll examine the -implementation details and show how to write objects for use with this -statement. 
- -The '\keyword{with}' statement is a new control-flow structure whose -basic form is: - -\begin{verbatim} -with expression [as variable]: - with-block -\end{verbatim} - -The expression is evaluated, and it should result in an object that -supports the context management protocol (that is, has \method{__enter__()} -and \method{__exit__()} methods). - -The object's \method{__enter__()} is called before \var{with-block} is -executed and therefore can run set-up code. It also may return a value -that is bound to the name \var{variable}, if given. (Note carefully -that \var{variable} is \emph{not} assigned the result of \var{expression}.) - -After execution of the \var{with-block} is finished, the object's -\method{__exit__()} method is called, even if the block raised an exception, -and can therefore run clean-up code. - -To enable the statement in Python 2.5, you need to add the following -directive to your module: - -\begin{verbatim} -from __future__ import with_statement -\end{verbatim} - -The statement will always be enabled in Python 2.6. - -Some standard Python objects now support the context management -protocol and can be used with the '\keyword{with}' statement. File -objects are one example: - -\begin{verbatim} -with open('/etc/passwd', 'r') as f: - for line in f: - print line - ... more processing code ... -\end{verbatim} - -After this statement has executed, the file object in \var{f} will -have been automatically closed, even if the \keyword{for} loop -raised an exception part-way through the block. - -\note{In this case, \var{f} is the same object created by - \function{open()}, because \method{file.__enter__()} returns - \var{self}.} - -The \module{threading} module's locks and condition variables -also support the '\keyword{with}' statement: - -\begin{verbatim} -lock = threading.Lock() -with lock: - # Critical section of code - ... -\end{verbatim} - -The lock is acquired before the block is executed and always released once -the block is complete.
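To see that the lock really is released when the block exits because of an exception, here is a small sketch. The \exception{RuntimeError} is raised only for illustration, and the lock's \method{locked()} method is used to inspect its state. On Python 2.5 itself the module would also need the \code{from __future__ import with_statement} directive shown earlier.

```python
import threading

lock = threading.Lock()
held_inside = None

try:
    with lock:
        # The lock is held for the duration of the block
        held_inside = lock.locked()
        raise RuntimeError('failure inside the critical section')
except RuntimeError:
    pass

# The lock's __exit__() released it even though the block raised
held_after = lock.locked()
```

Written with \code{try...finally}, the same guarantee would take an explicit \code{lock.acquire()} before the block and a \code{lock.release()} in the \keyword{finally} clause.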
- -The new \function{localcontext()} function in the \module{decimal} module -makes it easy to save and restore the current decimal context, which -encapsulates the desired precision and rounding characteristics for -computations: - -\begin{verbatim} -from decimal import Decimal, Context, localcontext - -# Displays with default precision of 28 digits -v = Decimal('578') -print v.sqrt() - -with localcontext(Context(prec=16)): - # All code in this block uses a precision of 16 digits. - # The original context is restored on exiting the block. - print v.sqrt() -\end{verbatim} - -\subsection{Writing Context Managers\label{context-managers}} - -Under the hood, the '\keyword{with}' statement is fairly complicated. -Most people will only use '\keyword{with}' in company with existing -objects and don't need to know these details, so you can skip the rest -of this section if you like. Authors of new objects will need to -understand the details of the underlying implementation and should -keep reading. - -A high-level explanation of the context management protocol is: - -\begin{itemize} - -\item The expression is evaluated and should result in an object -called a ``context manager''. The context manager must have -\method{__enter__()} and \method{__exit__()} methods. - -\item The context manager's \method{__enter__()} method is called. The value -returned is assigned to \var{VAR}. If no \code{'as \var{VAR}'} clause -is present, the value is simply discarded. - -\item The code in \var{BLOCK} is executed. - -\item If \var{BLOCK} raises an exception, the -\method{__exit__(\var{type}, \var{value}, \var{traceback})} is called -with the exception details, the same values returned by -\function{sys.exc_info()}. The method's return value controls whether -the exception is re-raised: any false value re-raises the exception, -and \code{True} will result in suppressing it. 
You'll only rarely -want to suppress the exception, because if you do -the author of the code containing the -'\keyword{with}' statement will never realize anything went wrong. - -\item If \var{BLOCK} didn't raise an exception, -the \method{__exit__()} method is still called, -but \var{type}, \var{value}, and \var{traceback} are all \code{None}. - -\end{itemize} - -Let's think through an example. I won't present detailed code but -will only sketch the methods necessary for a database that supports -transactions. - -(For people unfamiliar with database terminology: a set of changes to -the database are grouped into a transaction. Transactions can be -either committed, meaning that all the changes are written into the -database, or rolled back, meaning that the changes are all discarded -and the database is unchanged. See any database textbook for more -information.) - -Let's assume there's an object representing a database connection. -Our goal will be to let the user write code like this: - -\begin{verbatim} -db_connection = DatabaseConnection() -with db_connection as cursor: - cursor.execute('insert into ...') - cursor.execute('delete from ...') - # ... more operations ... -\end{verbatim} - -The transaction should be committed if the code in the block -runs flawlessly or rolled back if there's an exception. -Here's the basic interface -for \class{DatabaseConnection} that I'll assume: - -\begin{verbatim} -class DatabaseConnection: - # Database interface - def cursor (self): - "Returns a cursor object and starts a new transaction" - def commit (self): - "Commits current transaction" - def rollback (self): - "Rolls back current transaction" -\end{verbatim} - -The \method {__enter__()} method is pretty easy, having only to start -a new transaction. For this application the resulting cursor object -would be a useful result, so the method will return it. The user can -then add \code{as cursor} to their '\keyword{with}' statement to bind -the cursor to a variable name. 
- -\begin{verbatim} -class DatabaseConnection: - ... - def __enter__ (self): - # Code to start a new transaction - cursor = self.cursor() - return cursor -\end{verbatim} - -The \method{__exit__()} method is the most complicated because it's -where most of the work has to be done. The method has to check if an -exception occurred. If there was no exception, the transaction is -committed. The transaction is rolled back if there was an exception. - -In the code below, execution will just fall off the end of the -function, returning the default value of \code{None}. \code{None} is -false, so the exception will be re-raised automatically. If you -wished, you could be more explicit and add a \keyword{return} -statement at the marked location. - -\begin{verbatim} -class DatabaseConnection: - ... - def __exit__ (self, type, value, tb): - if tb is None: - # No exception, so commit - self.commit() - else: - # Exception occurred, so rollback. - self.rollback() - # return False -\end{verbatim} - - -\subsection{The contextlib module\label{module-contextlib}} - -The new \module{contextlib} module provides some functions and a -decorator that are useful for writing objects for use with the -'\keyword{with}' statement. - -The decorator is called \function{contextmanager}, and lets you write -a single generator function instead of defining a new class. The generator -should yield exactly one value. The code up to the \keyword{yield} -will be executed as the \method{__enter__()} method, and the value -yielded will be the method's return value that will get bound to the -variable in the '\keyword{with}' statement's \keyword{as} clause, if -any. The code after the \keyword{yield} will be executed in the -\method{__exit__()} method. Any exception raised in the block will be -raised by the \keyword{yield} statement. 
- -Our database example from the previous section could be written -using this decorator as: - -\begin{verbatim} -from contextlib import contextmanager - -@contextmanager -def db_transaction (connection): - cursor = connection.cursor() - try: - yield cursor - except: - connection.rollback() - raise - else: - connection.commit() - -db = DatabaseConnection() -with db_transaction(db) as cursor: - ... -\end{verbatim} - -The \module{contextlib} module also has a \function{nested(\var{mgr1}, -\var{mgr2}, ...)} function that combines a number of context managers so you -don't need to write nested '\keyword{with}' statements. In this -example, the single '\keyword{with}' statement both starts a database -transaction and acquires a thread lock: - -\begin{verbatim} -lock = threading.Lock() -with nested (db_transaction(db), lock) as (cursor, locked): - ... -\end{verbatim} - -Finally, the \function{closing(\var{object})} function -returns \var{object} so that it can be bound to a variable, -and calls \code{\var{object}.close()} at the end of the block. - -\begin{verbatim} -import urllib, sys -from contextlib import closing - -with closing(urllib.urlopen('http://www.yahoo.com')) as f: - for line in f: - sys.stdout.write(line) -\end{verbatim} - -\begin{seealso} - -\seepep{343}{The ``with'' statement}{PEP written by Guido van~Rossum -and Nick Coghlan; implemented by Mike Bland, Guido van~Rossum, and -Neal Norwitz. 
The PEP shows the code generated for a '\keyword{with}' -statement, which can be helpful in learning how the statement works.} - -\seeurl{../lib/module-contextlib.html}{The documentation -for the \module{contextlib} module.} - -\end{seealso} - - -%====================================================================== -\section{PEP 352: Exceptions as New-Style Classes\label{pep-352}} - -Exception classes can now be new-style classes, not just classic -classes, and the built-in \exception{Exception} class and all the -standard built-in exceptions (\exception{NameError}, -\exception{ValueError}, etc.) are now new-style classes. - -The inheritance hierarchy for exceptions has been rearranged a bit. -In 2.5, the inheritance relationships are: - -\begin{verbatim} -BaseException # New in Python 2.5 -|- KeyboardInterrupt -|- SystemExit -|- Exception - |- (all other current built-in exceptions) -\end{verbatim} - -This rearrangement was done because people often want to catch all -exceptions that indicate program errors. \exception{KeyboardInterrupt} and -\exception{SystemExit} aren't errors, though, and usually represent an explicit -action such as the user hitting Control-C or code calling -\function{sys.exit()}. A bare \code{except:} will catch all exceptions, -so you commonly need to list \exception{KeyboardInterrupt} and -\exception{SystemExit} in order to re-raise them. The usual pattern is: - -\begin{verbatim} -try: - ... -except (KeyboardInterrupt, SystemExit): - raise -except: - # Log error... - # Continue running program... -\end{verbatim} - -In Python 2.5, you can now write \code{except Exception} to achieve -the same result, catching all the exceptions that usually indicate errors -but leaving \exception{KeyboardInterrupt} and -\exception{SystemExit} alone. As in previous versions, -a bare \code{except:} still catches all exceptions. 
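The effect of the new hierarchy can be checked with a small sketch. The helper name \code{classify} is hypothetical; it simply raises the exception it is given and reports which handler caught it.

```python
def classify(exc):
    """Raise exc and report which of the two handlers catches it."""
    try:
        raise exc
    except Exception:
        # Ordinary errors such as ValueError end up here.
        return "caught by 'except Exception'"
    except BaseException:
        # KeyboardInterrupt and SystemExit skip the handler above.
        return "passed over by 'except Exception'"

results = [(type(e).__name__, classify(e))
           for e in (ValueError("bad value"), KeyboardInterrupt(), SystemExit(1))]
```

Only the \exception{ValueError} is handled by the \code{except Exception} clause; the other two fall through to the \exception{BaseException} handler, which is what lets a program log errors without swallowing a Control-C.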
- -The goal for Python 3.0 is to require any class raised as an exception -to derive from \exception{BaseException} or some descendant of -\exception{BaseException}, and future releases in the -Python 2.x series may begin to enforce this constraint. Therefore, I -suggest you begin making all your exception classes derive from -\exception{Exception} now. It's been suggested that the bare -\code{except:} form should be removed in Python 3.0, but Guido van~Rossum -hasn't decided whether to do this or not. - -Raising of strings as exceptions, as in the statement \code{raise -"Error occurred"}, is deprecated in Python 2.5 and will trigger a -warning. The aim is to be able to remove the string-exception feature -in a few releases. - - -\begin{seealso} - -\seepep{352}{Required Superclass for Exceptions}{PEP written by -Brett Cannon and Guido van~Rossum; implemented by Brett Cannon.} - -\end{seealso} - - -%====================================================================== -\section{PEP 353: Using ssize_t as the index type\label{pep-353}} - -A wide-ranging change to Python's C API, using a new -\ctype{Py_ssize_t} type definition instead of \ctype{int}, -will permit the interpreter to handle more data on 64-bit platforms. -This change doesn't affect Python's capacity on 32-bit platforms. - -Various pieces of the Python interpreter used C's \ctype{int} type to -store sizes or counts; for example, the number of items in a list or -tuple was stored in an \ctype{int}. The C compilers for most 64-bit -platforms still define \ctype{int} as a 32-bit type, so that meant -that lists could only hold up to \code{2**31 - 1} = 2147483647 items. -(There are actually a few different programming models that 64-bit C -compilers can use -- see -\url{http://www.unix.org/version2/whatsnew/lp64_wp.html} for a -discussion -- but the most commonly available model leaves \ctype{int} -as 32 bits.) 
- -A limit of 2147483647 items doesn't really matter on a 32-bit platform -because you'll run out of memory before hitting the length limit. -Each list item requires space for a pointer, which is 4 bytes, plus -space for a \ctype{PyObject} representing the item. 2147483647*4 is -already more bytes than a 32-bit address space can contain. - -It's possible to address that much memory on a 64-bit platform, -however. The pointers for a list that size would only require 16~GiB -of space, so it's not unreasonable that Python programmers might -construct lists that large. Therefore, the Python interpreter had to -be changed to use some type other than \ctype{int}, and this will be a -64-bit type on 64-bit platforms. The change will cause -incompatibilities on 64-bit machines, so it was deemed worth making -the transition now, while the number of 64-bit users is still -relatively small. (In 5 or 10 years, we may \emph{all} be on 64-bit -machines, and the transition would be more painful then.) - -This change most strongly affects authors of C extension modules. -Python strings and container types such as lists and tuples -now use \ctype{Py_ssize_t} to store their size. -Functions such as \cfunction{PyList_Size()} -now return \ctype{Py_ssize_t}. Code in extension modules -may therefore need to have some variables changed to -\ctype{Py_ssize_t}. - -The \cfunction{PyArg_ParseTuple()} and \cfunction{Py_BuildValue()} functions -have a new conversion code, \samp{n}, for \ctype{Py_ssize_t}. -\cfunction{PyArg_ParseTuple()}'s \samp{s\#} and \samp{t\#} still output -\ctype{int} by default, but you can define the macro -\csimplemacro{PY_SSIZE_T_CLEAN} before including \file{Python.h} -to make them return \ctype{Py_ssize_t}. - -\pep{353} has a section on conversion guidelines that -extension authors should read to learn about supporting 64-bit -platforms. 
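The size arithmetic used earlier in this section can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the helper \code{pointer_bytes} is hypothetical, and the 4-byte and 8-byte pointer sizes are the usual values for 32-bit and 64-bit platforms rather than something queried from the interpreter.

```python
MAX_INT_ITEMS = 2**31 - 1   # largest count a signed 32-bit C int can hold
GIB = 2**30                 # bytes in one GiB

def pointer_bytes(items, pointer_size):
    """Bytes needed just for a list's array of item pointers."""
    return items * pointer_size

bytes_32 = pointer_bytes(MAX_INT_ITEMS, 4)   # assumed 4-byte pointers
bytes_64 = pointer_bytes(MAX_INT_ITEMS, 8)   # assumed 8-byte pointers

# A 32-bit address space holds 2**32 bytes in total, so a list of
# MAX_INT_ITEMS pointers alone could never fit on such a platform.
exceeds_32bit_space = bytes_32 > 2**32

# On a 64-bit platform the same list needs just under 16 GiB of pointers.
gib_on_64bit = bytes_64 / float(GIB)
```

The first result explains why the old limit never mattered on 32-bit machines, and the second is where the 16~GiB figure comes from.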
- -\begin{seealso} - -\seepep{353}{Using ssize_t as the index type}{PEP written and implemented by Martin von~L\"owis.} - -\end{seealso} - - -%====================================================================== -\section{PEP 357: The '__index__' method\label{pep-357}} - -The NumPy developers had a problem that could only be solved by adding -a new special method, \method{__index__}. When using slice notation, -as in \code{[\var{start}:\var{stop}:\var{step}]}, the values of the -\var{start}, \var{stop}, and \var{step} indexes must all be either -integers or long integers. NumPy defines a variety of specialized -integer types corresponding to unsigned and signed integers of 8, 16, -32, and 64 bits, but there was no way to signal that these types could -be used as slice indexes. - -Slicing can't just use the existing \method{__int__} method because -that method is also used to implement coercion to integers. If -slicing used \method{__int__}, floating-point numbers would also -become legal slice indexes and that's clearly an undesirable -behaviour. - -Instead, a new special method called \method{__index__} was added. It -takes no arguments and returns an integer giving the slice index to -use. For example: - -\begin{verbatim} -class C: - def __index__ (self): - return self.value -\end{verbatim} - -The return value must be either a Python integer or long integer. -The interpreter will check that the type returned is correct, and -raises a \exception{TypeError} if this requirement isn't met. - -A corresponding \member{nb_index} slot was added to the C-level -\ctype{PyNumberMethods} structure to let C extensions implement this -protocol. \cfunction{PyNumber_Index(\var{obj})} can be used in -extension code to call the \method{__index__} function and retrieve -its result. 
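Here is a slightly fuller sketch of the protocol in use. The \class{Offset} class is a hypothetical stand-in for an integer-like type such as NumPy's; \function{operator.index()}, which isn't mentioned above, is the Python-level counterpart of \cfunction{PyNumber_Index()} and was also added in 2.5.

```python
import operator

class Offset(object):
    """Hypothetical integer-like type that can be used as a slice index."""
    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Must return a real integer, not another integer-like object.
        return self.value

letters = list("abcdefgh")

# The slice bounds are converted by calling __index__() on each object.
middle = letters[Offset(2):Offset(5)]

# operator.index() invokes the same protocol explicitly.
as_int = operator.index(Offset(7))

# Floats define __int__() but not __index__(), so they are still rejected.
try:
    letters[1.0:3]
    float_allowed = True
except TypeError:
    float_allowed = False
```

Because floats don't implement \method{__index__}, the last slice raises \exception{TypeError}, which is exactly the behaviour that reusing \method{__int__} would have lost.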
- -\begin{seealso} - -\seepep{357}{Allowing Any Object to be Used for Slicing}{PEP written -and implemented by Travis Oliphant.} - -\end{seealso} - - -%====================================================================== -\section{Other Language Changes\label{other-lang}} - -Here are all of the changes that Python 2.5 makes to the core Python -language. - -\begin{itemize} - -\item The \class{dict} type has a new hook for letting subclasses -provide a default value when a key isn't contained in the dictionary. -When a key isn't found, the dictionary's -\method{__missing__(\var{key})} -method will be called. This hook is used to implement -the new \class{defaultdict} class in the \module{collections} -module. The following example defines a dictionary -that returns zero for any missing key: - -\begin{verbatim} -class zerodict (dict): - def __missing__ (self, key): - return 0 - -d = zerodict({1:1, 2:2}) -print d[1], d[2] # Prints 1, 2 -print d[3], d[4] # Prints 0, 0 -\end{verbatim} - -\item Both 8-bit and Unicode strings have new \method{partition(sep)} -and \method{rpartition(sep)} methods that simplify a common use case. - -The \method{find(S)} method is often used to get an index which is -then used to slice the string and obtain the pieces that are before -and after the separator. -\method{partition(sep)} condenses this -pattern into a single method call that returns a 3-tuple containing -the substring before the separator, the separator itself, and the -substring after the separator. If the separator isn't found, the -first element of the tuple is the entire string and the other two -elements are empty. \method{rpartition(sep)} also returns a 3-tuple -but starts searching from the end of the string; the \samp{r} stands -for 'reverse'. 
- -Some examples: - -\begin{verbatim} ->>> ('http://www.python.org').partition('://') -('http', '://', 'www.python.org') ->>> ('file:/usr/share/doc/index.html').partition('://') -('file:/usr/share/doc/index.html', '', '') ->>> (u'Subject: a quick question').partition(':') -(u'Subject', u':', u' a quick question') ->>> 'www.python.org'.rpartition('.') -('www.python', '.', 'org') ->>> 'www.python.org'.rpartition(':') -('', '', 'www.python.org') -\end{verbatim} - -(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.) - -\item The \method{startswith()} and \method{endswith()} methods -of string types now accept tuples of strings to check for. - -\begin{verbatim} -def is_image_file (filename): - return filename.endswith(('.gif', '.jpg', '.tiff')) -\end{verbatim} - -(Implemented by Georg Brandl following a suggestion by Tom Lynn.) -% RFE #1491485 - -\item The \function{min()} and \function{max()} built-in functions -gained a \code{key} keyword parameter analogous to the \code{key} -argument for \method{sort()}. This parameter supplies a function that -takes a single argument and is called for every value in the list; -\function{min()}/\function{max()} will return the element with the -smallest/largest return value from this function. -For example, to find the longest string in a list, you can do: - -\begin{verbatim} -L = ['medium', 'longest', 'short'] -# Prints 'longest' -print max(L, key=len) -# Prints 'short', because lexicographically 'short' has the largest value -print max(L) -\end{verbatim} - -(Contributed by Steven Bethard and Raymond Hettinger.) - -\item Two new built-in functions, \function{any()} and -\function{all()}, evaluate whether an iterator contains any true or -false values. \function{any()} returns \constant{True} if any value -returned by the iterator is true; otherwise it will return -\constant{False}. \function{all()} returns \constant{True} only if -all of the values returned by the iterator evaluate as true. 
-(Suggested by Guido van~Rossum, and implemented by Raymond Hettinger.) - -\item The result of a class's \method{__hash__()} method can now -be either a long integer or a regular integer. If a long integer is -returned, the hash of that value is taken. In earlier versions the -hash value was required to be a regular integer, but in 2.5 the -\function{id()} built-in was changed to always return non-negative -numbers, and users often seem to use \code{id(self)} in -\method{__hash__()} methods (though this is discouraged). -% Bug #1536021 - -\item ASCII is now the default encoding for modules. It's now -a syntax error if a module contains string literals with 8-bit -characters but doesn't have an encoding declaration. In Python 2.4 -this triggered a warning, not a syntax error. See \pep{263} -for how to declare a module's encoding; for example, you might add -a line like this near the top of the source file: - -\begin{verbatim} -# -*- coding: latin1 -*- -\end{verbatim} - -\item A new warning, \class{UnicodeWarning}, is triggered when -you attempt to compare a Unicode string and an 8-bit string -that can't be converted to Unicode using the default ASCII encoding. -The result of the comparison is false: - -\begin{verbatim} ->>> chr(128) == unichr(128) # Can't convert chr(128) to Unicode -__main__:1: UnicodeWarning: Unicode equal comparison failed - to convert both arguments to Unicode - interpreting them - as being unequal -False ->>> chr(127) == unichr(127) # chr(127) can be converted -True -\end{verbatim} - -Previously this would raise a \class{UnicodeDecodeError} exception, -but in 2.5 this could result in puzzling problems when accessing a -dictionary. If you looked up \code{unichr(128)} and \code{chr(128)} -was being used as a key, you'd get a \class{UnicodeDecodeError} -exception. Other changes in 2.5 resulted in this exception being -raised instead of suppressed by the code in \file{dictobject.c} that -implements dictionaries. 
- -Raising an exception for such a comparison is strictly correct, but -the change might have broken code, so instead -\class{UnicodeWarning} was introduced. - -(Implemented by Marc-Andr\'e Lemburg.) - -\item One error that Python programmers sometimes make is forgetting -to include an \file{__init__.py} module in a package directory. -Debugging this mistake can be confusing, and usually requires running -Python with the \programopt{-v} switch to log all the paths searched. -In Python 2.5, a new \exception{ImportWarning} warning is triggered when -an import would have picked up a directory as a package but no -\file{__init__.py} was found. This warning is silently ignored by default; -provide the \programopt{-Wd} option when running the Python executable -to display the warning message. -(Implemented by Thomas Wouters.) - -\item The list of base classes in a class definition can now be empty. -As an example, this is now legal: - -\begin{verbatim} -class C(): - pass -\end{verbatim} -(Implemented by Brett Cannon.) - -\end{itemize} - - -%====================================================================== -\subsection{Interactive Interpreter Changes\label{interactive}} - -In the interactive interpreter, \code{quit} and \code{exit} -have long been strings so that new users get a somewhat helpful message -when they try to quit: - -\begin{verbatim} ->>> quit -'Use Ctrl-D (i.e. EOF) to exit.' -\end{verbatim} - -In Python 2.5, \code{quit} and \code{exit} are now objects that still -produce string representations of themselves, but are also callable. -Newbies who try \code{quit()} or \code{exit()} will now exit the -interpreter as they expect. (Implemented by Georg Brandl.) - -The Python executable now accepts the standard long options -\longprogramopt{help} and \longprogramopt{version}; on Windows, -it also accepts the \programopt{/?} option for displaying a help message. -(Implemented by Georg Brandl.) 
- - -%====================================================================== -\subsection{Optimizations\label{opts}} - -Several of the optimizations were developed at the NeedForSpeed -sprint, an event held in Reykjavik, Iceland, from May 21--28 2006. -The sprint focused on speed enhancements to the CPython implementation -and was funded by EWT LLC with local support from CCP Games. Those -optimizations added at this sprint are specially marked in the -following list. - -\begin{itemize} - -\item When they were introduced -in Python 2.4, the built-in \class{set} and \class{frozenset} types -were built on top of Python's dictionary type. -In 2.5 the internal data structure has been customized for implementing sets, -and as a result sets will use a third less memory and are somewhat faster. -(Implemented by Raymond Hettinger.) - -\item The speed of some Unicode operations, such as finding -substrings, string splitting, and character map encoding and decoding, -has been improved. (Substring search and splitting improvements were -added by Fredrik Lundh and Andrew Dalke at the NeedForSpeed -sprint. Character maps were improved by Walter D\"orwald and -Martin von~L\"owis.) -% Patch 1313939, 1359618 - -\item The \function{long(\var{str}, \var{base})} function is now -faster on long digit strings because fewer intermediate results are -calculated. The peak is for strings of around 800--1000 digits where -the function is 6 times faster. -(Contributed by Alan McIntyre and committed at the NeedForSpeed sprint.) -% Patch 1442927 - -\item It's now illegal to mix iterating over a file -with \code{for line in \var{file}} and calling -the file object's \method{read()}/\method{readline()}/\method{readlines()} -methods. Iteration uses an internal buffer and the -\method{read*()} methods don't use that buffer. -Instead they would return the data following the buffer, causing the -data to appear out of order. 
Mixing iteration and these methods will -now trigger a \exception{ValueError} from the \method{read*()} method. -(Implemented by Thomas Wouters.) -% Patch 1397960 - -\item The \module{struct} module now compiles structure format -strings into an internal representation and caches this -representation, yielding a 20\% speedup. (Contributed by Bob Ippolito -at the NeedForSpeed sprint.) - -\item The \module{re} module got a 1 or 2\% speedup by switching to -Python's allocator functions instead of the system's -\cfunction{malloc()} and \cfunction{free()}. -(Contributed by Jack Diederich at the NeedForSpeed sprint.) - -\item The code generator's peephole optimizer now performs -simple constant folding in expressions. If you write something like -\code{a = 2+3}, the code generator will do the arithmetic and produce -code corresponding to \code{a = 5}. (Proposed and implemented -by Raymond Hettinger.) - -\item Function calls are now faster because code objects now keep -the most recently finished frame (a ``zombie frame'') in an internal -field of the code object, reusing it the next time the code object is -invoked. (Original patch by Michael Hudson, modified by Armin Rigo -and Richard Jones; committed at the NeedForSpeed sprint.) -% Patch 876206 - -Frame objects are also slightly smaller, which may improve cache locality -and reduce memory usage a bit. (Contributed by Neal Norwitz.) -% Patch 1337051 - -\item Python's built-in exceptions are now new-style classes, a change -that speeds up instantiation considerably. Exception handling in -Python 2.5 is therefore about 30\% faster than in 2.4. -(Contributed by Richard Jones, Georg Brandl and Sean Reifschneider at -the NeedForSpeed sprint.) - -\item Importing now caches the paths tried, recording whether -they exist or not so that the interpreter makes fewer -\cfunction{open()} and \cfunction{stat()} calls on startup. -(Contributed by Martin von~L\"owis and Georg Brandl.) 
-% Patch 921466 - -\end{itemize} - - -%====================================================================== -\section{New, Improved, and Removed Modules\label{modules}} - -The standard library received many enhancements and bug fixes in -Python 2.5. Here's a partial list of the most notable changes, sorted -alphabetically by module name. Consult the \file{Misc/NEWS} file in -the source tree for a more complete list of changes, or look through -the SVN logs for all the details. - -\begin{itemize} - -\item The \module{audioop} module now supports the a-LAW encoding, -and the code for u-LAW encoding has been improved. (Contributed by -Lars Immisch.) - -\item The \module{codecs} module gained support for incremental -codecs. The \function{codecs.lookup()} function now -returns a \class{CodecInfo} instance instead of a tuple. -\class{CodecInfo} instances behave like a 4-tuple to preserve backward -compatibility but also have the attributes \member{encode}, -\member{decode}, \member{incrementalencoder}, \member{incrementaldecoder}, -\member{streamwriter}, and \member{streamreader}. Incremental codecs -can receive input and produce output in multiple chunks; the output is -the same as if the entire input was fed to the non-incremental codec. -See the \module{codecs} module documentation for details. -(Designed and implemented by Walter D\"orwald.) -% Patch 1436130 - -\item The \module{collections} module gained a new type, -\class{defaultdict}, that subclasses the standard \class{dict} -type. The new type mostly behaves like a dictionary but constructs a -default value when a key isn't present, automatically adding it to the -dictionary for the requested key value. - -The first argument to \class{defaultdict}'s constructor is a factory -function that gets called whenever a key is requested but not found. -This factory function receives no arguments, so you can use built-in -type constructors such as \function{list()} or \function{int()}. 
For -example, -you can make an index of words based on their initial letter like this: - -\begin{verbatim} -from collections import defaultdict - -words = """Nel mezzo del cammin di nostra vita -mi ritrovai per una selva oscura -che la diritta via era smarrita""".lower().split() - -index = defaultdict(list) - -for w in words: - init_letter = w[0] - index[init_letter].append(w) -\end{verbatim} - -Printing \code{index} results in the following output: - -\begin{verbatim} -defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'], - 'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'], - 'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'], - 'p': ['per'], 's': ['selva', 'smarrita'], - 'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']}) -\end{verbatim} - -(Contributed by Guido van~Rossum.) - -\item The \class{deque} double-ended queue type supplied by the -\module{collections} module now has a \method{remove(\var{value})} -method that removes the first occurrence of \var{value} in the queue, -raising \exception{ValueError} if the value isn't found. -(Contributed by Raymond Hettinger.) - -\item New module: The \module{contextlib} module contains helper functions for use -with the new '\keyword{with}' statement. See -section~\ref{module-contextlib} for more about this module. - -\item New module: The \module{cProfile} module is a C implementation of -the existing \module{profile} module that has much lower overhead. -The module's interface is the same as \module{profile}: you run -\code{cProfile.run('main()')} to profile a function, can save profile -data to a file, etc. It's not yet known if the Hotshot profiler, -which is also written in C but doesn't match the \module{profile} -module's interface, will continue to be maintained in future versions -of Python. (Contributed by Armin Rigo.) - -Also, the \module{pstats} module for analyzing the data measured by -the profiler now supports directing the output to any file object -by supplying a \var{stream} argument to the \class{Stats} constructor. 
-(Contributed by Skip Montanaro.) - -\item The \module{csv} module, which parses files in -comma-separated value format, received several enhancements and a -number of bugfixes. You can now set the maximum size in bytes of a -field by calling the \method{csv.field_size_limit(\var{new_limit})} -function; omitting the \var{new_limit} argument will return the -currently-set limit. The \class{reader} class now has a -\member{line_num} attribute that counts the number of physical lines -read from the source; records can span multiple physical lines, so -\member{line_num} is not the same as the number of records read. - -The CSV parser is now stricter about multi-line quoted -fields. Previously, if a line ended within a quoted field without a -terminating newline character, a newline would be inserted into the -returned field. This behavior caused problems when reading files that -contained carriage return characters within fields, so the code was -changed to return the field without inserting newlines. As a -consequence, if newlines embedded within fields are important, the -input should be split into lines in a manner that preserves the -newline characters. - -(Contributed by Skip Montanaro and Andrew McNamara.) - -\item The \class{datetime} class in the \module{datetime} -module now has a \method{strptime(\var{string}, \var{format})} -method for parsing date strings, contributed by Josh Spoerri. -It uses the same format characters as \function{time.strptime()} and -\function{time.strftime()}: - -\begin{verbatim} -from datetime import datetime - -ts = datetime.strptime('10:13:15 2006-03-07', - '%H:%M:%S %Y-%m-%d') -\end{verbatim} - -\item The \method{SequenceMatcher.get_matching_blocks()} method -in the \module{difflib} module now guarantees to return a minimal list -of blocks describing matching subsequences. Previously, the algorithm would -occasionally break a block of matching elements into two list entries. -(Enhancement by Tim Peters.) 
- -\item The \module{doctest} module gained a \code{SKIP} option that -keeps an example from being executed at all. This is intended for -code snippets that are usage examples intended for the reader and -aren't actually test cases. - -An \var{encoding} parameter was added to the \function{testfile()} -function and the \class{DocFileSuite} class to specify the file's -encoding. This makes it easier to use non-ASCII characters in -tests contained within a docstring. (Contributed by Bjorn Tillenius.) -% Patch 1080727 - -\item The \module{email} package has been updated to version 4.0. -% XXX need to provide some more detail here -(Contributed by Barry Warsaw.) - -\item The \module{fileinput} module was made more flexible. -Unicode filenames are now supported, and a \var{mode} parameter that -defaults to \code{"r"} was added to the -\function{input()} function to allow opening files in binary or -universal-newline mode. Another new parameter, \var{openhook}, -lets you use a function other than \function{open()} -to open the input files. Once you're iterating over -the set of files, the \class{FileInput} object's new -\method{fileno()} returns the file descriptor for the currently opened file. -(Contributed by Georg Brandl.) - -\item In the \module{gc} module, the new \function{get_count()} function -returns a 3-tuple containing the current collection counts for the -three GC generations. This is accounting information for the garbage -collector; when these counts reach a specified threshold, a garbage -collection sweep will be made. The existing \function{gc.collect()} -function now takes an optional \var{generation} argument of 0, 1, or 2 -to specify which generation to collect. -(Contributed by Barry Warsaw.) - -\item The \function{nsmallest()} and -\function{nlargest()} functions in the \module{heapq} module -now support a \code{key} keyword parameter similar to the one -provided by the \function{min()}/\function{max()} functions -and the \method{sort()} methods. 
For example: - -\begin{verbatim} ->>> import heapq ->>> L = ['short', 'medium', 'longest', 'longer still'] ->>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically -['longer still', 'longest'] ->>> heapq.nsmallest(2, L, key=len) # Return two shortest elements -['short', 'medium'] -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item The \function{itertools.islice()} function now accepts -\code{None} for the start and step arguments. This makes it more -compatible with the attributes of slice objects, so that you can now write -the following: - -\begin{verbatim} -s = slice(5) # Create slice object -itertools.islice(iterable, s.start, s.stop, s.step) -\end{verbatim} - -(Contributed by Raymond Hettinger.) - -\item The \function{format()} function in the \module{locale} module -has been modified and two new functions were added, -\function{format_string()} and \function{currency()}. - -The \function{format()} function's \var{format} parameter could -previously be a string as long as no more than one \%char specifier -appeared; now the parameter must be exactly one \%char specifier with -no surrounding text. An optional \var{monetary} parameter was also -added which, if \code{True}, will use the locale's rules for -formatting currency in placing a separator between groups of three -digits. - -To format strings with multiple \%char specifiers, use the new -\function{format_string()} function that works like \function{format()} -but also supports mixing \%char specifiers with -arbitrary text. - -A new \function{currency()} function was also added that formats a -number according to the current locale's settings. - -(Contributed by Georg Brandl.) -% Patch 1180296 - -\item The \module{mailbox} module underwent a massive rewrite to add -the capability to modify mailboxes in addition to reading them. 
A new -set of classes that include \class{mbox}, \class{MH}, and -\class{Maildir} are used to read mailboxes, and have an -\method{add(\var{message})} method to add messages, -\method{remove(\var{key})} to remove messages, and -\method{lock()}/\method{unlock()} to lock/unlock the mailbox. The -following example converts a maildir-format mailbox into an mbox-format one: - -\begin{verbatim} -import mailbox - -# 'factory=None' uses email.Message.Message as the class representing -# individual messages. -src = mailbox.Maildir('maildir', factory=None) -dest = mailbox.mbox('/tmp/mbox') - -for msg in src: - dest.add(msg) -\end{verbatim} - -(Contributed by Gregory K. Johnson. Funding was provided by Google's -2005 Summer of Code.) - -\item New module: the \module{msilib} module allows creating -Microsoft Installer \file{.msi} files and CAB files. Some support -for reading the \file{.msi} database is also included. -(Contributed by Martin von~L\"owis.) - -\item The \module{nis} module now supports accessing domains other -than the system default domain by supplying a \var{domain} argument to -the \function{nis.match()} and \function{nis.maps()} functions. -(Contributed by Ben Bell.) - -\item The \module{operator} module's \function{itemgetter()} -and \function{attrgetter()} functions now support multiple fields. -A call such as \code{operator.attrgetter('a', 'b')} -will return a function -that retrieves the \member{a} and \member{b} attributes. Combining -this new feature with the \method{sort()} method's \code{key} parameter -lets you easily sort lists using multiple fields. -(Contributed by Raymond Hettinger.) - -\item The \module{optparse} module was updated to version 1.5.1 of the -Optik library. The \class{OptionParser} class gained an -\member{epilog} attribute, a string that will be printed after the -help message, and a \method{destroy()} method to break reference -cycles created by the object. (Contributed by Greg Ward.) 
- -\item The \module{os} module underwent several changes. The -\member{stat_float_times} variable now defaults to true, meaning that -\function{os.stat()} will now return time values as floats. (This -doesn't necessarily mean that \function{os.stat()} will return times -that are precise to fractions of a second; not all systems support -such precision.) - -Constants named \member{os.SEEK_SET}, \member{os.SEEK_CUR}, and -\member{os.SEEK_END} have been added; these are the parameters to the -\function{os.lseek()} function. Two new constants for locking are -\member{os.O_SHLOCK} and \member{os.O_EXLOCK}. - -Two new functions, \function{wait3()} and \function{wait4()}, were -added. They're similar to the \function{waitpid()} function, which waits -for a child process to exit and returns a tuple of the process ID and -its exit status, but \function{wait3()} and \function{wait4()} return -additional information. \function{wait3()} doesn't take a process ID -as input, so it waits for any child process to exit and returns a -3-tuple of \var{process-id}, \var{exit-status}, \var{resource-usage} -as returned from the \function{resource.getrusage()} function. -\function{wait4(\var{pid})} does take a process ID. -(Contributed by Chad J. Schroeder.) - -On FreeBSD, the \function{os.stat()} function now returns -times with nanosecond resolution, and the returned object -now has \member{st_gen} and \member{st_birthtime}. -The \member{st_flags} member is also available, if the platform supports it. -(Contributed by Antti Louko and Diego Petten\`o.) -% (Patch 1180695, 1212117) - -\item The Python debugger provided by the \module{pdb} module -can now store lists of commands to execute when a breakpoint is -reached and execution stops. Once breakpoint \#1 has been created, -enter \samp{commands 1} and enter a series of commands to be executed, -finishing the list with \samp{end}. The command list can include -commands that resume execution, such as \samp{continue} or -\samp{next}. 
(Contributed by Gr\'egoire Dooms.) -% Patch 790710 - -\item The \module{pickle} and \module{cPickle} modules no -longer accept a return value of \code{None} from the -\method{__reduce__()} method; the method must return a tuple of -arguments instead. The ability to return \code{None} was deprecated -in Python 2.4, so this completes the removal of the feature. - -\item The \module{pkgutil} module, containing various utility -functions for finding packages, was enhanced to support PEP 302's -import hooks and now also works for packages stored in ZIP-format archives. -(Contributed by Phillip J. Eby.) - -\item The pybench benchmark suite by Marc-Andr\'e~Lemburg is now -included in the \file{Tools/pybench} directory. The pybench suite is -an improvement on the commonly used \file{pystone.py} program because -pybench provides a more detailed measurement of the interpreter's -speed. It times particular operations such as function calls, -tuple slicing, method lookups, and numeric operations, instead of -performing many different operations and reducing the result to a -single number as \file{pystone.py} does. - -\item The \module{pyexpat} module now uses version 2.0 of the Expat parser. -(Contributed by Trent Mick.) - -\item The \class{Queue} class provided by the \module{Queue} module -gained two new methods. \method{join()} blocks until all items in -the queue have been retrieved and all processing work on the items -has been completed. Worker threads call the other new method, -\method{task_done()}, to signal that processing for an item has been -completed. (Contributed by Raymond Hettinger.) - -\item The old \module{regex} and \module{regsub} modules, which had been -deprecated ever since Python 2.0, have finally been deleted. -Other deleted modules: \module{statcache}, \module{tzparse}, -\module{whrandom}. - -\item The \file{lib-old} directory, -which included ancient modules such as \module{dircmp} and -\module{ni}, was also removed. 
\file{lib-old} wasn't on the default -\code{sys.path}, so unless your programs explicitly added the directory to -\code{sys.path}, this removal shouldn't affect your code. - -\item The \module{rlcompleter} module is no longer -dependent on importing the \module{readline} module and -therefore now works on non-{\UNIX} platforms. -(Patch from Robert Kiendl.) -% Patch #1472854 - -\item The \module{SimpleXMLRPCServer} and \module{DocXMLRPCServer} -classes now have a \member{rpc_paths} attribute that constrains -XML-RPC operations to a limited set of URL paths; the default is -to allow only \code{'/'} and \code{'/RPC2'}. Setting -\member{rpc_paths} to \code{None} or an empty tuple disables -this path checking. -% Bug #1473048 - -\item The \module{socket} module now supports \constant{AF_NETLINK} -sockets on Linux, thanks to a patch from Philippe Biondi. -Netlink sockets are a Linux-specific mechanism for communications -between a user-space process and kernel code; an introductory -article about them is at \url{http://www.linuxjournal.com/article/7356}. -In Python code, netlink addresses are represented as a tuple of 2 integers, -\code{(\var{pid}, \var{group_mask})}. - -Two new methods on socket objects, \method{recv_into(\var{buffer})} and -\method{recvfrom_into(\var{buffer})}, store the received data in an object -that supports the buffer protocol instead of returning the data as a -string. This means you can put the data directly into an array or a -memory-mapped file. - -Socket objects also gained \method{getfamily()}, \method{gettype()}, -and \method{getproto()} accessor methods to retrieve the family, type, -and protocol values for the socket. - -\item New module: the \module{spwd} module provides functions for -accessing the shadow password database on systems that support -shadow passwords. - -\item The \module{struct} module is now faster because it -compiles format strings into \class{Struct} objects -with \method{pack()} and \method{unpack()} methods. 
This is similar -to how the \module{re} module lets you create compiled regular -expression objects. You can still use the module-level -\function{pack()} and \function{unpack()} functions; they'll create -\class{Struct} objects and cache them. Or you can use -\class{Struct} instances directly: - -\begin{verbatim} -s = struct.Struct('ih3s') - -data = s.pack(1972, 187, 'abc') -year, number, name = s.unpack(data) -\end{verbatim} - -You can also pack and unpack data to and from buffer objects directly -using the \method{pack_into(\var{buffer}, \var{offset}, \var{v1}, -\var{v2}, ...)} and \method{unpack_from(\var{buffer}, \var{offset})} -methods. This lets you store data directly into an array or a -memory-mapped file. - -(\class{Struct} objects were implemented by Bob Ippolito at the -NeedForSpeed sprint. Support for buffer objects was added by Martin -Blais, also at the NeedForSpeed sprint.) - -\item The Python developers switched from CVS to Subversion during the 2.5 -development process. Information about the exact build version is -available as the \code{sys.subversion} variable, a 3-tuple of -\code{(\var{interpreter-name}, \var{branch-name}, -\var{revision-range})}. For example, at the time of writing my copy -of 2.5 was reporting \code{('CPython', 'trunk', '45313:45315')}. - -This information is also available to C extensions via the -\cfunction{Py_GetBuildInfo()} function that returns a -string of build information like this: -\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}. -(Contributed by Barry Warsaw.) - -\item Another new function, \function{sys._current_frames()}, returns -the current stack frames for all running threads as a dictionary -mapping thread identifiers to the topmost stack frame currently active -in that thread at the time the function is called. (Contributed by -Tim Peters.) 
- -\item The \class{TarFile} class in the \module{tarfile} module now has -an \method{extractall()} method that extracts all members from the -archive into the current working directory. It's also possible to set -a different directory as the extraction target, and to unpack only a -subset of the archive's members. - -The compression used for a tarfile opened in stream mode can now be -autodetected using the mode \code{'r|*'}. -% patch 918101 -(Contributed by Lars Gust\"abel.) - -\item The \module{threading} module now lets you set the stack size -used when new threads are created. The -\function{stack_size(\optional{\var{size}})} function returns the -currently configured stack size, and supplying the optional \var{size} -parameter sets a new value. Not all platforms support changing the -stack size, but Windows, POSIX threading, and OS/2 all do. -(Contributed by Andrew MacIntyre.) -% Patch 1454481 - -\item The \module{unicodedata} module has been updated to use version 4.1.0 -of the Unicode character database. Version 3.2.0 is required -by some specifications, so it's still available as -\member{unicodedata.ucd_3_2_0}. - -\item New module: the \module{uuid} module generates -universally unique identifiers (UUIDs) according to \rfc{4122}. The -RFC defines several different UUID versions that are generated from a -starting string, from system properties, or purely randomly. This -module contains a \class{UUID} class and -functions named \function{uuid1()}, -\function{uuid3()}, \function{uuid4()}, and -\function{uuid5()} to generate different versions of UUID. (Version 2 UUIDs -are not specified in \rfc{4122} and are not supported by this module.) 
- -\begin{verbatim} ->>> import uuid ->>> # make a UUID based on the host ID and current time ->>> uuid.uuid1() -UUID('a8098c1a-f86e-11da-bd1a-00112444be1e') - ->>> # make a UUID using an MD5 hash of a namespace UUID and a name ->>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org') -UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e') - ->>> # make a random UUID ->>> uuid.uuid4() -UUID('16fd2706-8baf-433b-82eb-8c7fada847da') - ->>> # make a UUID using a SHA-1 hash of a namespace UUID and a name ->>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org') -UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d') -\end{verbatim} - -(Contributed by Ka-Ping Yee.) - -\item The \module{weakref} module's \class{WeakKeyDictionary} and -\class{WeakValueDictionary} types gained new methods for iterating -over the weak references contained in the dictionary. -\method{iterkeyrefs()} and \method{keyrefs()} methods were -added to \class{WeakKeyDictionary}, and -\method{itervaluerefs()} and \method{valuerefs()} were added to -\class{WeakValueDictionary}. (Contributed by Fred L.~Drake, Jr.) - -\item The \module{webbrowser} module received a number of -enhancements. -It's now usable as a script with \code{python -m webbrowser}, taking a -URL as the argument; there are a number of switches -to control the behaviour (\programopt{-n} for a new browser window, -\programopt{-t} for a new tab). New module-level functions, -\function{open_new()} and \function{open_new_tab()}, were added -to support this. The module's \function{open()} function supports an -additional feature, an \var{autoraise} parameter that signals whether -to raise the open window when possible. A number of additional -browsers were added to the supported list such as Firefox, Opera, -Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg -Brandl.) -% Patch #754022 - -\item The \module{xmlrpclib} module now supports returning - \class{datetime} objects for the XML-RPC date type. 
Supply - \code{use_datetime=True} to the \function{loads()} function - or the \class{Unmarshaller} class to enable this feature. - (Contributed by Skip Montanaro.) -% Patch 1120353 - -\item The \module{zipfile} module now supports the ZIP64 version of the -format, meaning that a .zip archive can now be larger than 4~GiB and -can contain individual files larger than 4~GiB. (Contributed by -Ronald Oussoren.) -% Patch 1446489 - -\item The \module{zlib} module's \class{Compress} and \class{Decompress} -objects now support a \method{copy()} method that makes a copy of the -object's internal state and returns a new -\class{Compress} or \class{Decompress} object. -(Contributed by Chris AtLee.) -% Patch 1435422 - -\end{itemize} - - - -%====================================================================== -\subsection{The ctypes package\label{module-ctypes}} - -The \module{ctypes} package, written by Thomas Heller, has been added -to the standard library. \module{ctypes} lets you call arbitrary functions -in shared libraries or DLLs. Long-time users may remember the \module{dl} module, which -provides functions for loading shared libraries and calling functions in them. The \module{ctypes} package is much fancier. - -To load a shared library or DLL, you must create an instance of the -\class{CDLL} class and provide the name or path of the shared library -or DLL. Once that's done, you can call arbitrary functions -by accessing them as attributes of the \class{CDLL} object. - -\begin{verbatim} -import ctypes - -libc = ctypes.CDLL('libc.so.6') -result = libc.printf("Line of output\n") -\end{verbatim} - -Type constructors for the various C types are provided: \function{c_int}, -\function{c_float}, \function{c_double}, \function{c_char_p} (equivalent to \ctype{char *}), and so forth. Unlike Python's types, the C versions are all mutable; you can assign to their \member{value} attribute -to change the wrapped value. 
Python integers and strings will be automatically -converted to the corresponding C types, but for other types you -must call the correct type constructor. (And I mean \emph{must}; -getting it wrong will often result in the interpreter crashing -with a segmentation fault.) - -You shouldn't use \function{c_char_p} with a Python string when the C function will be modifying the memory area, because Python strings are -supposed to be immutable; breaking this rule will cause puzzling bugs. When you need a modifiable memory area, -use \function{create_string_buffer()}: - -\begin{verbatim} -s = "this is a string" -buf = ctypes.create_string_buffer(s) -libc.strfry(buf) -\end{verbatim} - -C functions are assumed to return integers, but you can set -the \member{restype} attribute of the function object to -change this: - -\begin{verbatim} ->>> libc.atof('2.71828') --1783957616 ->>> libc.atof.restype = ctypes.c_double ->>> libc.atof('2.71828') -2.71828 -\end{verbatim} - -\module{ctypes} also provides a wrapper for Python's C API -as the \code{ctypes.pythonapi} object. This object does \emph{not} -release the global interpreter lock before calling a function, because the lock must be held when calling into the interpreter's code. -There's a \class{py_object()} type constructor that will create a -\ctype{PyObject *} pointer. A simple usage: - -\begin{verbatim} -import ctypes - -d = {} -ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d), - ctypes.py_object("abc"), ctypes.py_object(1)) -# d is now {'abc': 1}. -\end{verbatim} - -Don't forget to use \class{py_object()}; if it's omitted you end -up with a segmentation fault. - -\module{ctypes} has been around for a while, but people still write -and distribute hand-coded extension modules because you can't rely on \module{ctypes} being present. 
-Perhaps developers will begin to write -Python wrappers atop a library accessed through \module{ctypes} instead -of extension modules, now that \module{ctypes} is included with core Python. - -\begin{seealso} - -\seeurl{http://starship.python.net/crew/theller/ctypes/} -{The ctypes web page, with a tutorial, reference, and FAQ.} - -\seeurl{../lib/module-ctypes.html}{The documentation -for the \module{ctypes} module.} - -\end{seealso} - - -%====================================================================== -\subsection{The ElementTree package\label{module-etree}} - -A subset of Fredrik Lundh's ElementTree library for processing XML has -been added to the standard library as \module{xml.etree}. The -available modules are -\module{ElementTree}, \module{ElementPath}, and -\module{ElementInclude} from ElementTree 1.2.6. -The \module{cElementTree} accelerator module is also included. - -The rest of this section will provide a brief overview of using -ElementTree. Full documentation for ElementTree is available at -\url{http://effbot.org/zone/element-index.htm}. - -ElementTree represents an XML document as a tree of element nodes. -The text content of the document is stored as the \member{.text} -and \member{.tail} attributes of the element nodes. -(This is one of the major differences between ElementTree and -the Document Object Model; in the DOM there are many different -types of node, including \class{TextNode}.) - -The most commonly used parsing function is \function{parse()}, which -takes either a string (assumed to contain a filename) or a file-like -object and returns an \class{ElementTree} instance: - -\begin{verbatim} -import urllib -from xml.etree import ElementTree as ET - -tree = ET.parse('ex-1.xml') - -feed = urllib.urlopen( - 'http://planet.python.org/rss10.xml') -tree = ET.parse(feed) -\end{verbatim} - -Once you have an \class{ElementTree} instance, you -can call its \method{getroot()} method to get the root \class{Element} node. 
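As a minimal sketch of how the root node, its children, and the \member{.text} and \member{.tail} attributes fit together, the snippet below builds a tree from a small in-memory document. The sample XML and the variable names are invented for illustration; only \function{XML()}, \class{ElementTree}, and \method{getroot()} come from the package itself.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample document, made up for this illustration.
root = ET.XML(
    "<feed><title>Planet</title>"
    "<item>first</item>tail text<item>second</item></feed>")

# Wrap the Element in an ElementTree, as parse() would return.
tree = ET.ElementTree(root)

# getroot() hands back the same root Element.
top = tree.getroot()

# Iterating over an element yields its child elements.
tags = [child.tag for child in top]
texts = [child.text for child in top]

# Text that follows an element's end tag is stored in its .tail.
first_item_tail = top[1].tail
```

Here \code{tags} lists the three child tags in document order, and the text that follows the first \code{<item>} element is found in that element's \member{.tail} rather than in a separate text node.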
- -There's also an \function{XML()} function that takes a string literal -and returns an \class{Element} node (not an \class{ElementTree}). -This function provides a tidy way to incorporate XML fragments, -approaching the convenience of an XML literal: - -\begin{verbatim} -svg = ET.XML("""<svg width="10px" version="1.0"> - </svg>""") -svg.set('height', '320px') -svg.append(elem1) -\end{verbatim} - -Each XML element supports some dictionary-like and some list-like -access methods. Dictionary-like operations are used to access attribute -values, and list-like operations are used to access child nodes. - -\begin{tableii}{c|l}{code}{Operation}{Result} - \lineii{elem[n]}{Returns n'th child element.} - \lineii{elem[m:n]}{Returns list of child elements from the m'th up to, but not including, the n'th.} - \lineii{len(elem)}{Returns number of child elements.} - \lineii{list(elem)}{Returns list of child elements.} - \lineii{elem.append(elem2)}{Adds \var{elem2} as a child.} - \lineii{elem.insert(index, elem2)}{Inserts \var{elem2} at the specified location.} - \lineii{del elem[n]}{Deletes n'th child element.} - \lineii{elem.keys()}{Returns list of attribute names.} - \lineii{elem.get(name)}{Returns value of attribute \var{name}.} - \lineii{elem.set(name, value)}{Sets new value for attribute \var{name}.} - \lineii{elem.attrib}{Retrieves the dictionary containing attributes.} - \lineii{del elem.attrib[name]}{Deletes attribute \var{name}.} -\end{tableii} - -Comments and processing instructions are also represented as -\class{Element} nodes. To check if a node is a comment or processing -instruction: - -\begin{verbatim} -if elem.tag is ET.Comment: - ... -elif elem.tag is ET.ProcessingInstruction: - ... -\end{verbatim} - -To generate XML output, you should call the -\method{ElementTree.write()} method. 
Like \function{parse()}, -it can take either a string or a file-like object: - -\begin{verbatim} -# Encoding is US-ASCII -tree.write('output.xml') - -# Encoding is UTF-8 -f = open('output.xml', 'w') -tree.write(f, encoding='utf-8') -\end{verbatim} - -(Caution: the default encoding used for output is ASCII. For general -XML work, where an element's name may contain arbitrary Unicode -characters, ASCII isn't a very useful encoding because it will raise -an exception if an element's name contains any characters with values -greater than 127. Therefore, it's best to specify a different -encoding such as UTF-8 that can handle any Unicode character.) - -This section is only a partial description of the ElementTree interfaces. -Please read the package's official documentation for more details. - -\begin{seealso} - -\seeurl{http://effbot.org/zone/element-index.htm} -{Official documentation for ElementTree.} - -\end{seealso} - - -%====================================================================== -\subsection{The hashlib package\label{module-hashlib}} - -A new \module{hashlib} module, written by Gregory P. Smith, -has been added to replace the -\module{md5} and \module{sha} modules. \module{hashlib} adds support -for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). -When available, the module uses OpenSSL for fast platform optimized -implementations of algorithms. - -The old \module{md5} and \module{sha} modules still exist as wrappers -around hashlib to preserve backwards compatibility. The new module's -interface is very close to that of the old modules, but not identical. -The most significant difference is that the constructor functions -for creating new hashing objects are named differently. 
- -\begin{verbatim} -# Old versions -h = md5.md5() -h = md5.new() - -# New version -h = hashlib.md5() - -# Old versions -h = sha.sha() -h = sha.new() - -# New version -h = hashlib.sha1() - -# Hashes that weren't previously available -h = hashlib.sha224() -h = hashlib.sha256() -h = hashlib.sha384() -h = hashlib.sha512() - -# Alternative form -h = hashlib.new('md5') # Provide algorithm as a string -\end{verbatim} - -Once a hash object has been created, its methods are the same as before: -\method{update(\var{string})} hashes the specified string into the -current digest state, \method{digest()} and \method{hexdigest()} -return the digest value as a binary string or a string of hex digits, -and \method{copy()} returns a new hashing object with the same digest state. - -\begin{seealso} - -\seeurl{../lib/module-hashlib.html}{The documentation -for the \module{hashlib} module.} - -\end{seealso} - - -%====================================================================== -\subsection{The sqlite3 package\label{module-sqlite}} - -The pysqlite module (\url{http://www.pysqlite.org}), a wrapper for the -SQLite embedded database, has been added to the standard library under -the package name \module{sqlite3}. - -SQLite is a C library that provides a lightweight disk-based database -that doesn't require a separate server process and allows accessing -the database using a nonstandard variant of the SQL query language. -Some applications can use SQLite for internal data storage. It's also -possible to prototype an application using SQLite and then port the -code to a larger database such as PostgreSQL or Oracle. - -pysqlite was written by Gerhard H\"aring and provides a SQL interface -compliant with the DB-API 2.0 specification described by -\pep{249}. - -If you're compiling the Python source yourself, note that the source -tree doesn't include the SQLite code, only the wrapper module. 
-You'll need to have the SQLite libraries and headers installed before -compiling Python, and the build process will compile the module when -the necessary headers are available. - -To use the module, you must first create a \class{Connection} object -that represents the database. Here the data will be stored in the -\file{/tmp/example} file: - -\begin{verbatim} -conn = sqlite3.connect('/tmp/example') -\end{verbatim} - -You can also supply the special name \samp{:memory:} to create -a database in RAM. - -Once you have a \class{Connection}, you can create a \class{Cursor} -object and call its \method{execute()} method to perform SQL commands: - -\begin{verbatim} -c = conn.cursor() - -# Create table -c.execute('''create table stocks -(date text, trans text, symbol text, - qty real, price real)''') - -# Insert a row of data -c.execute("""insert into stocks - values ('2006-01-05','BUY','RHAT',100,35.14)""") -\end{verbatim} - -Usually your SQL operations will need to use values from Python -variables. You shouldn't assemble your query using Python's string -operations because doing so is insecure; it makes your program -vulnerable to an SQL injection attack. - -Instead, use the DB-API's parameter substitution. Put \samp{?} as a -placeholder wherever you want to use a value, and then provide a tuple -of values as the second argument to the cursor's \method{execute()} -method. (Other database modules may use a different placeholder, -such as \samp{\%s} or \samp{:1}.) For example: - -\begin{verbatim} -# Never do this -- insecure! -symbol = 'IBM' -c.execute("... 
where symbol = '%s'" % symbol) - -# Do this instead -t = (symbol,) -c.execute('select * from stocks where symbol=?', t) - -# Larger example -for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00), - ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00), - ('2006-04-06', 'SELL', 'IBM', 500, 53.00), - ): - c.execute('insert into stocks values (?,?,?,?,?)', t) -\end{verbatim} - -To retrieve data after executing a SELECT statement, you can either -treat the cursor as an iterator, call the cursor's \method{fetchone()} -method to retrieve a single matching row, -or call \method{fetchall()} to get a list of the matching rows. - -This example uses the iterator form: - -\begin{verbatim} ->>> c = conn.cursor() ->>> c.execute('select * from stocks order by price') ->>> for row in c: -... print row -... -(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001) -(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0) -(u'2006-04-06', u'SELL', u'IBM', 500, 53.0) -(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0) ->>> -\end{verbatim} - -For more information about the SQL dialect supported by SQLite, see -\url{http://www.sqlite.org}. - -\begin{seealso} - -\seeurl{http://www.pysqlite.org} -{The pysqlite web page.} - -\seeurl{http://www.sqlite.org} -{The SQLite web page; the documentation describes the syntax and the -available data types for the supported SQL dialect.} - -\seeurl{../lib/module-sqlite3.html}{The documentation -for the \module{sqlite3} module.} - -\seepep{249}{Database API Specification 2.0}{PEP written by -Marc-Andr\'e Lemburg.} - -\end{seealso} - - -%====================================================================== -\subsection{The wsgiref package\label{module-wsgiref}} - -% XXX should this be in a PEP 333 section instead? - -The Web Server Gateway Interface (WSGI) v1.0 defines a standard -interface between web servers and Python web applications and is -described in \pep{333}. The \module{wsgiref} package is a reference -implementation of the WSGI specification. 
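As a rough illustration of the interface that \pep{333} describes, a WSGI application is a callable that receives an environment dictionary and a \code{start_response} callable and returns an iterable of body strings. The sketch below is not taken from \module{wsgiref}; the names \code{hello_app} and \code{fake_start_response} are invented here, and the application is driven by hand rather than by a real server.

```python
def hello_app(environ, start_response):
    """A tiny WSGI application (hypothetical example)."""
    status = '200 OK'
    headers = [('Content-Type', 'text/plain')]
    # Hand the status line and headers back to the server.
    start_response(status, headers)
    body = 'Hello from %s\n' % environ.get('PATH_INFO', '/')
    # Return an iterable of byte strings; encode() keeps the
    # body a byte string regardless of the string type in use.
    return [body.encode('ascii')]


# Drive the application by hand with a stand-in for the
# server's start_response callable (an assumption of this sketch).
captured = {}

def fake_start_response(status, headers):
    captured['status'] = status
    captured['headers'] = headers

result = hello_app({'PATH_INFO': '/demo'}, fake_start_response)
```

A callable of this shape is what a WSGI server expects to be handed as its application object.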
- -The package includes a basic HTTP server that will run a WSGI -application; this server is useful for debugging but isn't intended for -production use. Setting up a server takes only a few lines of code: - -\begin{verbatim} -from wsgiref import simple_server - -wsgi_app = ... - -host = '' -port = 8000 -httpd = simple_server.make_server(host, port, wsgi_app) -httpd.serve_forever() -\end{verbatim} - -% XXX discuss structure of WSGI applications? -% XXX provide an example using Django or some other framework? - -\begin{seealso} - -\seeurl{http://www.wsgi.org}{A central web site for WSGI-related resources.} - -\seepep{333}{Python Web Server Gateway Interface v1.0}{PEP written by -Phillip J. Eby.} - -\end{seealso} - - -% ====================================================================== -\section{Build and C API Changes\label{build-api}} - -Changes to Python's build process and to the C API include: - -\begin{itemize} - -\item The Python source tree was converted from CVS to Subversion, -in a complex migration procedure that was supervised and flawlessly -carried out by Martin von~L\"owis. The procedure was developed as -\pep{347}. - -\item Coverity, a company that markets a source code analysis tool -called Prevent, provided the results of their examination of the Python -source code. The analysis found about 60 bugs that -were quickly fixed. Many of the bugs were refcounting problems, often -occurring in error-handling code. See -\url{http://scan.coverity.com} for the statistics. - -\item The largest change to the C API came from \pep{353}, -which modifies the interpreter to use a \ctype{Py_ssize_t} type -definition instead of \ctype{int}. See the earlier -section~\ref{pep-353} for a discussion of this change. - -\item The design of the bytecode compiler has changed a great deal, -no longer generating bytecode by traversing the parse tree. 
Instead -the parse tree is converted to an abstract syntax tree (or AST), and it is -the abstract syntax tree that's traversed to produce the bytecode. - -It's possible for Python code to obtain AST objects by using the -\function{compile()} built-in and specifying \code{_ast.PyCF_ONLY_AST} -as the value of the -\var{flags} parameter: - -\begin{verbatim} -from _ast import PyCF_ONLY_AST -ast = compile("""a=0 -for i in range(10): - a += i -""", "<string>", 'exec', PyCF_ONLY_AST) - -assignment = ast.body[0] -for_loop = ast.body[1] -\end{verbatim} - -No official documentation has been written for the AST code yet, but -\pep{339} discusses the design. To start learning about the code, read the -definition of the various AST nodes in \file{Parser/Python.asdl}. A -Python script reads this file and generates a set of C structure -definitions in \file{Include/Python-ast.h}. The -\cfunction{PyParser_ASTFromString()} and -\cfunction{PyParser_ASTFromFile()} functions, defined in -\file{Include/pythonrun.h}, take Python source as input and return the -root of an AST representing the contents. This AST can then be turned -into a code object by \cfunction{PyAST_Compile()}. For more -information, read the source code, and then ask questions on -python-dev. - -% List of names taken from Jeremy's python-dev post at -% http://mail.python.org/pipermail/python-dev/2005-October/057500.html -The AST code was developed under Jeremy Hylton's management, and -implemented by (in alphabetical order) Brett Cannon, Nick Coghlan, -Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, -Armin Rigo, and Neil Schemenauer, plus the participants in a number of -AST sprints at conferences such as PyCon. - -\item Evan Jones's patch to obmalloc, first described in a talk -at PyCon DC 2005, was applied. Python 2.4 allocated small objects in -256K-sized arenas, but never freed arenas. With this patch, Python -will free arenas when they're empty. 
The net effect is that on some -platforms, when you allocate many objects, Python's memory usage may -actually drop when you delete them and the memory may be returned to -the operating system. (Implemented by Evan Jones, and reworked by Tim -Peters.) - -Note that this change means extension modules must be more careful -when allocating memory. Python's API has many different -functions for allocating memory that are grouped into families. For -example, \cfunction{PyMem_Malloc()}, \cfunction{PyMem_Realloc()}, and -\cfunction{PyMem_Free()} are one family that allocates raw memory, -while \cfunction{PyObject_Malloc()}, \cfunction{PyObject_Realloc()}, -and \cfunction{PyObject_Free()} are another family that's supposed to -be used for creating Python objects. - -Previously these different families all reduced to the platform's -\cfunction{malloc()} and \cfunction{free()} functions. This meant -it didn't matter if you got things wrong and allocated memory with the -\cfunction{PyMem} function but freed it with the \cfunction{PyObject} -function. With 2.5's changes to obmalloc, these families now do different -things and mismatches will probably result in a segfault. You should -carefully test your C extension modules with Python 2.5. - -\item The built-in set types now have an official C API. Call -\cfunction{PySet_New()} and \cfunction{PyFrozenSet_New()} to create a -new set, \cfunction{PySet_Add()} and \cfunction{PySet_Discard()} to -add and remove elements, and \cfunction{PySet_Contains} and -\cfunction{PySet_Size} to examine the set's state. -(Contributed by Raymond Hettinger.) - -\item C code can now obtain information about the exact revision -of the Python interpreter by calling the -\cfunction{Py_GetBuildInfo()} function that returns a -string of build information like this: -\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}. -(Contributed by Barry Warsaw.) 
- -\item Two new macros can be used to indicate C functions that are -local to the current file so that a faster calling convention can be -used. \cfunction{Py_LOCAL(\var{type})} declares the function as -returning a value of the specified \var{type} and uses a fast-calling -qualifier. \cfunction{Py_LOCAL_INLINE(\var{type})} does the same thing -and also requests the function be inlined. If -\cfunction{PY_LOCAL_AGGRESSIVE} is defined before \file{Python.h} is -included, a set of more aggressive optimizations is enabled for the -module; you should benchmark the results to find out if these -optimizations actually make the code faster. (Contributed by Fredrik -Lundh at the NeedForSpeed sprint.) - -\item \cfunction{PyErr_NewException(\var{name}, \var{base}, -\var{dict})} can now accept a tuple of base classes as its \var{base} -argument. (Contributed by Georg Brandl.) - -\item The \cfunction{PyErr_Warn()} function for issuing warnings -is now deprecated in favour of \cfunction{PyErr_WarnEx(category, -message, stacklevel)} which lets you specify the number of stack -frames separating this function and the caller. A \var{stacklevel} of -1 is the function calling \cfunction{PyErr_WarnEx()}, 2 is the -function above that, and so forth. (Added by Neal Norwitz.) - -\item The CPython interpreter is still written in C, but -the code can now be compiled with a {\Cpp} compiler without errors. -(Implemented by Anthony Baxter, Martin von~L\"owis, Skip Montanaro.) - -\item The \cfunction{PyRange_New()} function was removed. It was -never documented, never used in the core code, and had dangerously lax -error checking. 
In the unlikely case that your extensions were using -it, you can replace it by something like the following: -\begin{verbatim} -range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll", - start, stop, step); -\end{verbatim} - -\end{itemize} - - -%====================================================================== -\subsection{Port-Specific Changes\label{ports}} - -\begin{itemize} - -\item MacOS X (10.3 and higher): dynamic loading of modules -now uses the \cfunction{dlopen()} function instead of MacOS-specific -functions. - -\item MacOS X: a \longprogramopt{enable-universalsdk} switch was added -to the \program{configure} script that compiles the interpreter as a -universal binary able to run on both PowerPC and Intel processors. -(Contributed by Ronald Oussoren.) - -\item Windows: \file{.dll} is no longer supported as a filename extension for -extension modules. \file{.pyd} is now the only filename extension that will -be searched for. - -\end{itemize} - - -%====================================================================== -\section{Porting to Python 2.5\label{porting}} - -This section lists previously described changes that may require -changes to your code: - -\begin{itemize} - -\item ASCII is now the default encoding for modules. It's now -a syntax error if a module contains string literals with 8-bit -characters but doesn't have an encoding declaration. In Python 2.4 -this triggered a warning, not a syntax error. - -\item Previously, the \member{gi_frame} attribute of a generator -was always a frame object. Because of the \pep{342} changes -described in section~\ref{pep-342}, it's now possible -for \member{gi_frame} to be \code{None}. - -\item A new warning, \class{UnicodeWarning}, is triggered when -you attempt to compare a Unicode string and an 8-bit string that can't -be converted to Unicode using the default ASCII encoding. Previously -such comparisons would raise a \class{UnicodeDecodeError} exception. 
- -\item Library: the \module{csv} module is now stricter about multi-line quoted -fields. If your files contain newlines embedded within fields, the -input should be split into lines in a manner which preserves the -newline characters. - -\item Library: the \module{locale} module's -\function{format()} function would previously -accept any string as long as no more than one \%char specifier -appeared. In Python 2.5, the argument must be exactly one \%char -specifier with no surrounding text. - -\item Library: The \module{pickle} and \module{cPickle} modules no -longer accept a return value of \code{None} from the -\method{__reduce__()} method; the method must return a tuple of -arguments instead. The modules also no longer accept the deprecated -\var{bin} keyword parameter. - -\item Library: The \module{SimpleXMLRPCServer} and \module{DocXMLRPCServer} -classes now have an \member{rpc_paths} attribute that constrains -XML-RPC operations to a limited set of URL paths; the default is -to allow only \code{'/'} and \code{'/RPC2'}. Setting -\member{rpc_paths} to \code{None} or an empty tuple disables -this path checking. - -\item C API: Many functions now use \ctype{Py_ssize_t} -instead of \ctype{int} to allow processing more data on 64-bit -machines. Extension code may need to make the same change to avoid -warnings and to support 64-bit machines. See the earlier -section~\ref{pep-353} for a discussion of this change. - -\item C API: -The obmalloc changes mean that -you must be careful not to mix usage -of the \cfunction{PyMem_*()} and \cfunction{PyObject_*()} -families of functions. Memory allocated with -one family's \cfunction{*_Malloc()} must be -freed with the corresponding family's \cfunction{*_Free()} function.
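As a rough illustration of the \module{csv} item in the list above, the following sketch feeds the reader lines that keep their line endings, so a newline embedded in a quoted field survives parsing. The sample data and the variable names are made up for this example; only \function{csv.reader()} and the string method \method{splitlines()} come from the standard library.

```python
import csv

# Hypothetical sample input: the second record has a newline inside a
# quoted field.
data = 'id,comment\r\n1,"first line\nsecond line"\r\n'

# splitlines(True) keeps the line endings, so the reader sees the
# embedded newline as part of the quoted field.
lines = data.splitlines(True)
rows = list(csv.reader(lines))

for row in rows:
    print(row)
```

Passing an open file object to \function{csv.reader()} should behave the same way, since iterating over a file yields lines with their endings intact.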
- -\end{itemize} - - -%====================================================================== -\section{Acknowledgements \label{acks}} - -The author would like to thank the following people for offering -suggestions, corrections and assistance with various drafts of this -article: Georg Brandl, Nick Coghlan, Phillip J. Eby, Lars Gust\"abel, -Raymond Hettinger, Ralf W. Grosse-Kunstleve, Kent Johnson, Iain Lowe, -Martin von~L\"owis, Fredrik Lundh, Andrew McNamara, Skip Montanaro, -Gustavo Niemeyer, Paul Prescod, James Pryor, Mike Rovner, Scott -Weikart, Barry Warsaw, Thomas Wouters. - -\end{document}