summaryrefslogtreecommitdiffstats
path: root/Doc/whatsnew/whatsnew23.tex
blob: b4fce43d200d2c369d83f10657975132404c5ca3 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
\documentclass{howto}
% $Id$

\title{What's New in Python 2.3}
\release{0.01}
\author{A.M. Kuchling}
\authoraddress{\email{akuchlin@mems-exchange.org}}

\begin{document}
\maketitle
\tableofcontents

%\section{Introduction \label{intro}}

{\large This article is a draft, and is currently up to date for some
random version of the CVS tree around May 26 2002.  Please send any
additions, comments or errata to the author.}

This article explains the new features in Python 2.3.  The tentative
release date of Python 2.3 is currently scheduled for August 30 2002.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview.  For
full details, you should refer to the documentation for Python 2.3,
such as the
\citetitle[http://www.python.org/doc/2.3/lib/lib.html]{Python Library
Reference} and the
\citetitle[http://www.python.org/doc/2.3/ref/ref.html]{Python
Reference Manual}.  If you want to understand the complete
implementation and design rationale for a change, refer to the PEP for
a particular new feature.


%======================================================================
\section{PEP 255: Simple Generators}
\label{section-generators}

In Python 2.2, generators were added as an optional feature, to be
enabled by a \code{from __future__ import generators} directive.  In
2.3 generators no longer need to be specially enabled, and are now
always present; this means that \keyword{yield} is now always a
keyword.  The rest of this section is a copy of the description of
generators from the ``What's New in Python 2.2'' document; if you read
it when 2.2 came out, you can skip the rest of this section.

Generators are a new feature that interacts with the iterators
introduced in Python 2.2.

You're doubtless familiar with how function calls work in Python or
C.  When you call a function, it gets a private namespace where its local
variables are created.  When the function reaches a \keyword{return}
statement, the local variables are destroyed and the resulting value
is returned to the caller.  A later call to the same function will get
a fresh new set of local variables.  But, what if the local variables
weren't thrown away on exiting a function?  What if you could later
resume the function where it left off?  This is what generators
provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

\begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i
\end{verbatim}

A new keyword, \keyword{yield}, was introduced for generators.  Any
function containing a \keyword{yield} statement is a generator
function; this is detected by Python's bytecode compiler which
compiles the function specially as a result.  

When you call a generator function, it doesn't return a single value;
instead it returns a generator object that supports the iterator
protocol.  On executing the \keyword{yield} statement, the generator
outputs the value of \code{i}, similar to a \keyword{return}
statement.  The big difference between \keyword{yield} and a
\keyword{return} statement is that on reaching a \keyword{yield} the
generator's state of execution is suspended and local variables are
preserved.  On the next call to the generator's \code{.next()} method,
the function will resume executing immediately after the
\keyword{yield} statement.  (For complicated reasons, the
\keyword{yield} statement isn't allowed inside the \keyword{try} block
of a \code{try...finally} statement; read \pep{255} for a full
explanation of the interaction between \keyword{yield} and
exceptions.)

Here's a sample usage of the \function{generate_ints} generator:

\begin{verbatim}
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in generate_ints
StopIteration
\end{verbatim}

You could equally write \code{for i in generate_ints(5)}, or
\code{a,b,c = generate_ints(3)}.

Inside a generator function, the \keyword{return} statement can only
be used without a value, and signals the end of the procession of
values; afterwards the generator cannot return any further values.
\keyword{return} with a value, such as \code{return 5}, is a syntax
error inside a generator function.  The end of the generator's results
can also be indicated by raising \exception{StopIteration} manually,
or by just letting the flow of execution fall off the bottom of the
function.

You could achieve the effect of generators manually by writing your
own class and storing all the local variables of the generator as
instance variables.  For example, returning a list of integers could
be done by setting \code{self.count} to 0, and having the
\method{next()} method increment \code{self.count} and return it.
However, for a moderately complicated generator, writing a
corresponding class would be much messier.
\file{Lib/test/test_generators.py} contains a number of more
interesting examples.  The simplest one implements an in-order
traversal of a tree using generators recursively.

\begin{verbatim}
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
\end{verbatim}

Two other examples in \file{Lib/test/test_generators.py} produce
solutions for the N-Queens problem (placing $N$ queens on an $NxN$
chess board so that no queen threatens another) and the Knight's Tour
(a route that takes a knight to every square of an $NxN$ chessboard
without visiting any square twice). 

The idea of generators comes from other programming languages,
especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
idea of generators is central.  In Icon, every
expression and function call behaves like a generator.  One example
from ``An Overview of the Icon Programming Language'' at
\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
what this looks like:

\begin{verbatim}
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
\end{verbatim}

In Icon the \function{find()} function returns the indexes at which the
substring ``or'' is found: 3, 23, 33.  In the \keyword{if} statement,
\code{i} is first assigned a value of 3, but 3 is less than 5, so the
comparison fails, and Icon retries it with the second value of 23.  23
is greater than 5, so the comparison now succeeds, and the code prints
the value 23 to the screen.

Python doesn't go nearly as far as Icon in adopting generators as a
central concept.  Generators are considered a new part of the core
Python language, but learning or using them isn't compulsory; if they
don't solve any problems that you have, feel free to ignore them.
One novel feature of Python's interface as compared to
Icon's is that a generator's state is represented as a concrete object
(the iterator) that can be passed around to other functions or stored
in a data structure.

\begin{seealso}

\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
Peters, Magnus Lie Hetland.  Implemented mostly by Neil Schemenauer
and Tim Peters, with other fixes from the Python Labs crew.}

\end{seealso}


%======================================================================
\section{PEP 278: Universal Newline Support}

The three major operating systems used today are Microsoft Windows,
Apple's Macintosh OS, and the various Unix derivatives.  A minor
irritation is that these three platforms all use different characters
to mark the ends of lines in text files.  Unix uses character 10, the
ASCII linefeed, MacOS uses character 13, the ASCII carriage return,
and Windows uses a two-character sequence of carriage return plus a
newline.

Python's file objects can now support end of line conventions other
than the one followed by the platform on which Python is running.
Opening a file with the mode \samp{U} or \samp{rU} will open a file
for reading in universal newline mode.  All three line ending
conventions will be translated to a \samp{\e n} in the strings
returned by the various file methods such as \method{read()} and
\method{readline()}. 

Universal newline support is also used when importing modules and when
executing a file with the \function{execfile()} function.  This means
that Python modules can be shared between all three operating systems
without needing to convert the line-endings.

This feature can be disabled at compile-time by specifying the
\longprogramopt{without-universal-newlines} when running Python's
configure script.

\begin{seealso}

\seepep{278}{Universal Newline Support}{Written 
and implemented by Jack Jansen.}

\end{seealso}

%======================================================================
\section{PEP 285: The \class{bool} Type}
\label{section-bool}

A Boolean type was added to Python 2.3.  Two new constants were added
to the \module{__builtin__} module, \constant{True} and
\constant{False}.  The type object for this new type is named
\class{bool}; the constructor for it takes any Python value and
converts it to \constant{True} or \constant{False}.

\begin{verbatim}
>>> bool(1)
True
>>> bool(0)
False
>>> bool([])
False
>>> bool( (1,) )
True
\end{verbatim}

Most of the standard library modules and built-in functions have been
changed to return Booleans.

\begin{verbatim}
>>> o = []
>>> hasattr(o, 'append')
True
>>> isinstance(o, list)
True
>>> isinstance(o, tuple)
False
\end{verbatim}

Python's Booleans were added with the primary goal of making code
clearer.  For example, if you're reading a function and encounter the
statement \code{return 1}, you might wonder whether the \samp{1}
represents a truth value, or whether it's an index, or whether it's a
coefficient that multiplies some other quantity.  If the statement is
\code{return True}, however, the meaning of the return value is quite
clearly a truth value.

Python's Booleans were not added for the sake of strict type-checking.
A very strict language such as Pascal
% XXX is Pascal the right example here?
would also prevent you performing arithmetic with Booleans, and would
require that the expression in an \keyword{if} statement always
evaluate to a Boolean.  Python is not this strict, and it never will
be.  (\pep{285} explicitly says this.)  So you can still use any
expression in an \keyword{if}, even ones that evaluate to a list or
tuple or some random object, and the Boolean type is a subclass of the
\class{int} class, so arithmetic using a Boolean still works.

\begin{verbatim}
>>> True + 1
2
>>> False + 1
1
>>> False * 75
0
>>> True * 75
75
\end{verbatim}

To sum up \constant{True} and \constant{False} in a sentence: they're
alternative ways to spell the integer values 1 and 0, with the single
difference that \function{str()} and \function{repr()} return the
strings \samp{True} and \samp{False} instead of \samp{1} and \samp{0}.

\begin{seealso}

\seepep{285}{Adding a bool type}{Written and implemented by GvR.}

\end{seealso}


%======================================================================
\section{New and Improved Modules}

As usual, Python's standard modules had a number of enhancements and
bug fixes.  Here's a partial list; consult the \file{Misc/NEWS} file
in the source tree, or the CVS logs, for a more complete list.

\begin{itemize}

\item One minor but far-reaching change is that the names of extension
types defined by the modules included with Python now contain the
module and a \samp{.} in front of the type name.  For example, in
Python 2.2, if you created a socket and printed its
\member{__class__}, you'd get this output:

\begin{verbatim}
>>> s = socket.socket()
>>> s.__class__
<type 'socket'>
\end{verbatim}

In 2.3, you get this:
\begin{verbatim}
>>> s.__class__
<type '_socket.socket'>
\end{verbatim}

\item The \method{strip()}, \method{lstrip()}, and \method{rstrip()}
string methods now have an optional argument for specifying the
characters to strip.  The default is still to remove all whitespace
characters:

\begin{verbatim}
>>> '   abc '.strip()
'abc'
>>> '><><abc<><><>'.strip('<>')
'abc'
>>> '><><abc<><><>\n'.strip('<>')
'abc<><><>\n'
>>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
u'\u4001abc'
>>>
\end{verbatim}

\item Another new string method is \method{zfill()}, originally a
function in the \module{string} module.  \method{zfill()} pads a
numeric string with zeros on the left until it's the specified width.
Note that the \code{\%} operator is still more flexible and powerful
than \method{zfill()}.

\begin{verbatim}
>>> '45'.zfill(4)
'0045'
>>> '12345'.zfill(4)
'12345'
>>> 'goofy'.zfill(4)
'0goofy'
\end{verbatim}

\item Dictionaries have a new method, method{pop(\var{key})}, that
returns the value corresponding to \var{key} and removes that
key/value pair from the dictionary.  \method{pop()} will raise a
\exception{KeyError} if the requsted key isn't present in the
dictionary:

\begin{verbatim}
>>> d = {1:2}
>>> d
{1: 2}
>>> d.pop(4)
Traceback (most recent call last):
  File ``<stdin>'', line 1, in ?
KeyError: 4
>>> d.pop(1)
2
>>> d.pop(1)
Traceback (most recent call last):
  File ``<stdin>'', line 1, in ?
KeyError: pop(): dictionary is empty
>>> d
{}
>>>
\end{verbatim}

distutils: command/bdist_packager, support for Solaris pkgtool 
and HP-UX swinstall


\item Two new functions, \function{killpg()} and \function{mknod()},
were added to the \module{posix} module that underlies the \module{os}
module.

\item (XXX write this) arraymodule.c: - add Py_UNICODE arrays 
- support +=, *=

\item The \module{grp} module now returns enhanced tuples:

\begin{verbatim}
>>> import grp
>>> g = grp.getgrnam('amk')
>>> g.gr_name, g.gr_gid
('amk', 500)
\end{verbatim}

\item The \module{readline} module also gained a number of new
functions: \function{get_history_item()},
\function{get_current_history_length()}, and \function{redisplay()}.

\end{itemize}


New enumerate() built-in.

%======================================================================
\section{Interpreter Changes and Fixes}

Here are the changes that Python 2.3 makes to the core language.

\begin{itemize}
\item The \keyword{yield} statement is now always a keyword, as
described in section~\ref{section-generators}.

\item Two new constants, \constant{True} and \constant{False} were
added along with the built-in \class{bool} type, as described in
section~\ref{section-bool}.

\item The \class{file} type can now be subtyped.  (XXX did this not work
before?  Thought I used it in an example in the 2.2 What's New document...)

\item File objects also manage their internal string buffer
differently by increasing it exponentially when needed.  
This results in the benchmark tests in \file{Lib/test/test_bufio.py} 
speeding up from 57 seconds to 1.7 seconds, according to one
measurement.

\end{itemize}


%======================================================================
\section{Other Changes and Fixes}

XXX write this

The tools used to build the documentation now work under Cygwin as
well as \UNIX.


% ======================================================================
\section{Build and C API Changes}

XXX write this

\begin{itemize}

\item Patch \#527027: Allow building python as shared library with
--enable-shared

pymalloc is now enabled by default (also mention debug-mode pymalloc)

Memory API reworking -- which functions are deprecated?  

PyObject_DelItemString() added

PyArg_NoArgs macro is now deprecated

\item The source code for the Expat XML parser is now included with
the Python source, so the \module{pyexpat} module is no longer
dependent on having a system library containing Expat.

===
Introduce two new flag bits that can be set in a PyMethodDef method
descriptor, as used for the tp_methods slot of a type.  These new flag
bits are both optional, and mutually exclusive.  Most methods will not
use either.  These flags are used to create special method types which
exist in the same namespace as normal methods without having to use
tedious construction code to insert the new special method objects in
the type's tp_dict after PyType_Ready() has been called.

If METH_CLASS is specified, the method will represent a class method
like that returned by the classmethod() built-in.

If METH_STATIC is specified, the method will represent a static method
like that returned by the staticmethod() built-in.

These flags may not be used in the PyMethodDef table for modules since
these special method types are not meaningful in that case; a
ValueError will be raised if these flags are found in that context.
===

\end{itemize}

\subsection{Port-Specific Changes}

XXX write this

OS/2 EMX port

MacOS: Weaklink most toolbox modules, improving backward
compatibility. Modules will no longer fail to load if a single routine
is missing on the curent OS version, in stead calling the missing
routine will raise an exception.  Should finally fix 531398. 2.2.1
candidate.  Also blacklisted some constants with definitions that
were not Python-compatible.

Checked in Sean Reifschneider's RPM spec file and patches.


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Fred~L. Drake, Jr.

\end{document}