From 6938f06047a6d2170523cfc3ab8e797bae0a6c05 Mon Sep 17 00:00:00 2001 From: Guido van Rossum Date: Mon, 1 Aug 1994 12:22:53 +0000 Subject: Merge alpha100 branch back to main trunk --- Doc/.cvsignore | 1 + Doc/Makefile | 46 ++- Doc/README | 61 ++-- Doc/ext.tex | 804 +++++++++++++++++++++++++++++---------------------- Doc/ext/ext.tex | 804 +++++++++++++++++++++++++++++---------------------- Doc/fix.el | 5 +- Doc/fix_hack | 1 + Doc/info/texipre.dat | 6 +- Doc/lib.tex | 89 +++++- Doc/lib/lib.tex | 89 +++++- Doc/ref.tex | 4 +- Doc/ref/ref.tex | 4 +- Doc/ref/ref1.tex | 20 +- Doc/ref/ref2.tex | 92 +++--- Doc/ref/ref3.tex | 220 +++++++------- Doc/ref/ref4.tex | 55 ++-- Doc/ref/ref5.tex | 137 ++++----- Doc/ref1.tex | 20 +- Doc/ref2.tex | 92 +++--- Doc/ref3.tex | 220 +++++++------- Doc/ref4.tex | 55 ++-- Doc/ref5.tex | 137 ++++----- Doc/texipre.dat | 6 +- Doc/tools/fix.el | 5 +- Doc/tools/fix_hack | 1 + Doc/tut.tex | 743 +++++++++++++++++++++++++++++++++++++++-------- Doc/tut/tut.tex | 743 +++++++++++++++++++++++++++++++++++++++-------- 27 files changed, 2934 insertions(+), 1526 deletions(-) diff --git a/Doc/.cvsignore b/Doc/.cvsignore index 3df983b7..c00222c 100755 --- a/Doc/.cvsignore +++ b/Doc/.cvsignore @@ -1 +1,2 @@ python-lib.info* +lib.texi diff --git a/Doc/Makefile b/Doc/Makefile index d13f59f..fbf62f6 100644 --- a/Doc/Makefile +++ b/Doc/Makefile @@ -2,13 +2,14 @@ DESTDIR=/usr/local LIBDESTDIR=$DESTDIR/lib LIBDEST=$LIBDESTDIR/python DOCDESTDIR=$LIBDEST/doc +DVIPS= dvips -f -all: tut lib ref ext qua +all: tut lib ref ext tut: latex tut latex tut - dvips tut >tut.ps + $(DVIPS) tut >tut.ps ref: touch ref.ind @@ -16,7 +17,7 @@ ref: ./fix_hack ref.idx makeindex ref latex ref - dvips ref >ref.ps + $(DVIPS) ref >ref.ps lib: touch lib.ind @@ -24,7 +25,7 @@ lib: ./fix_hack lib.idx makeindex lib latex lib - dvips lib >lib.ps + $(DVIPS) lib >lib.ps ext: touch ext.ind @@ -32,32 +33,51 @@ ext: ./fix_hack ext.idx makeindex ext latex ext - dvips ext >ext.ps + $(DVIPS) ext >ext.ps 
qua: latex qua bibtex qua latex qua latex qua - dvips qua >qua.ps + $(DVIPS) qua >qua.ps -lib.texi: lib1.tex lib2.tex lib3.tex lib4.tex lib5.tex \ - texipre.dat texipost.dat partparse.py - python partparse.py -o @lib.texi lib[1-5].tex +lib.texi: lib*.tex texipre.dat texipost.dat partparse.py fix.el + python partparse.py -o @lib.texi `whichlibs` + emacs -batch -l fix.el -f save-buffer -kill mv @lib.texi lib.texi .PRECIOUS: lib.texi -python-lib.info: lib.texi fix.el - emacs -batch -l fix.el -f save-buffer -kill - makeinfo +footnote-style end +fill-column 72 +paragraph-indent 0 \ +python-lib.info: lib.texi + makeinfo --footnote-style end --fill-column 72 --paragraph-indent 0 \ lib.texi lib.info: python-lib.info # This target is very local to CWI... libwww: lib.texi - texi2html -d lib.texi /usr/local/ftp.cwi.nl/pub/www/texinfo/python + texi2html -d lib.texi /ufs/guido/www/texinfo/python + +# This one too... +L2H= /usr/local/LaTeX2html/latex2html +L2HARGS=-address $$USER@`domainname` -dont_include myformat -nolatex +l2h: l2htut l2href l2hext + +l2htut: tut + $(L2H) $(L2HARGS) tut.tex + @rm -rf python-tut + mv tut python-tut + +l2href: ref + $(L2H) $(L2HARGS) ref.tex + @rm -rf python-ref + mv ref python-ref + +l2hext: ext + $(L2H) $(L2HARGS) ext.tex + @rm -rf python-ext + mv ext python-ext clean: rm -f @* *~ *.aux *.idx *.ilg *.ind *.log *.toc *.blg *.bbl *.pyc diff --git a/Doc/README b/Doc/README index aba5c6c..5d5fb29 100644 --- a/Doc/README +++ b/Doc/README @@ -7,12 +7,14 @@ and a published article about Python. The following are the LaTeX source files: tut.tex The tutorial - lib.tex, lib[1-5].tex The library reference + lib.tex, lib*.tex The library reference ref.tex, ref[1-8].tex The reference manual + ext.tex How to extend Python qua.tex, quabib.bib Article published in CWI Quarterly -All except qua.tex use the style option file "myformat.sty". This -contains some macro definitions and sets some style parameters. 
+All except qua.tex (which isn't built by the default target) use the +style option file "myformat.sty". This contains some macro +definitions and sets some style parameters. The style parameters are set up for European paper size (21 x 29.7 cm, a.k.a. A4, or roughly 8.27 x 11.7 inch) by default. To use US paper, @@ -33,8 +35,10 @@ local conventions; at my site, I use dvips and lpr. For example: dvips -Ppsc ref | lpr -Ppsc # print it on printer "psc". If you don't have latex, you can ftp the pre-formatted PosytScript -versions of the documents; see "../misc/FTP" for information about -ftp-ing Python files. +versions of the documents. It should be in the same place where you +fetched the main Python distribution, if you got it by ftp. (See +"../Misc/FAQ" for information about ftp-ing Python files.) + Making the INFO version of the Library Reference ------------------------------------------------ @@ -43,44 +47,35 @@ The Library Reference can now also be read in hypertext form using the Emacs INFO system. This uses Texinfo format as an intermediate step. It requires texinfo version 2 (we have used 2.14). -To build the info files (python-lib.info*), say "make libinfo". This +To build the info files (python-lib.info*), say "make lib.info". This takes a while, even on machines with 33 MIPS and 16 Mbytes :-) You can ignore the output. But first you'll have to change a site dependency in fix.el: if -texinfo 2.xx is installed by default at your site, comment out the two -lines starting with "(setq load-path"; if it isn't, change the path! -(I'm afraid that if you don't have texinfo 2.xx this won't work -- use -archie to locate a version and ftp to fetch it.) +texinfo 2.xx isn't installed by default at your site, you'll have to +install it (use archie to locate a version and ftp to fetch it). If +you can't install it in the standard Emacs load path, uncomment the +line containing a "(setq load-path ...)" statement, and fill in the +path where you put it. 
The files used by the conversion process are: -partparse.py the dirty-written Python script that converts - LaTeX sources to texi files. Output is left in - `@out.texi' - -texi{pre,post}.dat these files will be put before and after the - result +partparse.py Python script that converts LaTeX sources to + texi files. -fix.sh calls emacs in order to update all the nodes and - menus. After this, makeinfo will convert the - texinfo-source to the info-file(s). Assumption: - the texi-source is called `@out.texi' +texi{pre,post}.dat Files placed before and after the result. -fix.el the elisp-file executed by emacs. Two calls to +fix.el Elisp file executed by Emacs. Two calls to 'texinfo-all-menus-update are necessary in - some cases - -fix_hack executable shell script that fixes the results - of the underscore hack. {\ptt \char'137} is - back-translated to a simple underscore. This is - needed for the texindex program - -handy.el some handy Emacs-macro's that helped converting - ``old'' documentation to a format that could be - understood by the converter scipt (partparse.py). - (You don't really need this, but, as the name - says, these macros are "handy") + some cases. + +fix_hack Shell script to fix the results of the + "underscore hack". {\ptt \char'137} is + back-translated to a simple underscore. This + is needed for the texindex program. + +whichlibs Shell script to print a list of lib*.tex files + to be processed. 
A Million thanks for Jan-Hein B\"uhrman for writing and debugging the convertor and related scripts, and for fixing the LaTeX sources and diff --git a/Doc/ext.tex b/Doc/ext.tex index 6eeaacf..a7d4221 100644 --- a/Doc/ext.tex +++ b/Doc/ext.tex @@ -1,6 +1,6 @@ -\documentstyle[twoside,11pt,myformat,times]{report} +\documentstyle[twoside,11pt,myformat]{report} -\title{\bf Extending and Embedding the Python Interpreter} +\title{Extending and Embedding the Python Interpreter} \author{ Guido van Rossum \\ @@ -9,7 +9,7 @@ E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! % Tell \index to actually write the .idx file \makeindex @@ -51,18 +51,30 @@ system supports this feature. It is quite easy to add non-standard built-in modules to Python, if you know how to program in C. A built-in module known to the Python programmer as \code{foo} is generally implemented by a file called -\file{foomodule.c}. All but the most essential standard built-in +\file{foomodule.c}. All but the two most essential standard built-in modules also adhere to this convention, and in fact some of them form excellent examples of how to create an extension. Extension modules can do two things that can't be done directly in -Python: they can implement new data types, and they can make system -calls or call C library functions. Since the latter is usually the -most important reason for adding an extension, I'll concentrate on -adding `wrappers' around C library functions; the concrete example -uses the wrapper for +Python: they can implement new data types (which are different from +classes by the way), and they can make system calls or call C library +functions. 
Since the latter is usually the most important reason for +adding an extension, I'll concentrate on adding `wrappers' around C +library functions; the concrete example uses the wrapper for \code{system()} in module \code{posix}, found in (of course) the file -\file{posixmodule.c}. +\file{Modules/posixmodule.c}. + +Note: unless otherwise mentioned, all file references in this +document are relative to the toplevel directory of the Python +distribution --- i.e. the directory that contains the \file{configure} +script. + +The compilation of an extension module depends on your system setup +and the intended use of the module; details are given in a later +section. + + +\section{A first look at the code} It is important not to be impressed by the size and complexity of the average extension module; much of this is straightforward @@ -87,8 +99,8 @@ in \file{posixmodule.c} first: \end{verbatim} This is the prototypical top-level function in an extension module. -It will be called (we'll see later how this is made possible) when the -Python program executes statements like +It will be called (we'll see later how) when the Python program +executes statements like \begin{verbatim} >>> import posix @@ -96,71 +108,84 @@ Python program executes statements like \end{verbatim} There is a straightforward translation from the arguments to the call -in Python (here the single value \code{'ls -l'}) to the arguments that +in Python (here the single expression \code{'ls -l'}) to the arguments that are passed to the C function. The C function always has two -parameters, conventionally named \var{self} and \var{args}. In this -example, \var{self} will always be a \code{NULL} pointer, since this is a -function, not a method (this is done so that the interpreter doesn't -have to understand two different types of C functions). +parameters, conventionally named \var{self} and \var{args}. 
The +\var{self} argument is used when the C function implements a builtin +method --- this is advanced material and not covered in this document. +In the example, \var{self} will always be a \code{NULL} pointer, since +we are defining a function, not a method (this is done so that the +interpreter doesn't have to understand two different types of C +functions). The \var{args} parameter will be a pointer to a Python object, or \code{NULL} if the Python function/method was called without arguments. It is necessary to do full argument type checking on each call, since otherwise the Python user would be able to cause the -Python interpreter to `dump core' by passing the wrong arguments to a -function in an extension module (or no arguments at all). Because -argument checking and converting arguments to C is such a common task, -there's a general function in the Python interpreter which combines -these tasks: \code{getargs()}. It uses a template string to determine -both the types of the Python argument and the types of the C variables -into which it should store the converted values. (More about this -later.)\footnote{ -There are convenience macros \code{getstrarg()}, +Python interpreter to `dump core' by passing invalid arguments to a +function in an extension module. Because argument checking and +converting arguments to C are such common tasks, there's a general +function in the Python interpreter that combines them: +\code{getargs()}. It uses a template string to determine both the +types of the Python argument and the types of the C variables into +which it should store the converted values.\footnote{There are +convenience macros \code{getnoarg()}, \code{getstrarg()}, \code{getintarg()}, etc., for many common forms of \code{getargs()} -templates. These are relics from the past; it's better to call -\code{getargs()} directly.} +templates. These are relics from the past; the recommended practice +is to call \code{getargs()} directly.} (More about this later.) 
If \code{getargs()} returns nonzero, the argument list has the right type and its components have been stored in the variables whose addresses are passed. If it returns zero, an error has occurred. In -the latter case it has already raised an appropriate exception by -calling \code{err_setstr()}, so the calling function can just return -\code{NULL}. +the latter case it has already raised an appropriate exception by so +the calling function should return \code{NULL} immediately --- see the +next section. \section{Intermezzo: errors and exceptions} An important convention throughout the Python interpreter is the following: when a function fails, it should set an exception condition -and return an error value (often a NULL pointer). Exceptions are set -in a global variable in the file errors.c; if this variable is NULL no -exception has occurred. A second variable is the `associated value' -of the exception. - -The file errors.h declares a host of err_* functions to set various -types of exceptions. The most common one is \code{err_setstr()} --- its -arguments are an exception object (e.g. RuntimeError --- actually it -can be any string object) and a C string indicating the cause of the -error (this is converted to a string object and stored as the -`associated value' of the exception). Another useful function is +and return an error value (often a \code{NULL} pointer). Exceptions +are stored in a static global variable in \file{Python/errors.c}; if +this variable is \code{NULL} no exception has occurred. A second +static global variable stores the `associated value' of the exception +--- the second argument to \code{raise}. + +The file \file{errors.h} declares a host of functions to set various +types of exceptions. The most common one is \code{err_setstr()} --- +its arguments are an exception object (e.g. 
\code{RuntimeError} --- +actually it can be any string object) and a C string indicating the +cause of the error (this is converted to a string object and stored as +the `associated value' of the exception). Another useful function is \code{err_errno()}, which only takes an exception argument and constructs the associated value by inspection of the (UNIX) global -variable errno. +variable errno. The most general function is \code{err_set()}, which +takes two object arguments, the exception and its associated value. +You don't need to \code{INCREF()} the objects passed to any of these +functions. You can test non-destructively whether an exception has been set with \code{err_occurred()}. However, most code never calls \code{err_occurred()} to see whether an error occurred or not, but -relies on error return values from the functions it calls instead: +relies on error return values from the functions it calls instead. When a function that calls another function detects that the called -function fails, it should return an error value but not set an -condition --- one is already set. The caller is then supposed to also -return an error indication to *its* caller, again *without* calling -\code{err_setstr()}, and so on --- the most detailed cause of the error -was already reported by the function that detected it in the first -place. Once the error has reached Python's interpreter main loop, -this aborts the currently executing Python code and tries to find an -exception handler specified by the Python programmer. +function fails, it should return an error value (e.g. \code{NULL} or +\code{-1}) but not call one of the \code{err_*} functions --- one has +already been called. The caller is then supposed to also return an +error indication to {\em its} caller, again {\em without} calling +\code{err_*()}, and so on --- the most detailed cause of the error was +already reported by the function that first detected it. 
Once the +error has reached Python's interpreter main loop, this aborts the +currently executing Python code and tries to find an exception handler +specified by the Python programmer. + +(There are situations where a module can actually give a more detailed +error message by calling another \code{err_*} function, and in such +cases it is fine to do so. As a general rule, however, this is not +necessary, and can cause information about the cause of the error to +be lost: most operations can fail for a variety of reasons.) To ignore an exception set by a function call that failed, the exception condition must be cleared explicitly by calling @@ -170,8 +195,9 @@ interpreter but wants to handle it completely by itself (e.g. by trying something else or pretending nothing happened). Finally, the function \code{err_get()} gives you both error variables -*and clears them*. Note that even if an error occurred the second one -may be NULL. I doubt you will need to use this function. +{\em and clears them}. Note that even if an error occurred the second +one may be \code{NULL}. You have to \code{XDECREF()} both when you +are finished with them. I doubt you will need to use this function. Note that a failing \code{malloc()} call must also be turned into an exception --- the direct caller of \code{malloc()} (or @@ -180,70 +206,110 @@ indicator itself. All the object-creating functions (\code{newintobject()} etc.) already do this, so only if you call \code{malloc()} directly this note is of importance. -Also note that, with the important exception of \code{getargs()}, functions -that return an integer status usually use 0 for success and -1 for -failure. +Also note that, with the important exception of \code{getargs()}, +functions that return an integer status usually return \code{0} or a +positive value for success and \code{-1} for failure. -Finally, be careful about cleaning up garbage (making appropriate -[\code{X}]\code{DECREF()} calls) when you return an error! 
+Finally, be careful about cleaning up garbage (making \code{XDECREF()} +or \code{DECREF()} calls for objects you have already created) when +you return an error! + +The choice of which exception to raise is entirely yours. There are +predeclared C objects corresponding to all built-in Python exceptions, +e.g. \code{ZeroDevisionError} which you can use directly. Of course, +you should chose exceptions wisely --- don't use \code{TypeError} to +mean that a file couldn't be opened (that should probably be +\code{IOError}). If anything's wrong with the argument list the +\code{getargs()} function raises \code{TypeError}. If you have an +argument whose value which must be in a particular range or must +satisfy other conditions, \code{ValueError} is appropriate. + +You can also define a new exception that is unique to your module. +For this, you usually declare a static object variable at the +beginning of your file, e.g. + +\begin{verbatim} + static object *FooError; +\end{verbatim} + +and initialize it in your module's initialization function +(\code{initfoo()}) with a string object, e.g. (leaving out the error +checking for simplicity): + +\begin{verbatim} + void + initfoo() + { + object *m, *d; + m = initmodule("foo", foo_methods); + d = getmoduledict(m); + FooError = newstringobject("foo.error"); + dictinsert(d, "error", FooError); + } +\end{verbatim} \section{Back to the example} -Going back to posix_system, you should now be able to understand this -bit: +Going back to \code{posix_system()}, you should now be able to +understand this bit: \begin{verbatim} if (!getargs(args, "s", &command)) return NULL; \end{verbatim} -It returns NULL (the error indicator for functions of this kind) if an -error is detected in the argument list, relying on the exception set -by \code{getargs()}. The string value of the argument is now copied to the -local variable 'command'. 
+It returns \code{NULL} (the error indicator for functions of this +kind) if an error is detected in the argument list, relying on the +exception set by \code{getargs()}. Otherwise the string value of the +argument has been copied to the local variable \code{command} --- this +is in fact just a pointer assignment and you are not supposed to +modify the string to which it points. -If a Python function is called with multiple arguments, the argument -list is turned into a tuple. Python programs can us this feature, for -instance, to explicitly create the tuple containing the arguments -first and make the call later. +If a function is called with multiple arguments, the argument list +(the argument \code{args}) is turned into a tuple. If it is called +without arguments, \code{args} is \code{NULL}. \code{getargs()} knows +about this; see later. -The next statement in posix_system is a call tothe C library function -\code{system()}, passing it the string we just got from \code{getargs()}: +The next statement in \code{posix_system()} is a call to the C library +function \code{system()}, passing it the string we just got from +\code{getargs()}: \begin{verbatim} sts = system(command); \end{verbatim} -Python strings may contain internal null bytes; but if these occur in -this example the rest of the string will be ignored by \code{system()}. - -Finally, posix.\code{system()} must return a value: the integer status -returned by the C library \code{system()} function. This is done by the -function \code{newintobject()}, which takes a (long) integer as parameter. +Finally, \code{posix.system()} must return a value: the integer status +returned by the C library \code{system()} function. This is done +using the function \code{mkvalue()}, which is something like the +inverse of \code{getargs()}: it takes a format string and a variable +number of C values and returns a new Python object. 
\begin{verbatim} - return newintobject((long)sts); + return mkvalue("i", sts); \end{verbatim} -(Yes, even integers are represented as objects on the heap in Python!) -If you had a function that returned no useful argument, you would need -this idiom: +In this case, it returns an integer object (yes, even integers are +objects on the heap in Python!). More info on \code{mkvalue()} is +given later. + +If you had a function that returned no useful argument (a.k.a. a +procedure), you would need this idiom: \begin{verbatim} INCREF(None); return None; \end{verbatim} -'None' is a unique Python object representing 'no value'. It differs -from NULL, which means 'error' in most contexts (except when passed as -a function argument --- there it means 'no arguments'). +\code{None} is a unique Python object representing `no value'. It +differs from \code{NULL}, which means `error' in most contexts. \section{The module's function table} I promised to show how I made the function \code{posix_system()} -available to Python programs. This is shown later in posixmodule.c: +callable from Python programs. This is shown later in +\file{Modules/posixmodule.c}: \begin{verbatim} static struct methodlist posix_methods[] = { @@ -260,78 +326,72 @@ available to Python programs. This is shown later in posixmodule.c: } \end{verbatim} -(The actual \code{initposix()} is somewhat more complicated, but most -extension modules are indeed as simple as that.) When the Python -program first imports module 'posix', \code{initposix()} is called, -which calls \code{initmodule()} with specific parameters. This -creates a module object (which is inserted in the table sys.modules -under the key 'posix'), and adds built-in-function objects to the -newly created module based upon the table (of type struct methodlist) -that was passed as its second parameter. The function -\code{initmodule()} returns a pointer to the module object that it -creates, but this is unused here. 
It aborts with a fatal error if the -module could not be initialized satisfactorily. - - -\section{Calling the module initialization function} - -There is one more thing to do: telling the Python module to call the -\code{initfoo()} function when it encounters an 'import foo' statement. -This is done in the file config.c. This file contains a table mapping -module names to parameterless void function pointers. You need to add -a declaration of \code{initfoo()} somewhere early in the file, and a -line saying +(The actual \code{initposix()} is somewhat more complicated, but many +extension modules can be as simple as shown here.) When the Python +program first imports module \code{posix}, \code{initposix()} is +called, which calls \code{initmodule()} with specific parameters. +This creates a `module object' (which is inserted in the table +\code{sys.modules} under the key \code{'posix'}), and adds +built-in-function objects to the newly created module based upon the +table (of type struct methodlist) that was passed as its second +parameter. The function \code{initmodule()} returns a pointer to the +module object that it creates (which is unused here). It aborts with +a fatal error if the module could not be initialized satisfactorily, +so you don't need to check for errors. + + +\section{Compilation and linkage} + +There are two more things to do before you can use your new extension +module: compiling and linking it with the Python system. If you use +dynamic loading, the details depend on the style of dynamic loading +your system uses; see the chapter on Dynamic Loading for more info +about this. + +If you can't use dynamic loading, or if you want to make your module a +permanent part of the Python interpreter, you will have to change the +configuration setup and rebuild the interpreter. 
Luckily, in the 1.0 +release this is very simple: just place your file (named +\file{foomodule.c} for example) in the \file{Modules} directory, add a +line to the file \file{Modules/Setup} describing your file: \begin{verbatim} - {"foo", initfoo}, + foo foomodule.o \end{verbatim} -to the initializer for inittab[]. It is conventional to include both -the declaration and the initializer line in preprocessor commands -\code{\#ifdef USE_FOO} / \code{\#endif}, to make it easy to turn the -foo extension on or off. Note that the Macintosh version uses a -different configuration file, distributed as configmac.c. This -strategy may be extended to other operating system versions, although -usually the standard config.c file gives a pretty useful starting -point for a new config*.c file. - -And, of course, I forgot the Makefile. This is actually not too hard, -just follow the examples for, say, AMOEBA. Just find all occurrences -of the string AMOEBA in the Makefile and do the same for FOO that's -done for AMOEBA... - -(Note: if you are using dynamic loading for your extension, you don't -need to edit config.c and the Makefile. See \file{./DYNLOAD} for more -info about this.) +and rebuild the interpreter by running \code{make} in the toplevel +directory. You can also run \code{make} in the \file{Modules} +subdirectory, but then you must first rebuilt the \file{Makefile} +there by running \code{make Makefile}. (This is necessary each time +you change the \file{Setup} file.) \section{Calling Python functions from C} -The above concentrates on making C functions accessible to the Python -programmer. The reverse is also often useful: calling Python -functions from C. This is especially the case for libraries that -support so-called `callback' functions. If a C interface makes heavy -use of callbacks, the equivalent Python often needs to provide a -callback mechanism to the Python programmer; the implementation may -require calling the Python callback functions from a C callback. 
-Other uses are also possible. +So far we have concentrated on making C functions callable from +Python. The reverse is also useful: calling Python functions from C. +This is especially the case for libraries that support so-called +`callback' functions. If a C interface makes use of callbacks, the +equivalent Python often needs to provide a callback mechanism to the +Python programmer; the implementation will require calling the Python +callback functions from a C callback. Other uses are also imaginable. Fortunately, the Python interpreter is easily called recursively, and -there is a standard interface to call a Python function. I won't +there is a standard interface to call a Python function. (I won't dwell on how to call the Python parser with a particular string as input --- if you're interested, have a look at the implementation of -the \samp{-c} command line option in pythonmain.c. +the \samp{-c} command line option in \file{Python/pythonmain.c}.) Calling a Python function is easy. First, the Python program must somehow pass you the Python function object. You should provide a function (or some other interface) to do this. When this function is called, save a pointer to the Python function object (be careful to -INCREF it!) in a global variable --- or whereever you see fit. +\code{INCREF()} it!) in a global variable --- or whereever you see fit. For example, the following function might be part of a module definition: \begin{verbatim} - static object *my_callback; + static object *my_callback = NULL; static object * my_set_callback(dummy, arg) @@ -346,29 +406,49 @@ definition: } \end{verbatim} +This particular function doesn't do any typechecking on its argument +--- that will be done by \code{call_object()}, which is a bit late but +at least protects the Python interpreter from shooting itself in its +foot. 
(The problem with typechecking functions is that there are at +least five different Python object types that can be called, so the +test would be somewhat cumbersome.) + +The macros \code{XINCREF()} and \code{XDECREF()} increment/decrement +the reference count of an object and are safe in the presence of +\code{NULL} pointers. More info on them in the section on Reference +Counts below. + Later, when it is time to call the function, you call the C function \code{call_object()}. This function has two arguments, both pointers -to arbitrary Python objects: the Python function, and the argument. -The argument can be NULL to call the function without arguments. For -example: +to arbitrary Python objects: the Python function, and the argument +list. The argument list must always be a tuple object, whose length +is the number of arguments. To call the Python function with no +arguments, you must pass an empty tuple. For example: \begin{verbatim} + object *arglist; object *result; ... /* Time to call the callback */ - result = call_object(my_callback, (object *)NULL); + arglist = mktuple(0); + result = call_object(my_callback, arglist); + DECREF(arglist); \end{verbatim} \code{call_object()} returns a Python object pointer: this is the return value of the Python function. \code{call_object()} is -`reference-count-neutral' with respect to its arguments, but the -return value is `new': either it is a brand new object, or it is an -existing object whose reference count has been incremented. So, you -should somehow apply DECREF to the result, even (especially!) if you -are not interested in its value. +`reference-count-neutral' with respect to its arguments. In the +example a new tuple was created to serve as the argument list, which +is \code{DECREF()}-ed immediately after the call. + +The return value of \code{call_object()} is `new': either it is a +brand new object, or it is an existing object whose reference count +has been incremented. 
So, unless you want to save it in a global +variable, you should somehow \code{DECREF()} the result, even +(especially!) if you are not interested in its value. Before you do this, however, it is important to check that the return -value isn't NULL. If it is, the Python function terminated by raising +value isn't \code{NULL}. If it is, the Python function terminated by raising an exception. If the C code that called \code{call_object()} is called from Python, it should now return an error indication to its Python caller, so the interpreter can print a stack trace, or the @@ -384,21 +464,21 @@ or desirable, the exception should be cleared by calling \end{verbatim} Depending on the desired interface to the Python callback function, -you may also have to provide an argument to \code{call_object()}. In -some cases the argument is also provided by the Python program, -through the same interface that specified the callback function. It -can then be saved and used in the same manner as the function object. -In other cases, you may have to construct a new object to pass as -argument. In this case you must dispose of it as well. For example, -if you want to pass an integral event code, you might use the -following code: +you may also have to provide an argument list to \code{call_object()}. +In some cases the argument list is also provided by the Python +program, through the same interface that specified the callback +function. It can then be saved and used in the same manner as the +function object. In other cases, you may have to construct a new +tuple to pass as the argument list. The simplest way to do this is to +call \code{mkvalue()}. For example, if you want to pass an integral +event code, you might use the following code: \begin{verbatim} - object *argument; + object *arglist; ... 
- argument = newintobject((long)eventcode); - result = call_object(my_callback, argument); - DECREF(argument); + arglist = mkvalue("(l)", eventcode); + result = call_object(my_callback, arglist); + DECREF(arglist); if (result == NULL) return NULL; /* Pass error back */ /* Here maybe use the result */ @@ -407,19 +487,8 @@ following code: Note the placement of DECREF(argument) immediately after the call, before the error check! Also note that strictly spoken this code is -not complete: \code{newintobject()} may run out of memory, and this -should be checked. - -In even more complicated cases you may want to pass the callback -function multiple arguments. To this end you have to construct (and -dispose of!) a tuple object. Details (mostly concerned with the -errror checks and reference count manipulation) are left as an -exercise for the reader; most of this is also needed when returning -multiple values from a function. - -XXX TO DO: explain objects. - -XXX TO DO: defining new object types. +not complete: \code{mkvalue()} may run out of memory, and this should +be checked. \section{Format strings for {\tt getargs()}} @@ -433,69 +502,78 @@ follows: The remaining arguments must be addresses of variables whose type is determined by the format string. For the conversion to succeed, the -`arg' object must match the format and the format must be exhausted. +\var{arg} object must match the format and the format must be exhausted. Note that while \code{getargs()} checks that the Python object really -is of the specified type, it cannot check that the addresses provided -in the call match: if you make mistakes there, your code will probably -dump core. +is of the specified type, it cannot check the validity of the +addresses of C variables provided in the call: if you make mistakes +there, your code will probably dump core. -A format string consists of a single `format unit'. 
A format unit -describes one Python object; it is usually a single character or a -parenthesized string. The type of a format units is determined from -its first character, the `format letter': +A non-empty format string consists of a single `format unit'. A +format unit describes one Python object; it is usually a single +character or a parenthesized sequence of format units. The type of a +format units is determined from its first character, the `format +letter': \begin{description} \item[\samp{s} (string)] The Python object must be a string object. The C argument must be a -char** (i.e. the address of a character pointer), and a pointer to -the C string contained in the Python object is stored into it. If the -next character in the format string is \samp{\#}, another C argument -of type int* must be present, and the length of the Python string (not -counting the trailing zero byte) is stored into it. +\code{(char**)} (i.e. the address of a character pointer), and a pointer +to the C string contained in the Python object is stored into it. You +must not provide storage to store the string; a pointer to an existing +string is stored into the character pointer variable whose address you +pass. If the next character in the format string is \samp{\#}, +another C argument of type \code{(int*)} must be present, and the +length of the Python string (not counting the trailing zero byte) is +stored into it. \item[\samp{z} (string or zero, i.e. \code{NULL})] Like \samp{s}, but the object may also be None. In this case the -string pointer is set to NULL and if a \samp{\#} is present the size -it set to 0. +string pointer is set to \code{NULL} and if a \samp{\#} is present the +size is set to 0. \item[\samp{b} (byte, i.e. char interpreted as tiny int)] -The object must be a Python integer. The C argument must be a char*. +The object must be a Python integer. The C argument must be a +\code{(char*)}. \item[\samp{h} (half, i.e. short)] -The object must be a Python integer. 
The C argument must be a short*. +The object must be a Python integer. The C argument must be a +\code{(short*)}. \item[\samp{i} (int)] -The object must be a Python integer. The C argument must be an int*. +The object must be a Python integer. The C argument must be an +\code{(int*)}. \item[\samp{l} (long)] The object must be a (plain!) Python integer. The C argument must be -a long*. +a \code{(long*)}. \item[\samp{c} (char)] The Python object must be a string of length 1. The C argument must -be a char*. (Don't pass an int*!) +be a \code{(char*)}. (Don't pass an \code{(int*)}!) \item[\samp{f} (float)] The object must be a Python int or float. The C argument must be a -float*. +\code{(float*)}. \item[\samp{d} (double)] The object must be a Python int or float. The C argument must be a -double*. +\code{(double*)}. \item[\samp{S} (string object)] The object must be a Python string. The C argument must be an -object** (i.e. the address of an object pointer). The C program thus -gets back the actual string object that was passed, not just a pointer -to its array of characters and its size as for format character -\samp{s}. +\code{(object**)} (i.e. the address of an object pointer). The C +program thus gets back the actual string object that was passed, not +just a pointer to its array of characters and its size as for format +character \samp{s}. The reference count of the object has not been +increased. \item[\samp{O} (object)] -The object can be any Python object, including None, but not NULL. -The C argument must be an object**. This can be used if an argument -list must contain objects of a type for which no format letter exist: -the caller must then check that it has the right type. +The object can be any Python object, including None, but not +\code{NULL}. The C argument must be an \code{(object**)}. This can be +used if an argument list must contain objects of a type for which no +format letter exist: the caller must then check that it has the right +type. 
The reference count of the object has not been increased. \item[\samp{(} (tuple)] The object must be a Python tuple. Following the \samp{(} character @@ -504,15 +582,15 @@ elements of the tuple, followed by a \samp{)} character. Tuple format units may be nested. (There are no exceptions for empty and singleton tuples; \samp{()} specifies an empty tuple and \samp{(i)} a singleton of one integer. Normally you don't want to use the latter, -since it is hard for the user to specify. +since it is hard for the Python user to specify. \end{description} More format characters will probably be added as the need arises. It -should be allowed to use Python long integers whereever integers are -expected, and perform a range check. (A range check is in fact always -necessary for the \samp{b}, \samp{h} and \samp{i} format -letters, but this is currently not implemented.) +should be (but currently isn't) allowed to use Python long integers +wherever integers are expected, and perform a range check. (A range +check is in fact always necessary for the \samp{b}, \samp{h} and +\samp{i} format letters, but this is currently not implemented.) Some example calls: @@ -523,14 +601,14 @@ Some example calls: char *s; int size; - ok = getargs(args, "(lls)", &k, &l, &s); /* Two longs and a string */ - /* Possible Python call: f(1, 2, 'three') */ + ok = getargs(args, ""); /* No arguments */ + /* Python call: f() */ ok = getargs(args, "s", &s); /* A string */ /* Possible Python call: f('whoops!') */ - ok = getargs(args, ""); /* No arguments */ - /* Python call: f() */ + ok = getargs(args, "(lls)", &k, &l, &s); /* Two longs and a string */ + /* Possible Python call: f(1, 2, 'three') */ ok = getargs(args, "((ii)s#)", &i, &j, &s, &size); /* A pair of ints and a string, whose size is also returned */ @@ -546,9 +624,13 @@ Some example calls: } \end{verbatim} -Note that a format string must consist of a single unit; strings like -\samp{is} and \samp{(ii)s\#} are not valid format strings.
(But -\samp{s\#} is.) +Note that the `top level' of a non-empty format string must consist of +a single unit; strings like \samp{is} and \samp{(ii)s\#} are not valid +format strings. (But \samp{s\#} is.) If you have multiple arguments, +the format must therefore always be enclosed in parentheses, as in the +examples \samp{((ii)s\#)} and \samp{(((ii)(ii))(ii))}. (The current +implementation does not complain when more than one unparenthesized +format unit is given. Sorry.) The \code{getargs()} function does not support variable-length argument lists. In simple cases you can fake these by trying several @@ -575,7 +657,7 @@ calls to \end{verbatim} (It is possible to think of an extension to the definition of format -strings to accomodate this directly, e.g., placing a \samp{|} in a +strings to accommodate this directly, e.g. placing a \samp{|} in a tuple might specify that the remaining arguments are optional. \code{getargs()} should then return one more than the number of variables stored into.) @@ -583,13 +665,13 @@ variables stored into.) Advanced users note: If you set the `varargs' flag in the method list for a function, the argument will always be a tuple (the `raw argument list'). In this case you must enclose single and empty argument lists -in parentheses, e.g., \samp{(s)} and \samp{()}. +in parentheses, e.g. \samp{(s)} and \samp{()}. \section{The {\tt mkvalue()} function} This function is the counterpart to \code{getargs()}. It is declared -in \file{modsupport.h} as follows: +in \file{Include/modsupport.h} as follows: \begin{verbatim} object *mkvalue(char *format, ...); @@ -607,7 +689,7 @@ second argument specifies the length of the data (negative means use argument (so you should \code{DECREF()} it if you've just created it and aren't going to use it again).
-If the argument for \samp{O} or \samp{S} is a NULL pointer, it is +If the argument for \samp{O} or \samp{S} is a \code{NULL} pointer, it is assumed that this was caused because the call producing the argument found an error and set an exception. Therefore, \code{mkvalue()} will return \code{NULL} but won't set an exception if one is already set. @@ -634,8 +716,10 @@ one argument is expected.) Here's a useful explanation of \code{INCREF()} and \code{DECREF()} (after an original by Sjoerd Mullender). -Use \code{XINCREF()} or \code{XDECREF()} instead of \code{INCREF()} / -\code{DECREF()} when the argument may be \code{NULL}. +Use \code{XINCREF()} or \code{XDECREF()} instead of \code{INCREF()} or +\code{DECREF()} when the argument may be \code{NULL} --- the versions +without \samp{X} are faster but will dump core when they encounter a +\code{NULL} pointer. The basic idea is, if you create an extra reference to an object, you must \code{INCREF()} it, if you throw away a reference to an object, @@ -696,7 +780,7 @@ which you keep references in your object, but you should not use \code{DECREF()} on your object. You should use \code{DEL()} instead. -\section{Using C++} +\section{Writing extensions in C++} It is possible to write extension modules in C++. Some restrictions apply: since the main program (the Python interpreter) is compiled and @@ -733,10 +817,10 @@ lower-level operations described in the previous chapters to construct and use Python objects. A simple demo of embedding Python can be found in the directory -\file{/embed}. +\file{Demo/embed}. -\section{Using C++} +\section{Embedding Python in C++} It is also possible to embed Python in a C++ program; how this is done exactly will depend on the details of the C++ system used; in general @@ -747,13 +831,16 @@ recompile Python itself with C++. \chapter{Dynamic Loading} -On some systems (e.g., SunOS, SGI Irix) it is possible to configure -Python to support dynamic loading of modules implemented in C. Once
Once -configured and installed it's trivial to use: if a Python program +On most modern systems it is possible to configure Python to support +dynamic loading of extension modules implemented in C. When shared +libraries are used dynamic loading is configured automatically; +otherwise you have to select it as a build option (see below). Once +configured, dynamic loading is trivial to use: when a Python program executes \code{import foo}, the search for modules tries to find a -file \file{foomodule.o} in the module search path, and if one is -found, it is linked with the executing binary and executed. Once -linked, the module acts just like a built-in module. +file \file{foomodule.o} (\file{foomodule.so} when using shared +libraries) in the module search path, and if one is found, it is +loaded into the executing binary and executed. Once loaded, the +module acts just like a built-in extension module. The advantages of dynamic loading are twofold: the `core' Python binary gets smaller, and users can extend Python with their own @@ -762,150 +849,167 @@ own copy of the Python interpreter. There are also disadvantages: dynamic loading isn't available on all systems (this just means that on some systems you have to use static loading), and dynamically loading a module that was compiled for a different version of Python -(e.g., with a different representation of objects) may dump core. - -{\bf NEW:} Under SunOS (all versions) and IRIX 5.x, dynamic loading -now uses shared libraries and is always configured. See at the -end of this chapter for how to create a dynamically loadable module. +(e.g. with a different representation of objects) may dump core. \section{Configuring and building the interpreter for dynamic loading} -(Ignore this section for SunOS and IRIX 5.x --- on these systems -dynamic loading is always configured.) +There are three styles of dynamic loading: one using shared libraries, +one using SGI IRIX 4 dynamic loading, and one using GNU dynamic +loading. 
+ +\subsection{Shared libraries} + +The following systems support dynamic loading using shared libraries: +SunOS 4; Solaris 2; SGI IRIX 5 (but not SGI IRIX 4!); and probably all +systems derived from SVR4, or at least those SVR4 derivatives that +support shared libraries (are there any that don't?). + +You don't need to do anything to configure dynamic loading on these +systems --- the \file{configure} script detects the presence of the +\file{} header file and automatically configures dynamic +loading. + +\subsection{SGI dynamic loading} + +Only SGI IRIX 4 supports dynamic loading of modules using SGI dynamic +loading. (SGI IRIX 5 might also support it but it is inferior to +using shared libraries so there is no reason to; a small test didn't +work right away so I gave up trying to support it.) + +Before you build Python, you first need to fetch and build the \code{dl} +package written by Jack Jansen. This is available by anonymous ftp +from host \file{ftp.cwi.nl}, directory \file{pub/dynload}, file +\file{dl-1.6.tar.Z}. (The version number may change.) Follow the +instructions in the package's \file{README} file to build it. + +Once you have built \code{dl}, you can configure Python to use it. To +this end, you run the \file{configure} script with the option +\code{--with-dl=\var{directory}} where \var{directory} is the absolute +pathname of the \code{dl} directory. + +Now build and install Python as you normally would (see the +\file{README} file in the toplevel Python directory). + +\subsection{GNU dynamic loading} + +GNU dynamic loading supports (according to its \file{README} file) the +following hardware and software combinations: VAX (Ultrix), Sun 3 +(SunOS 3.4 and 4.0), Sparc (SunOS 4.0), Sequent Symmetry (Dynix), and +Atari ST. There is no reason to use it on a Sparc; I haven't seen a +Sun 3 for years so I don't know if these have shared libraries or not. + +You need to fetch and build two packages.
One is GNU DLD 3.2.3, +available by anonymous ftp from host \file{ftp.cwi.nl}, directory +\file{pub/dynload}, file \file{dld-3.2.3.tar.Z}. (As far as I know, +no further development on GNU DLD is being done.) The other is an +emulation of Jack Jansen's \code{dl} package that I wrote on top of +GNU DLD 3.2.3. This is available from the same host and directory, +file \file{dl-dld-1.1.tar.Z}. (The version number may change --- but I doubt +it will.) Follow the instructions in each package's \file{README} +file to configure and build them. + +Now configure Python. Run the \file{configure} script with the option +\code{--with-dl-dld=\var{dl-directory},\var{dld-directory}} where +\var{dl-directory} is the absolute pathname of the directory where you +have built the \file{dl-dld} package, and \var{dld-directory} is that +of the GNU DLD package. The Python interpreter you build hereafter +will support GNU dynamic loading. + -Dynamic loading is a little complicated to configure, since its -implementation is extremely system dependent, and there are no -really standard libraries or interfaces for it. I'm using an -extremely simple interface, which basically needs only one function: +\section{Building a dynamically loadable module} + +Since there are three styles of dynamic loading, there are also three +groups of instructions for building a dynamically loadable module. +Instructions common for all three styles are given first. Assuming +your module is called \code{foo}, the source filename must be +\file{foomodule.c}, so the object name is \file{foomodule.o}. The +module must be written as a normal Python extension module (as +described earlier). + +Note that in all cases you will have to create your own Makefile that +compiles your module file(s). This Makefile will have to pass two +\samp{-I} arguments to the C compiler which will make it find the +Python header files.
If the Make variable \var{PYTHONTOP} points to +the toplevel Python directory, your \var{CFLAGS} Make variable should +contain the options \samp{-I\$(PYTHONTOP) -I\$(PYTHONTOP)/Include}. +(Most header files are in the \file{Include} subdirectory, but the +\file{config.h} header lives in the toplevel directory.) You must +also add \samp{-DHAVE_CONFIG_H} to the definition of \var{CFLAGS} to +direct the Python headers to include \file{config.h}. + + +\subsection{Shared libraries} + +You must link the \samp{.o} file to produce a shared library. This is +done using a special invocation of the \UNIX{} loader/linker, {\em +ld}(1). Unfortunately the invocation differs slightly per system. + +On SunOS 4, use +\begin{verbatim} + ld foomodule.o -o foomodule.so +\end{verbatim} +On Solaris 2, use \begin{verbatim} - funcptr = dl_loadmod(binary, object, function) + ld -G foomodule.o -o foomodule.so \end{verbatim} -where \code{binary} is the pathname of the currently executing program -(not just \code{argv[0]}!), \code{object} is the name of the \samp{.o} -file to be dynamically loaded, and \code{function} is the name of a -function in the module. If the dynamic loading succeeds, -\code{dl_loadmod()} returns a pointer to the named function; if not, it -returns \code{NULL}. - -I provide two implementations of \code{dl_loadmod()}: one for SGI machines -running Irix 4.0 (written by my colleague Jack Jansen), and one that -is a thin interface layer for Wilson Ho's (GNU) dynamic loading -package \dfn{dld} (version 3.2.3). Dld implements a much more powerful -version of dynamic loading than needed (including unlinking), but it -does not support System V's COFF object file format. It currently -supports only VAX (Ultrix), Sun 3 (SunOS 3.4 and 4.0), SPARCstation -(SunOS 4.0), Sequent Symmetry (Dynix), and Atari ST (from the dld -3.2.3 README file). 
Dld is part of the standard Python distribution; -if you didn't get it,many ftp archive sites carry dld these days, so -it won't be hard to get hold of it if you need it (using archie). - -(If you don't know where to get dld, try anonymous ftp to -\file{wuarchive.wustl.edu:/mirrors2/gnu/dld-3.2.3.tar.Z}. Jack's dld -can be found at \file{ftp.cwi.nl:/pub/python/dl.tar.Z}.) - -To build a Python interpreter capable of dynamic loading, you need to -edit the Makefile. Basically you must uncomment the lines starting -with \samp{\#DL_}, but you must also edit some of the lines to choose -which version of dl_loadmod to use, and fill in the pathname of the dld -library if you use it. And, of course, you must first build -dl_loadmod and dld, if used. (This is now done through the Configure -script. For SunOS and IRIX 5.x, everything is now automatic.) +On SGI IRIX 5, use +\begin{verbatim} + ld -shared foomodule.o -o foomodule.so +\end{verbatim} +On other systems, consult the manual page for {\em ld}(1) to find what +flags, if any, must be used. -\section{Building a dynamically loadable module} +If your extension module uses system libraries that haven't already +been linked with Python (e.g. a windowing system), these must be +passed to the {\em ld} command as \samp{-l} options after the +\samp{.o} file. -Building an object file usable by dynamic loading is easy, if you -follow these rules (substitute your module name for \code{foo} -everywhere): +The resulting file \file{foomodule.so} must be copied into a directory +along the Python module search path. -\begin{itemize} -\item -The source filename must be \file{foomodule.c}, so the object -name is \file{foomodule.o}. +\subsection{SGI dynamic loading} -\item -The module must be written as a (statically linked) Python extension -module (described in an earlier chapter) except that no line for it -must be added to \file{config.c} and it mustn't be linked with the -main Python interpreter. 
+{\bf IMPORTANT:} You must compile your extension module with the +additional C flag \samp{-G0} (or \samp{-G 0}). This instructs the +assembler to generate position-independent code. -\item -The module's initialization function must be called \code{initfoo}; it -must install the module in \code{sys.modules} (generally by calling -\code{initmodule()} as explained earlier. +You don't need to link the resulting \file{foomodule.o} file; just +copy it into a directory along the Python module search path. -\item -The module must be compiled with \samp{-c}. The resulting .o file must -not be stripped. +The first time your extension is loaded, it takes some extra time and +a few messages may be printed. This creates a file +\file{foomodule.ld} which is an image that can be loaded quickly into +the Python interpreter process. When a new Python interpreter is +installed, the \code{dl} package detects this and rebuilds +\file{foomodule.ld}. The file \file{foomodule.ld} is placed in the +directory where \file{foomodule.o} was found, unless this directory is +unwritable; in that case it is placed in a temporary +directory.\footnote{Check the manual page of the \code{dl} package for +details.} -\item -Since the module must include many standard Python include files, it -must be compiled with a \samp{-I} option pointing to the Python source -directory (unless it resides there itself). +If your extension module uses additional system libraries, you must +create a file \file{foomodule.libs} in the same directory as the +\file{foomodule.o}. This file should contain one or more lines with +whitespace-separated options that will be passed to the linker --- +normally only \samp{-l} options or absolute pathnames of libraries +(\samp{.a} files) should be used. -\item -On SGI Irix, the compiler flag \samp{-G0} (or \samp{-G 0}) must be passed. -IF THIS IS NOT DONE THE RESULTING CODE WILL NOT WORK.
-\item -{\bf NEW:} On SunOS and IRIX 5.x, you must create a shared library -from your \samp{.o} file using the following command (assuming your -module is called \code{foo}): +\subsection{GNU dynamic loading} -\begin{verbatim} - ld -o foomodule.so foomodule.o -\end{verbatim} +Just copy \file{foomodule.o} into a directory along the Python module +search path. -and place the resulting \samp{.so} file in the Python search path (not -the \samp{.o} file). Note: on Solaris, you need to pass \samp{-G} to -the loader; on IRIX 5.x, you need to pass \samp{-shared}. Sigh... - -\end{itemize} - - -\section{Using libraries} - -If your dynamically loadable module needs to be linked with one or -more libraries that aren't linked with Python (or if it needs a -routine that isn't used by Python from one of the libraries with which -Python is linked), you must specify a list of libraries to search -after loading the module in a file with extension \samp{.libs} (and -otherwise the same as your \samp{.o} file). This file should contain -one or more lines containing whitespace-separated absolute library -pathnames. When using the dl interface, \samp{-l...} flags may also -be used (it is in fact passed as an option list to the system linker -ld(1)), but the dl-dld interface requires absolute pathnames. I -believe it is possible to specify shared libraries here. - -(On SunOS, any extra libraries must be specified on the \code{ld} -command that creates the \samp{.so} file.) - - -\section{Caveats} - -Dynamic loading requires that \code{main}'s \code{argv[0]} contains -the pathname or at least filename of the Python interpreter. -Unfortunately, when executing a directly executable Python script (an -executable file with \samp{\#!...} on the first line), the kernel -overwrites \code{argv[0]} with the name of the script. There is no -easy way around this, so executable Python scripts cannot use -dynamically loaded modules. 
(You can always write a simple shell -script that calls the Python interpreter with the script as its -input.) - -When using dl, the overlay is first converted into an `overlay' for -the current process by the system linker (\code{ld}). The overlay is -saved as a file with extension \samp{.ld}, either in the directory -where the \samp{.o} file lives or (if that can't be written) in a -temporary directory. An existing \samp{.ld} file resulting from a -previous run (not from a temporary directory) is used, bypassing the -(costly) linking phase, provided its version matches the \samp{.o} -file and the current binary. (See the \code{dl} man page for more -details.) +If your extension module uses additional system libraries, you must +create a file \file{foomodule.libs} in the same directory as the +\file{foomodule.o}. This file should contain one or more lines with +whitespace-separated absolute pathnames of libraries (\samp{.a} +files). No \samp{-l} options can be used. \input{ext.ind} diff --git a/Doc/ext/ext.tex b/Doc/ext/ext.tex index 6eeaacf..a7d4221 100644 --- a/Doc/ext/ext.tex +++ b/Doc/ext/ext.tex @@ -1,6 +1,6 @@ -\documentstyle[twoside,11pt,myformat,times]{report} +\documentstyle[twoside,11pt,myformat]{report} -\title{\bf Extending and Embedding the Python Interpreter} +\title{Extending and Embedding the Python Interpreter} \author{ Guido van Rossum \\ @@ -9,7 +9,7 @@ E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! % Tell \index to actually write the .idx file \makeindex @@ -51,18 +51,30 @@ system supports this feature. It is quite easy to add non-standard built-in modules to Python, if you know how to program in C. A built-in module known to the Python programmer as \code{foo} is generally implemented by a file called -\file{foomodule.c}. +\file{foomodule.c}.
All but the two most essential standard built-in modules also adhere to this convention, and in fact some of them form excellent examples of how to create an extension. Extension modules can do two things that can't be done directly in -Python: they can implement new data types, and they can make system -calls or call C library functions. Since the latter is usually the -most important reason for adding an extension, I'll concentrate on -adding `wrappers' around C library functions; the concrete example -uses the wrapper for +Python: they can implement new data types (which are different from +classes by the way), and they can make system calls or call C library +functions. Since the latter is usually the most important reason for +adding an extension, I'll concentrate on adding `wrappers' around C +library functions; the concrete example uses the wrapper for \code{system()} in module \code{posix}, found in (of course) the file -\file{posixmodule.c}. +\file{Modules/posixmodule.c}. + +Note: unless otherwise mentioned, all file references in this +document are relative to the toplevel directory of the Python +distribution --- i.e. the directory that contains the \file{configure} +script. + +The compilation of an extension module depends on your system setup +and the intended use of the module; details are given in a later +section. + + +\section{A first look at the code} It is important not to be impressed by the size and complexity of the average extension module; much of this is straightforward @@ -87,8 +99,8 @@ in \file{posixmodule.c} first: \end{verbatim} This is the prototypical top-level function in an extension module. 
-It will be called (we'll see later how this is made possible) when the -Python program executes statements like +It will be called (we'll see later how) when the Python program +executes statements like \begin{verbatim} >>> import posix @@ -96,71 +108,84 @@ Python program executes statements like \end{verbatim} There is a straightforward translation from the arguments to the call -in Python (here the single value \code{'ls -l'}) to the arguments that +in Python (here the single expression \code{'ls -l'}) to the arguments that are passed to the C function. The C function always has two -parameters, conventionally named \var{self} and \var{args}. In this -example, \var{self} will always be a \code{NULL} pointer, since this is a -function, not a method (this is done so that the interpreter doesn't -have to understand two different types of C functions). +parameters, conventionally named \var{self} and \var{args}. The +\var{self} argument is used when the C function implements a builtin +method --- this is advanced material and not covered in this document. +In the example, \var{self} will always be a \code{NULL} pointer, since +we are defining a function, not a method (this is done so that the +interpreter doesn't have to understand two different types of C +functions). The \var{args} parameter will be a pointer to a Python object, or \code{NULL} if the Python function/method was called without arguments. It is necessary to do full argument type checking on each call, since otherwise the Python user would be able to cause the -Python interpreter to `dump core' by passing the wrong arguments to a -function in an extension module (or no arguments at all). Because -argument checking and converting arguments to C is such a common task, -there's a general function in the Python interpreter which combines -these tasks: \code{getargs()}. 
It uses a template string to determine -both the types of the Python argument and the types of the C variables -into which it should store the converted values. (More about this -later.)\footnote{ -There are convenience macros \code{getstrarg()}, +Python interpreter to `dump core' by passing invalid arguments to a +function in an extension module. Because argument checking and +converting arguments to C are such common tasks, there's a general +function in the Python interpreter that combines them: +\code{getargs()}. It uses a template string to determine both the +types of the Python argument and the types of the C variables into +which it should store the converted values.\footnote{There are +convenience macros \code{getnoarg()}, \code{getstrarg()}, \code{getintarg()}, etc., for many common forms of \code{getargs()} -templates. These are relics from the past; it's better to call -\code{getargs()} directly.} +templates. These are relics from the past; the recommended practice +is to call \code{getargs()} directly.} (More about this later.) If \code{getargs()} returns nonzero, the argument list has the right type and its components have been stored in the variables whose addresses are passed. If it returns zero, an error has occurred. In -the latter case it has already raised an appropriate exception by -calling \code{err_setstr()}, so the calling function can just return -\code{NULL}. +the latter case it has already raised an appropriate exception, so +the calling function should return \code{NULL} immediately --- see the +next section. \section{Intermezzo: errors and exceptions} An important convention throughout the Python interpreter is the following: when a function fails, it should set an exception condition -and return an error value (often a NULL pointer). Exceptions are set -in a global variable in the file errors.c; if this variable is NULL no -exception has occurred. A second variable is the `associated value' -of the exception.
- -The file errors.h declares a host of err_* functions to set various -types of exceptions. The most common one is \code{err_setstr()} --- its -arguments are an exception object (e.g. RuntimeError --- actually it -can be any string object) and a C string indicating the cause of the -error (this is converted to a string object and stored as the -`associated value' of the exception). Another useful function is +and return an error value (often a \code{NULL} pointer). Exceptions +are stored in a static global variable in \file{Python/errors.c}; if +this variable is \code{NULL} no exception has occurred. A second +static global variable stores the `associated value' of the exception +--- the second argument to \code{raise}. + +The file \file{errors.h} declares a host of functions to set various +types of exceptions. The most common one is \code{err_setstr()} --- +its arguments are an exception object (e.g. \code{RuntimeError} --- +actually it can be any string object) and a C string indicating the +cause of the error (this is converted to a string object and stored as +the `associated value' of the exception). Another useful function is \code{err_errno()}, which only takes an exception argument and constructs the associated value by inspection of the (UNIX) global -variable errno. +variable errno. The most general function is \code{err_set()}, which +takes two object arguments, the exception and its associated value. +You don't need to \code{INCREF()} the objects passed to any of these +functions. You can test non-destructively whether an exception has been set with \code{err_occurred()}. However, most code never calls \code{err_occurred()} to see whether an error occurred or not, but -relies on error return values from the functions it calls instead: +relies on error return values from the functions it calls instead. 
When a function that calls another function detects that the called -function fails, it should return an error value but not set an -condition --- one is already set. The caller is then supposed to also -return an error indication to *its* caller, again *without* calling -\code{err_setstr()}, and so on --- the most detailed cause of the error -was already reported by the function that detected it in the first -place. Once the error has reached Python's interpreter main loop, -this aborts the currently executing Python code and tries to find an -exception handler specified by the Python programmer. +function fails, it should return an error value (e.g. \code{NULL} or +\code{-1}) but not call one of the \code{err_*} functions --- one has +already been called. The caller is then supposed to also return an +error indication to {\em its} caller, again {\em without} calling +\code{err_*()}, and so on --- the most detailed cause of the error was +already reported by the function that first detected it. Once the +error has reached Python's interpreter main loop, this aborts the +currently executing Python code and tries to find an exception handler +specified by the Python programmer. + +(There are situations where a module can actually give a more detailed +error message by calling another \code{err_*} function, and in such +cases it is fine to do so. As a general rule, however, this is not +necessary, and can cause information about the cause of the error to +be lost: most operations can fail for a variety of reasons.) To ignore an exception set by a function call that failed, the exception condition must be cleared explicitly by calling @@ -170,8 +195,9 @@ interpreter but wants to handle it completely by itself (e.g. by trying something else or pretending nothing happened). Finally, the function \code{err_get()} gives you both error variables -*and clears them*. Note that even if an error occurred the second one -may be NULL. I doubt you will need to use this function. 
+{\em and clears them}. Note that even if an error occurred the second +one may be \code{NULL}. You have to \code{XDECREF()} both when you +are finished with them. I doubt you will need to use this function. Note that a failing \code{malloc()} call must also be turned into an exception --- the direct caller of \code{malloc()} (or @@ -180,70 +206,110 @@ indicator itself. All the object-creating functions (\code{newintobject()} etc.) already do this, so only if you call \code{malloc()} directly this note is of importance. -Also note that, with the important exception of \code{getargs()}, functions -that return an integer status usually use 0 for success and -1 for -failure. +Also note that, with the important exception of \code{getargs()}, +functions that return an integer status usually return \code{0} or a +positive value for success and \code{-1} for failure. -Finally, be careful about cleaning up garbage (making appropriate -[\code{X}]\code{DECREF()} calls) when you return an error! +Finally, be careful about cleaning up garbage (making \code{XDECREF()} +or \code{DECREF()} calls for objects you have already created) when +you return an error! + +The choice of which exception to raise is entirely yours. There are +predeclared C objects corresponding to all built-in Python exceptions, +e.g. \code{ZeroDivisionError}, which you can use directly. Of course, +you should choose exceptions wisely --- don't use \code{TypeError} to +mean that a file couldn't be opened (that should probably be +\code{IOError}). If anything's wrong with the argument list the +\code{getargs()} function raises \code{TypeError}. If you have an +argument whose value must be in a particular range or must +satisfy other conditions, \code{ValueError} is appropriate. + +You can also define a new exception that is unique to your module. +For this, you usually declare a static object variable at the +beginning of your file, e.g.
+ +\begin{verbatim} + static object *FooError; +\end{verbatim} + +and initialize it in your module's initialization function +(\code{initfoo()}) with a string object, e.g. (leaving out the error +checking for simplicity): + +\begin{verbatim} + void + initfoo() + { + object *m, *d; + m = initmodule("foo", foo_methods); + d = getmoduledict(m); + FooError = newstringobject("foo.error"); + dictinsert(d, "error", FooError); + } +\end{verbatim} \section{Back to the example} -Going back to posix_system, you should now be able to understand this -bit: +Going back to \code{posix_system()}, you should now be able to +understand this bit: \begin{verbatim} if (!getargs(args, "s", &command)) return NULL; \end{verbatim} -It returns NULL (the error indicator for functions of this kind) if an -error is detected in the argument list, relying on the exception set -by \code{getargs()}. The string value of the argument is now copied to the -local variable 'command'. +It returns \code{NULL} (the error indicator for functions of this +kind) if an error is detected in the argument list, relying on the +exception set by \code{getargs()}. Otherwise the string value of the +argument has been copied to the local variable \code{command} --- this +is in fact just a pointer assignment and you are not supposed to +modify the string to which it points. -If a Python function is called with multiple arguments, the argument -list is turned into a tuple. Python programs can us this feature, for -instance, to explicitly create the tuple containing the arguments -first and make the call later. +If a function is called with multiple arguments, the argument list +(the argument \code{args}) is turned into a tuple. If it is called +without arguments, \code{args} is \code{NULL}. \code{getargs()} knows +about this; see later. 
-The next statement in posix_system is a call tothe C library function -\code{system()}, passing it the string we just got from \code{getargs()}: +The next statement in \code{posix_system()} is a call to the C library +function \code{system()}, passing it the string we just got from +\code{getargs()}: \begin{verbatim} sts = system(command); \end{verbatim} -Python strings may contain internal null bytes; but if these occur in -this example the rest of the string will be ignored by \code{system()}. - -Finally, posix.\code{system()} must return a value: the integer status -returned by the C library \code{system()} function. This is done by the -function \code{newintobject()}, which takes a (long) integer as parameter. +Finally, \code{posix.system()} must return a value: the integer status +returned by the C library \code{system()} function. This is done +using the function \code{mkvalue()}, which is something like the +inverse of \code{getargs()}: it takes a format string and a variable +number of C values and returns a new Python object. \begin{verbatim} - return newintobject((long)sts); + return mkvalue("i", sts); \end{verbatim} -(Yes, even integers are represented as objects on the heap in Python!) -If you had a function that returned no useful argument, you would need -this idiom: +In this case, it returns an integer object (yes, even integers are +objects on the heap in Python!). More info on \code{mkvalue()} is +given later. + +If you had a function that returned no useful argument (a.k.a. a +procedure), you would need this idiom: \begin{verbatim} INCREF(None); return None; \end{verbatim} -'None' is a unique Python object representing 'no value'. It differs -from NULL, which means 'error' in most contexts (except when passed as -a function argument --- there it means 'no arguments'). +\code{None} is a unique Python object representing `no value'. It +differs from \code{NULL}, which means `error' in most contexts. 
\section{The module's function table} I promised to show how I made the function \code{posix_system()} -available to Python programs. This is shown later in posixmodule.c: +callable from Python programs. This is shown later in +\file{Modules/posixmodule.c}: \begin{verbatim} static struct methodlist posix_methods[] = { @@ -260,78 +326,72 @@ available to Python programs. This is shown later in posixmodule.c: } \end{verbatim} -(The actual \code{initposix()} is somewhat more complicated, but most -extension modules are indeed as simple as that.) When the Python -program first imports module 'posix', \code{initposix()} is called, -which calls \code{initmodule()} with specific parameters. This -creates a module object (which is inserted in the table sys.modules -under the key 'posix'), and adds built-in-function objects to the -newly created module based upon the table (of type struct methodlist) -that was passed as its second parameter. The function -\code{initmodule()} returns a pointer to the module object that it -creates, but this is unused here. It aborts with a fatal error if the -module could not be initialized satisfactorily. - - -\section{Calling the module initialization function} - -There is one more thing to do: telling the Python module to call the -\code{initfoo()} function when it encounters an 'import foo' statement. -This is done in the file config.c. This file contains a table mapping -module names to parameterless void function pointers. You need to add -a declaration of \code{initfoo()} somewhere early in the file, and a -line saying +(The actual \code{initposix()} is somewhat more complicated, but many +extension modules can be as simple as shown here.) When the Python +program first imports module \code{posix}, \code{initposix()} is +called, which calls \code{initmodule()} with specific parameters. 
+This creates a `module object' (which is inserted in the table +\code{sys.modules} under the key \code{'posix'}), and adds +built-in-function objects to the newly created module based upon the +table (of type struct methodlist) that was passed as its second +parameter. The function \code{initmodule()} returns a pointer to the +module object that it creates (which is unused here). It aborts with +a fatal error if the module could not be initialized satisfactorily, +so you don't need to check for errors. + + +\section{Compilation and linkage} + +There are two more things to do before you can use your new extension +module: compiling and linking it with the Python system. If you use +dynamic loading, the details depend on the style of dynamic loading +your system uses; see the chapter on Dynamic Loading for more info +about this. + +If you can't use dynamic loading, or if you want to make your module a +permanent part of the Python interpreter, you will have to change the +configuration setup and rebuild the interpreter. Luckily, in the 1.0 +release this is very simple: just place your file (named +\file{foomodule.c} for example) in the \file{Modules} directory, add a +line to the file \file{Modules/Setup} describing your file: \begin{verbatim} - {"foo", initfoo}, + foo foomodule.o \end{verbatim} -to the initializer for inittab[]. It is conventional to include both -the declaration and the initializer line in preprocessor commands -\code{\#ifdef USE_FOO} / \code{\#endif}, to make it easy to turn the -foo extension on or off. Note that the Macintosh version uses a -different configuration file, distributed as configmac.c. This -strategy may be extended to other operating system versions, although -usually the standard config.c file gives a pretty useful starting -point for a new config*.c file. - -And, of course, I forgot the Makefile. This is actually not too hard, -just follow the examples for, say, AMOEBA. 
Just find all occurrences -of the string AMOEBA in the Makefile and do the same for FOO that's -done for AMOEBA... - -(Note: if you are using dynamic loading for your extension, you don't -need to edit config.c and the Makefile. See \file{./DYNLOAD} for more -info about this.) +and rebuild the interpreter by running \code{make} in the toplevel +directory. You can also run \code{make} in the \file{Modules} +subdirectory, but then you must first rebuild the \file{Makefile} +there by running \code{make Makefile}. (This is necessary each time +you change the \file{Setup} file.) \section{Calling Python functions from C} -The above concentrates on making C functions accessible to the Python -programmer. The reverse is also often useful: calling Python -functions from C. This is especially the case for libraries that -support so-called `callback' functions. If a C interface makes heavy -use of callbacks, the equivalent Python often needs to provide a -callback mechanism to the Python programmer; the implementation may -require calling the Python callback functions from a C callback. -Other uses are also possible. +So far we have concentrated on making C functions callable from +Python. The reverse is also useful: calling Python functions from C. +This is especially the case for libraries that support so-called +`callback' functions. If a C interface makes use of callbacks, the +equivalent Python often needs to provide a callback mechanism to the +Python programmer; the implementation will require calling the Python +callback functions from a C callback. Other uses are also imaginable. Fortunately, the Python interpreter is easily called recursively, and -there is a standard interface to call a Python function. I won't +there is a standard interface to call a Python function. (I won't dwell on how to call the Python parser with a particular string as input --- if you're interested, have a look at the implementation of -the \samp{-c} command line option in pythonmain.c. +the \samp{-c} command line option in \file{Python/pythonmain.c}.)
+the \samp{-c} command line option in \file{Python/pythonmain.c}.) Calling a Python function is easy. First, the Python program must somehow pass you the Python function object. You should provide a function (or some other interface) to do this. When this function is called, save a pointer to the Python function object (be careful to -INCREF it!) in a global variable --- or whereever you see fit. +\code{INCREF()} it!) in a global variable --- or whereever you see fit. For example, the following function might be part of a module definition: \begin{verbatim} - static object *my_callback; + static object *my_callback = NULL; static object * my_set_callback(dummy, arg) @@ -346,29 +406,49 @@ definition: } \end{verbatim} +This particular function doesn't do any typechecking on its argument +--- that will be done by \code{call_object()}, which is a bit late but +at least protects the Python interpreter from shooting itself in its +foot. (The problem with typechecking functions is that there are at +least five different Python object types that can be called, so the +test would be somewhat cumbersome.) + +The macros \code{XINCREF()} and \code{XDECREF()} increment/decrement +the reference count of an object and are safe in the presence of +\code{NULL} pointers. More info on them in the section on Reference +Counts below. + Later, when it is time to call the function, you call the C function \code{call_object()}. This function has two arguments, both pointers -to arbitrary Python objects: the Python function, and the argument. -The argument can be NULL to call the function without arguments. For -example: +to arbitrary Python objects: the Python function, and the argument +list. The argument list must always be a tuple object, whose length +is the number of arguments. To call the Python function with no +arguments, you must pass an empty tuple. For example: \begin{verbatim} + object *arglist; object *result; ... 
/* Time to call the callback */ - result = call_object(my_callback, (object *)NULL); + arglist = mktuple(0); + result = call_object(my_callback, arglist); + DECREF(arglist); \end{verbatim} \code{call_object()} returns a Python object pointer: this is the return value of the Python function. \code{call_object()} is -`reference-count-neutral' with respect to its arguments, but the -return value is `new': either it is a brand new object, or it is an -existing object whose reference count has been incremented. So, you -should somehow apply DECREF to the result, even (especially!) if you -are not interested in its value. +`reference-count-neutral' with respect to its arguments. In the +example a new tuple was created to serve as the argument list, which +is \code{DECREF()}-ed immediately after the call. + +The return value of \code{call_object()} is `new': either it is a +brand new object, or it is an existing object whose reference count +has been incremented. So, unless you want to save it in a global +variable, you should somehow \code{DECREF()} the result, even +(especially!) if you are not interested in its value. Before you do this, however, it is important to check that the return -value isn't NULL. If it is, the Python function terminated by raising +value isn't \code{NULL}. If it is, the Python function terminated by raising an exception. If the C code that called \code{call_object()} is called from Python, it should now return an error indication to its Python caller, so the interpreter can print a stack trace, or the @@ -384,21 +464,21 @@ or desirable, the exception should be cleared by calling \end{verbatim} Depending on the desired interface to the Python callback function, -you may also have to provide an argument to \code{call_object()}. In -some cases the argument is also provided by the Python program, -through the same interface that specified the callback function. It -can then be saved and used in the same manner as the function object. 
-In other cases, you may have to construct a new object to pass as -argument. In this case you must dispose of it as well. For example, -if you want to pass an integral event code, you might use the -following code: +you may also have to provide an argument list to \code{call_object()}. +In some cases the argument list is also provided by the Python +program, through the same interface that specified the callback +function. It can then be saved and used in the same manner as the +function object. In other cases, you may have to construct a new +tuple to pass as the argument list. The simplest way to do this is to +call \code{mkvalue()}. For example, if you want to pass an integral +event code, you might use the following code: \begin{verbatim} - object *argument; + object *arglist; ... - argument = newintobject((long)eventcode); - result = call_object(my_callback, argument); - DECREF(argument); + arglist = mkvalue("(l)", eventcode); + result = call_object(my_callback, arglist); + DECREF(arglist); if (result == NULL) return NULL; /* Pass error back */ /* Here maybe use the result */ @@ -407,19 +487,8 @@ following code: Note the placement of DECREF(argument) immediately after the call, before the error check! Also note that strictly spoken this code is -not complete: \code{newintobject()} may run out of memory, and this -should be checked. - -In even more complicated cases you may want to pass the callback -function multiple arguments. To this end you have to construct (and -dispose of!) a tuple object. Details (mostly concerned with the -errror checks and reference count manipulation) are left as an -exercise for the reader; most of this is also needed when returning -multiple values from a function. - -XXX TO DO: explain objects. - -XXX TO DO: defining new object types. +not complete: \code{mkvalue()} may run out of memory, and this should +be checked. 
\section{Format strings for {\tt getargs()}} @@ -433,69 +502,78 @@ follows: The remaining arguments must be addresses of variables whose type is determined by the format string. For the conversion to succeed, the -`arg' object must match the format and the format must be exhausted. +\var{arg} object must match the format and the format must be exhausted. Note that while \code{getargs()} checks that the Python object really -is of the specified type, it cannot check that the addresses provided -in the call match: if you make mistakes there, your code will probably -dump core. +is of the specified type, it cannot check the validity of the +addresses of C variables provided in the call: if you make mistakes +there, your code will probably dump core. -A format string consists of a single `format unit'. A format unit -describes one Python object; it is usually a single character or a -parenthesized string. The type of a format units is determined from -its first character, the `format letter': +A non-empty format string consists of a single `format unit'. A +format unit describes one Python object; it is usually a single +character or a parenthesized sequence of format units. The type of a +format unit is determined from its first character, the `format +letter': \begin{description} \item[\samp{s} (string)] The Python object must be a string object. The C argument must be a -char** (i.e. the address of a character pointer), and a pointer to -the C string contained in the Python object is stored into it. If the -next character in the format string is \samp{\#}, another C argument -of type int* must be present, and the length of the Python string (not -counting the trailing zero byte) is stored into it. +\code{(char**)} (i.e. the address of a character pointer), and a pointer +to the C string contained in the Python object is stored into it.
You +must not provide storage to store the string; a pointer to an existing +string is stored into the character pointer variable whose address you +pass. If the next character in the format string is \samp{\#}, +another C argument of type \code{(int*)} must be present, and the +length of the Python string (not counting the trailing zero byte) is +stored into it. \item[\samp{z} (string or zero, i.e. \code{NULL})] Like \samp{s}, but the object may also be None. In this case the -string pointer is set to NULL and if a \samp{\#} is present the size -it set to 0. +string pointer is set to \code{NULL} and if a \samp{\#} is present the +size is set to 0. \item[\samp{b} (byte, i.e. char interpreted as tiny int)] -The object must be a Python integer. The C argument must be a char*. +The object must be a Python integer. The C argument must be a +\code{(char*)}. \item[\samp{h} (half, i.e. short)] -The object must be a Python integer. The C argument must be a short*. +The object must be a Python integer. The C argument must be a +\code{(short*)}. \item[\samp{i} (int)] -The object must be a Python integer. The C argument must be an int*. +The object must be a Python integer. The C argument must be an +\code{(int*)}. \item[\samp{l} (long)] The object must be a (plain!) Python integer. The C argument must be -a long*. +a \code{(long*)}. \item[\samp{c} (char)] The Python object must be a string of length 1. The C argument must -be a char*. (Don't pass an int*!) +be a \code{(char*)}. (Don't pass an \code{(int*)}!) \item[\samp{f} (float)] The object must be a Python int or float. The C argument must be a -float*. +\code{(float*)}. \item[\samp{d} (double)] The object must be a Python int or float. The C argument must be a -double*. +\code{(double*)}. \item[\samp{S} (string object)] The object must be a Python string. The C argument must be an -object** (i.e. the address of an object pointer). 
The C program thus -gets back the actual string object that was passed, not just a pointer -to its array of characters and its size as for format character -\samp{s}. +\code{(object**)} (i.e. the address of an object pointer). The C +program thus gets back the actual string object that was passed, not +just a pointer to its array of characters and its size as for format +character \samp{s}. The reference count of the object has not been +increased. \item[\samp{O} (object)] -The object can be any Python object, including None, but not NULL. -The C argument must be an object**. This can be used if an argument -list must contain objects of a type for which no format letter exist: -the caller must then check that it has the right type. +The object can be any Python object, including None, but not +\code{NULL}. The C argument must be an \code{(object**)}. This can be +used if an argument list must contain objects of a type for which no +format letter exists: the caller must then check that it has the right +type. The reference count of the object has not been increased. \item[\samp{(} (tuple)] The object must be a Python tuple. Following the \samp{(} character @@ -504,15 +582,15 @@ elements of the tuple, followed by a \samp{)} character. Tuple format units may be nested. (There are no exceptions for empty and singleton tuples; \samp{()} specifies an empty tuple and \samp{(i)} a singleton of one integer. Normally you don't want to use the latter, -since it is hard for the user to specify. +since it is hard for the Python user to specify.) \end{description} More format characters will probably be added as the need arises. It -should be allowed to use Python long integers whereever integers are -expected, and perform a range check. (A range check is in fact always -necessary for the \samp{b}, \samp{h} and \samp{i} format -letters, but this is currently not implemented.)
+should (but currently isn't) be allowed to use Python long integers +wherever integers are expected, and perform a range check. (A range +check is in fact always necessary for the \samp{b}, \samp{h} and +\samp{i} format letters, but this is currently not implemented.) Some example calls: @@ -523,14 +601,14 @@ Some example calls: char *s; int size; - ok = getargs(args, "(lls)", &k, &l, &s); /* Two longs and a string */ - /* Possible Python call: f(1, 2, 'three') */ + ok = getargs(args, ""); /* No arguments */ + /* Python call: f() */ ok = getargs(args, "s", &s); /* A string */ /* Possible Python call: f('whoops!') */ - ok = getargs(args, ""); /* No arguments */ - /* Python call: f() */ + ok = getargs(args, "(lls)", &k, &l, &s); /* Two longs and a string */ + /* Possible Python call: f(1, 2, 'three') */ ok = getargs(args, "((ii)s#)", &i, &j, &s, &size); /* A pair of ints and a string, whose size is also returned */ @@ -546,9 +624,13 @@ Some example calls: } \end{verbatim} -Note that a format string must consist of a single unit; strings like -\samp{is} and \samp{(ii)s\#} are not valid format strings. (But -\samp{s\#} is.) +Note that the `top level' of a non-empty format string must consist of +a single unit; strings like \samp{is} and \samp{(ii)s\#} are not valid +format strings. (But \samp{s\#} is.) If you have multiple arguments, +the format must therefore always be enclosed in parentheses, as in the +examples \samp{((ii)s\#)} and \samp{(((ii)(ii))(ii))}. (The current +implementation does not complain when more than one unparenthesized +format unit is given. Sorry.) The \code{getargs()} function does not support variable-length argument lists. In simple cases you can fake these by trying several @@ -575,7 +657,7 @@ calls to \end{verbatim} (It is possible to think of an extension to the definition of format -strings to accomodate this directly, e.g., placing a \samp{|} in a +strings to accommodate this directly, e.g.
placing a \samp{|} in a tuple might specify that the remaining arguments are optional. \code{getargs()} should then return one more than the number of variables stored into.) @@ -583,13 +665,13 @@ variables stored into.) Advanced users note: If you set the `varargs' flag in the method list for a function, the argument will always be a tuple (the `raw argument list'). In this case you must enclose single and empty argument lists -in parentheses, e.g., \samp{(s)} and \samp{()}. +in parentheses, e.g. \samp{(s)} and \samp{()}. \section{The {\tt mkvalue()} function} This function is the counterpart to \code{getargs()}. It is declared -in \file{modsupport.h} as follows: +in \file{Include/modsupport.h} as follows: \begin{verbatim} object *mkvalue(char *format, ...); @@ -607,7 +689,7 @@ second argument specifies the length of the data (negative means use argument (so you should \code{DECREF()} it if you've just created it and aren't going to use it again). -If the argument for \samp{O} or \samp{S} is a NULL pointer, it is +If the argument for \samp{O} or \samp{S} is a \code{NULL} pointer, it is assumed that this was caused because the call producing the argument found an error and set an exception. Therefore, \code{mkvalue()} will return \code{NULL} but won't set an exception if one is already set. @@ -634,8 +716,10 @@ one argument is expected.) Here's a useful explanation of \code{INCREF()} and \code{DECREF()} (after an original by Sjoerd Mullender). -Use \code{XINCREF()} or \code{XDECREF()} instead of \code{INCREF()} / -\code{DECREF()} when the argument may be \code{NULL}. +Use \code{XINCREF()} or \code{XDECREF()} instead of \code{INCREF()} or +\code{DECREF()} when the argument may be \code{NULL} --- the versions +without \samp{X} are faster but will dump core when they encounter a +\code{NULL} pointer.
The basic idea is, if you create an extra reference to an object, you must \code{INCREF()} it, if you throw away a reference to an object, @@ -696,7 +780,7 @@ which you keep references in your object, but you should not use \code{DECREF()} on your object. You should use \code{DEL()} instead. -\section{Using C++} +\section{Writing extensions in C++} It is possible to write extension modules in C++. Some restrictions apply: since the main program (the Python interpreter) is compiled and @@ -733,10 +817,10 @@ lower-level operations described in the previous chapters to construct and use Python objects. A simple demo of embedding Python can be found in the directory -\file{/embed}. +\file{Demo/embed}. -\section{Using C++} +\section{Embedding Python in C++} It is also possible to embed Python in a C++ program; how this is done exactly will depend on the details of the C++ system used; in general @@ -747,13 +831,16 @@ recompile Python itself with C++. \chapter{Dynamic Loading} -On some systems (e.g., SunOS, SGI Irix) it is possible to configure -Python to support dynamic loading of modules implemented in C. Once -configured and installed it's trivial to use: if a Python program +On most modern systems it is possible to configure Python to support +dynamic loading of extension modules implemented in C. When shared +libraries are used dynamic loading is configured automatically; +otherwise you have to select it as a build option (see below). Once +configured, dynamic loading is trivial to use: when a Python program executes \code{import foo}, the search for modules tries to find a -file \file{foomodule.o} in the module search path, and if one is -found, it is linked with the executing binary and executed. Once -linked, the module acts just like a built-in module. +file \file{foomodule.o} (\file{foomodule.so} when using shared +libraries) in the module search path, and if one is found, it is +loaded into the executing binary and executed. 
Once loaded, the +module acts just like a built-in extension module. The advantages of dynamic loading are twofold: the `core' Python binary gets smaller, and users can extend Python with their own @@ -762,150 +849,167 @@ own copy of the Python interpreter. There are also disadvantages: dynamic loading isn't available on all systems (this just means that on some systems you have to use static loading), and dynamically loading a module that was compiled for a different version of Python -(e.g., with a different representation of objects) may dump core. - -{\bf NEW:} Under SunOS (all versions) and IRIX 5.x, dynamic loading -now uses shared libraries and is always configured. See at the -end of this chapter for how to create a dynamically loadable module. +(e.g. with a different representation of objects) may dump core. \section{Configuring and building the interpreter for dynamic loading} -(Ignore this section for SunOS and IRIX 5.x --- on these systems -dynamic loading is always configured.) +There are three styles of dynamic loading: one using shared libraries, +one using SGI IRIX 4 dynamic loading, and one using GNU dynamic +loading. + +\subsection{Shared libraries} + +The following systems support dynamic loading using shared libraries: +SunOS 4; Solaris 2; SGI IRIX 5 (but not SGI IRIX 4!); and probably all +systems derived from SVR4, or at least those SVR4 derivatives that +support shared libraries (are there any that don't?). + +You don't need to do anything to configure dynamic loading on these +systems --- the \file{configure} script detects the presence of the +\file{} header file and automatically configures dynamic +loading. + +\subsection{SGI dynamic loading} + +Only SGI IRIX 4 supports dynamic loading of modules using SGI dynamic +loading. (SGI IRIX 5 might also support it but it is inferior to +using shared libraries so there is no reason to; a small test didn't +work right away so I gave up trying to support it.) 
+ +Before you build Python, you first need to fetch and build the \code{dl} +package written by Jack Jansen. This is available by anonymous ftp +from host \file{ftp.cwi.nl}, directory \file{pub/dynload}, file +\file{dl-1.6.tar.Z}. (The version number may change.) Follow the +instructions in the package's \file{README} file to build it. + +Once you have built \code{dl}, you can configure Python to use it. To +this end, you run the \file{configure} script with the option +\code{--with-dl=\var{directory}} where \var{directory} is the absolute +pathname of the \code{dl} directory. + +Now build and install Python as you normally would (see the +\file{README} file in the toplevel Python directory). + +\subsection{GNU dynamic loading} + +GNU dynamic loading supports (according to its \file{README} file) the +following hardware and software combinations: VAX (Ultrix), Sun 3 +(SunOS 3.4 and 4.0), Sparc (SunOS 4.0), Sequent Symmetry (Dynix), and +Atari ST. There is no reason to use it on a Sparc; I haven't seen a +Sun 3 for years so I don't know if these have shared libraries or not. + +You need to fetch and build two packages. One is GNU DLD 3.2.3, +available by anonymous ftp from host \file{ftp.cwi.nl}, directory +\file{pub/dynload}, file \file{dld-3.2.3.tar.Z}. (As far as I know, +no further development on GNU DLD is being done.) The other is an +emulation of Jack Jansen's \code{dl} package that I wrote on top of +GNU DLD 3.2.3. This is available from the same host and directory, +file \file{dl-dld-1.1.tar.Z}. (The version number may change --- but I doubt +it will.) Follow the instructions in each package's \file{README} +file to configure and build them. + +Now configure Python. Run the \file{configure} script with the option +\code{--with-dl-dld=\var{dl-directory},\var{dld-directory}} where +\var{dl-directory} is the absolute pathname of the directory where you +have built the \file{dl-dld} package, and \var{dld-directory} is that +of the GNU DLD package. 
The Python interpreter you build hereafter +will support GNU dynamic loading. + -Dynamic loading is a little complicated to configure, since its -implementation is extremely system dependent, and there are no -really standard libraries or interfaces for it. I'm using an -extremely simple interface, which basically needs only one function: +\section{Building a dynamically loadable module} + +Since there are three styles of dynamic loading, there are also three +groups of instructions for building a dynamically loadable module. +Instructions common for all three styles are given first. Assuming +your module is called \code{foo}, the source filename must be +\file{foomodule.c}, so the object name is \file{foomodule.o}. The +module must be written as a normal Python extension module (as +described earlier). + +Note that in all cases you will have to create your own Makefile that +compiles your module file(s). This Makefile will have to pass two +\samp{-I} arguments to the C compiler which will make it find the +Python header files. If the Make variable \var{PYTHONTOP} points to +the toplevel Python directory, your \var{CFLAGS} Make variable should +contain the options \samp{-I\$(PYTHONTOP) -I\$(PYTHONTOP)/Include}. +(Most header files are in the \file{Include} subdirectory, but the +\file{config.h} header lives in the toplevel directory.) You must +also add \samp{-DHAVE_CONFIG_H} to the definition of \var{CFLAGS} to +direct the Python headers to include \file{config.h}. + + +\subsection{Shared libraries} + +You must link the \samp{.o} file to produce a shared library. This is +done using a special invocation of the \UNIX{} loader/linker, {\em +ld}(1). Unfortunately the invocation differs slightly per system. 
+ +On SunOS 4, use +\begin{verbatim} + ld foomodule.o -o foomodule.so +\end{verbatim} +On Solaris 2, use \begin{verbatim} - funcptr = dl_loadmod(binary, object, function) + ld -G foomodule.o -o foomodule.so \end{verbatim} -where \code{binary} is the pathname of the currently executing program -(not just \code{argv[0]}!), \code{object} is the name of the \samp{.o} -file to be dynamically loaded, and \code{function} is the name of a -function in the module. If the dynamic loading succeeds, -\code{dl_loadmod()} returns a pointer to the named function; if not, it -returns \code{NULL}. - -I provide two implementations of \code{dl_loadmod()}: one for SGI machines -running Irix 4.0 (written by my colleague Jack Jansen), and one that -is a thin interface layer for Wilson Ho's (GNU) dynamic loading -package \dfn{dld} (version 3.2.3). Dld implements a much more powerful -version of dynamic loading than needed (including unlinking), but it -does not support System V's COFF object file format. It currently -supports only VAX (Ultrix), Sun 3 (SunOS 3.4 and 4.0), SPARCstation -(SunOS 4.0), Sequent Symmetry (Dynix), and Atari ST (from the dld -3.2.3 README file). Dld is part of the standard Python distribution; -if you didn't get it,many ftp archive sites carry dld these days, so -it won't be hard to get hold of it if you need it (using archie). - -(If you don't know where to get dld, try anonymous ftp to -\file{wuarchive.wustl.edu:/mirrors2/gnu/dld-3.2.3.tar.Z}. Jack's dld -can be found at \file{ftp.cwi.nl:/pub/python/dl.tar.Z}.) - -To build a Python interpreter capable of dynamic loading, you need to -edit the Makefile. Basically you must uncomment the lines starting -with \samp{\#DL_}, but you must also edit some of the lines to choose -which version of dl_loadmod to use, and fill in the pathname of the dld -library if you use it. And, of course, you must first build -dl_loadmod and dld, if used. (This is now done through the Configure -script. 
For SunOS and IRIX 5.x, everything is now automatic.) +On SGI IRIX 5, use +\begin{verbatim} + ld -shared foomodule.o -o foomodule.so +\end{verbatim} +On other systems, consult the manual page for {\em ld}(1) to find what +flags, if any, must be used. -\section{Building a dynamically loadable module} +If your extension module uses system libraries that haven't already +been linked with Python (e.g. a windowing system), these must be +passed to the {\em ld} command as \samp{-l} options after the +\samp{.o} file. -Building an object file usable by dynamic loading is easy, if you -follow these rules (substitute your module name for \code{foo} -everywhere): +The resulting file \file{foomodule.so} must be copied into a directory +along the Python module search path. -\begin{itemize} -\item -The source filename must be \file{foomodule.c}, so the object -name is \file{foomodule.o}. +\subsection{SGI dynamic loading} -\item -The module must be written as a (statically linked) Python extension -module (described in an earlier chapter) except that no line for it -must be added to \file{config.c} and it mustn't be linked with the -main Python interpreter. +{\bf IMPORTANT:} You must compile your extension module with the +additional C flag \samp{-G0} (or \samp{-G 0}). This instructs the +assembler to generate position-independent code. -\item -The module's initialization function must be called \code{initfoo}; it -must install the module in \code{sys.modules} (generally by calling -\code{initmodule()} as explained earlier. +You don't need to link the resulting \file{foomodule.o} file; just +copy it into a directory along the Python module search path. -\item -The module must be compiled with \samp{-c}. The resulting .o file must -not be stripped. +The first time your extension is loaded, it takes some extra time and +a few messages may be printed. This creates a file +\file{foomodule.ld} which is an image that can be loaded quickly into +the Python interpreter process. 
When a new Python interpreter is +installed, the \code{dl} package detects this and rebuilds +\file{foomodule.ld}. The file \file{foomodule.ld} is placed in the +directory where \file{foomodule.o} was found, unless this directory is +unwritable; in that case it is placed in a temporary +directory.\footnote{Check the manual page of the \code{dl} package for +details.} -\item -Since the module must include many standard Python include files, it -must be compiled with a \samp{-I} option pointing to the Python source -directory (unless it resides there itself). +If your extension module uses additional system libraries, you must +create a file \file{foomodule.libs} in the same directory as the +\file{foomodule.o}. This file should contain one or more lines with +whitespace-separated options that will be passed to the linker --- +normally only \samp{-l} options or absolute pathnames of libraries +(\samp{.a} files) should be used. -\item -On SGI Irix, the compiler flag \samp{-G0} (or \samp{-G 0}) must be passed. -IF THIS IS NOT DONE THE RESULTING CODE WILL NOT WORK. -\item -{\bf NEW:} On SunOS and IRIX 5.x, you must create a shared library -from your \samp{.o} file using the following command (assuming your -module is called \code{foo}): +\subsection{GNU dynamic loading} -\begin{verbatim} - ld -o foomodule.so foomodule.o -\end{verbatim} +Just copy \file{foomodule.o} into a directory along the Python module +search path. -and place the resulting \samp{.so} file in the Python search path (not -the \samp{.o} file). Note: on Solaris, you need to pass \samp{-G} to -the loader; on IRIX 5.x, you need to pass \samp{-shared}. Sigh... 
- -\end{itemize} - - -\section{Using libraries} - -If your dynamically loadable module needs to be linked with one or -more libraries that aren't linked with Python (or if it needs a -routine that isn't used by Python from one of the libraries with which -Python is linked), you must specify a list of libraries to search -after loading the module in a file with extension \samp{.libs} (and -otherwise the same as your \samp{.o} file). This file should contain -one or more lines containing whitespace-separated absolute library -pathnames. When using the dl interface, \samp{-l...} flags may also -be used (it is in fact passed as an option list to the system linker -ld(1)), but the dl-dld interface requires absolute pathnames. I -believe it is possible to specify shared libraries here. - -(On SunOS, any extra libraries must be specified on the \code{ld} -command that creates the \samp{.so} file.) - - -\section{Caveats} - -Dynamic loading requires that \code{main}'s \code{argv[0]} contains -the pathname or at least filename of the Python interpreter. -Unfortunately, when executing a directly executable Python script (an -executable file with \samp{\#!...} on the first line), the kernel -overwrites \code{argv[0]} with the name of the script. There is no -easy way around this, so executable Python scripts cannot use -dynamically loaded modules. (You can always write a simple shell -script that calls the Python interpreter with the script as its -input.) - -When using dl, the overlay is first converted into an `overlay' for -the current process by the system linker (\code{ld}). The overlay is -saved as a file with extension \samp{.ld}, either in the directory -where the \samp{.o} file lives or (if that can't be written) in a -temporary directory. An existing \samp{.ld} file resulting from a -previous run (not from a temporary directory) is used, bypassing the -(costly) linking phase, provided its version matches the \samp{.o} -file and the current binary. 
(See the \code{dl} man page for more -details.) +If your extension module uses additional system libraries, you must +create a file \file{foomodule.libs} in the same directory as the +\file{foomodule.o}. This file should contain one or more lines with +whitespace-separated absolute pathnames of libraries (\samp{.a} +files). No \samp{-l} options can be used. \input{ext.ind} diff --git a/Doc/fix.el b/Doc/fix.el index 25086e4..f36d6f0 100644 --- a/Doc/fix.el +++ b/Doc/fix.el @@ -1,6 +1,5 @@ ; load the new texinfo package (2.xx) if not installed by default -; (setq load-path -; (cons "/ufs/jh/lib/emacs/texinfo-2.14" load-path)) -(find-file "lib.texi") +; (setq load-path (cons "/ufs/guido/lib/emacs/texinfo-2.14" load-path)) +(find-file "@lib.texi") (texinfo-all-menus-update t) (texinfo-all-menus-update t) diff --git a/Doc/fix_hack b/Doc/fix_hack index 8c97729..8dad111 100755 --- a/Doc/fix_hack +++ b/Doc/fix_hack @@ -1 +1,2 @@ +#!/bin/sh sed -e 's/{\\ptt[ ]*\\char[ ]*'"'"'137}/_/g' <"$1" > "@$1" && mv "@$1" $1 diff --git a/Doc/info/texipre.dat b/Doc/info/texipre.dat index c531077..f8cc166 100644 --- a/Doc/info/texipre.dat +++ b/Doc/info/texipre.dat @@ -14,7 +14,7 @@ the language, see the Python Tutorial. The Python Reference Manual gives a more formal definition of the language. (These manuals are not yet available in INFO or Texinfo format.) -Copyright (C) 1991, 1992, 1993 by Stichting Mathematisch Centrum, +Copyright (C) 1991, 1992, 1993, 1994 by Stichting Mathematisch Centrum, Amsterdam, The Netherlands. All Rights Reserved @@ -43,7 +43,7 @@ OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. @c The following two commands start the copyright page. @page @vskip 0pt plus 1filll -Copyright @copyright{} 1991, 1992, 1993 by Stichting Mathematisch Centrum, +Copyright @copyright{} 1991, 1992, 1993, 1994 by Stichting Mathematisch Centrum, Amsterdam, The Netherlands. @center All Rights Reserved @@ -77,7 +77,7 @@ the language, see the @cite{Python Tutorial}. 
The @cite{Python Reference Manual} gives a more formal definition of the language. (These manuals are not yet available in INFO or Texinfo format.) -This version corresponds roughly to Python version 1.0 (yet to be released). +This version corresponds to Python version 1.0.2. @end ifinfo diff --git a/Doc/lib.tex b/Doc/lib.tex index e7ef71e..7b4f724 100644 --- a/Doc/lib.tex +++ b/Doc/lib.tex @@ -1,9 +1,6 @@ \documentstyle[twoside,11pt,myformat]{report} -%\includeonly{lib5} -\title{\bf - Python Library Reference -} +\title{Python Library Reference} \author{ Guido van Rossum \\ @@ -12,14 +9,13 @@ E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! + +\makeindex % tell \index to actually write the .idx file -% Tell \index to actually write the .idx file -\makeindex \begin{document} -%\showthe\fam -%\showthe\ttfam + \pagenumbering{roman} \maketitle @@ -46,12 +42,75 @@ language. 
\pagebreak \pagenumbering{arabic} -\include{lib1} % intro; built-in types, functions and exceptions -\include{lib2} % built-in modules -\include{lib3} % standard modules -\include{lib4} % Most OS'es; UNIX only; Amoeba only -\include{lib5} % STDWIN only; SGI machines only; SUNs only; AUDIO TOOLS -\input{lib.ind} % The index + % Chapter title: + +\input{libintro} % Introduction + +\input{libobjs} % Built-in Types, Exceptions and Functions +\input{libtypes} +\input{libexcs} +\input{libfuncs} + +\input{libmods} % Built-in modules +\input{libsys} +\input{libbltin} % really __builtin__ +\input{libmain} % really __main__ +\input{libarray} +\input{libmath} +\input{libtime} +\input{libregex} +\input{libmarshal} +\input{libstruct} + +\input{libstd} % Standard Modules +\input{libgetopt} +\input{libos} +\input{librand} +\input{libregsub} +\input{libstring} +\input{libwhrandom} + +\input{libunix} % UNIX ONLY +\input{libdbm} +\input{libfcntl} +\input{libgrp} +\input{libposix} +\input{libposixfile} % XXX this uses lineii which partparse.py doesn't know +\input{libppath} % really posixpath +\input{libpwd} +\input{libselect} +\input{libsocket} +\input{libthread} + +\input{libmm} % MULTIMEDIA EXTENSIONS +\input{libaudioop} +\input{libimageop} +\input{libjpeg} +\input{librgbimg} + +\input{libcrypto} % CRYPTOGRAPHIC EXTENSIONS +\input{libmd5} +\input{libmpz} +\input{librotor} + +%\input{libamoeba} % AMOEBA ONLY + +%\input{libmac} % MACINTOSH ONLY + +\input{libstdwin} % STDWIN ONLY + +\input{libsgi} % SGI IRIX ONLY +\input{libal} +%\input{libaudio} +\input{libfl} +\input{libfm} +\input{libgl} +\input{libimgfile} +%\input{libpanel} + +\input{libsun} % SUNOS ONLY + +\input{lib.ind} % Index \end{document} diff --git a/Doc/lib/lib.tex b/Doc/lib/lib.tex index e7ef71e..7b4f724 100644 --- a/Doc/lib/lib.tex +++ b/Doc/lib/lib.tex @@ -1,9 +1,6 @@ \documentstyle[twoside,11pt,myformat]{report} -%\includeonly{lib5} -\title{\bf - Python Library Reference -} +\title{Python Library Reference} \author{ 
Guido van Rossum \\ @@ -12,14 +9,13 @@ E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! + +\makeindex % tell \index to actually write the .idx file -% Tell \index to actually write the .idx file -\makeindex \begin{document} -%\showthe\fam -%\showthe\ttfam + \pagenumbering{roman} \maketitle @@ -46,12 +42,75 @@ language. \pagebreak \pagenumbering{arabic} -\include{lib1} % intro; built-in types, functions and exceptions -\include{lib2} % built-in modules -\include{lib3} % standard modules -\include{lib4} % Most OS'es; UNIX only; Amoeba only -\include{lib5} % STDWIN only; SGI machines only; SUNs only; AUDIO TOOLS -\input{lib.ind} % The index + % Chapter title: + +\input{libintro} % Introduction + +\input{libobjs} % Built-in Types, Exceptions and Functions +\input{libtypes} +\input{libexcs} +\input{libfuncs} + +\input{libmods} % Built-in modules +\input{libsys} +\input{libbltin} % really __builtin__ +\input{libmain} % really __main__ +\input{libarray} +\input{libmath} +\input{libtime} +\input{libregex} +\input{libmarshal} +\input{libstruct} + +\input{libstd} % Standard Modules +\input{libgetopt} +\input{libos} +\input{librand} +\input{libregsub} +\input{libstring} +\input{libwhrandom} + +\input{libunix} % UNIX ONLY +\input{libdbm} +\input{libfcntl} +\input{libgrp} +\input{libposix} +\input{libposixfile} % XXX this uses lineii which partparse.py doesn't know +\input{libppath} % really posixpath +\input{libpwd} +\input{libselect} +\input{libsocket} +\input{libthread} + +\input{libmm} % MULTIMEDIA EXTENSIONS +\input{libaudioop} +\input{libimageop} +\input{libjpeg} +\input{librgbimg} + +\input{libcrypto} % CRYPTOGRAPHIC EXTENSIONS +\input{libmd5} +\input{libmpz} +\input{librotor} + +%\input{libamoeba} % AMOEBA ONLY + +%\input{libmac} % MACINTOSH ONLY + +\input{libstdwin} % STDWIN ONLY + +\input{libsgi} % SGI IRIX ONLY +\input{libal} +%\input{libaudio} 
+\input{libfl} +\input{libfm} +\input{libgl} +\input{libimgfile} +%\input{libpanel} + +\input{libsun} % SUNOS ONLY + +\input{lib.ind} % Index \end{document} diff --git a/Doc/ref.tex b/Doc/ref.tex index f0cb559..11a5d08 100644 --- a/Doc/ref.tex +++ b/Doc/ref.tex @@ -1,6 +1,6 @@ \documentstyle[twoside,11pt,myformat]{report} -\title{\bf Python Reference Manual} +\title{Python Reference Manual} \author{ Guido van Rossum \\ @@ -9,7 +9,7 @@ E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! % Tell \index to actually write the .idx file \makeindex diff --git a/Doc/ref/ref.tex b/Doc/ref/ref.tex index f0cb559..11a5d08 100644 --- a/Doc/ref/ref.tex +++ b/Doc/ref/ref.tex @@ -1,6 +1,6 @@ \documentstyle[twoside,11pt,myformat]{report} -\title{\bf Python Reference Manual} +\title{Python Reference Manual} \author{ Guido van Rossum \\ @@ -9,7 +9,7 @@ E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! % Tell \index to actually write the .idx file \makeindex diff --git a/Doc/ref/ref1.tex b/Doc/ref/ref1.tex index b373e36..169c244 100644 --- a/Doc/ref/ref1.tex +++ b/Doc/ref/ref1.tex @@ -43,22 +43,22 @@ name: lc_letter (lc_letter | "_")* lc_letter: "a"..."z" \end{verbatim} -The first line says that a \verb\name\ is an \verb\lc_letter\ followed by -a sequence of zero or more \verb\lc_letter\s and underscores. An -\verb\lc_letter\ in turn is any of the single characters `a' through `z'. +The first line says that a \verb@name@ is an \verb@lc_letter@ followed by +a sequence of zero or more \verb@lc_letter@s and underscores. An +\verb@lc_letter@ in turn is any of the single characters `a' through `z'. (This rule is actually adhered to for the names defined in lexical and grammar rules in this document.) 
Each rule begins with a name (which is the name defined by the rule) -and a colon. A vertical bar (\verb\|\) is used to separate +and a colon. A vertical bar (\verb@|@) is used to separate alternatives; it is the least binding operator in this notation. A -star (\verb\*\) means zero or more repetitions of the preceding item; -likewise, a plus (\verb\+\) means one or more repetitions, and a -phrase enclosed in square brackets (\verb\[ ]\) means zero or one +star (\verb@*@) means zero or more repetitions of the preceding item; +likewise, a plus (\verb@+@) means one or more repetitions, and a +phrase enclosed in square brackets (\verb@[ ]@) means zero or one occurrences (in other words, the enclosed phrase is optional). The -\verb\*\ and \verb\+\ operators bind as tightly as possible; +\verb@*@ and \verb@+@ operators bind as tightly as possible; parentheses are used for grouping. Literal strings are enclosed in -double quotes. White space is only meaningful to separate tokens. +quotes. White space is only meaningful to separate tokens. Rules are normally contained on a single line; rules with many alternatives may be formatted alternatively with each line after the first beginning with a vertical bar. @@ -66,7 +66,7 @@ first beginning with a vertical bar. In lexical definitions (as the example above), two more conventions are used: Two literal characters separated by three dots mean a choice of any single character in the given (inclusive) range of ASCII -characters. A phrase between angular brackets (\verb\<...>\) gives an +characters. A phrase between angular brackets (\verb@<...>@) gives an informal description of the symbol defined; e.g. this could be used to describe the notion of `control character' if needed. \index{lexical definitions} diff --git a/Doc/ref/ref2.tex b/Doc/ref/ref2.tex index 250bd2e..67f22f8 100644 --- a/Doc/ref/ref2.tex +++ b/Doc/ref/ref2.tex @@ -19,7 +19,7 @@ syntax (e.g. between statements in compound statements). 
\subsection{Comments} -A comment starts with a hash character (\verb\#\) that is not part of +A comment starts with a hash character (\verb@#@) that is not part of a string literal, and ends at the end of the physical line. A comment always signifies the end of the logical line. Comments are ignored by the syntax. @@ -28,7 +28,7 @@ the syntax. \index{physical line} \index{hash character} -\subsection{Line joining} +\subsection{Explicit line joining} Two or more physical lines may be joined into logical lines using backslash characters (\verb/\/), as follows: when a physical line ends @@ -37,15 +37,37 @@ joined with the following forming a single logical line, deleting the backslash and the following end-of-line character. For example: \index{physical line} \index{line joining} +\index{line continuation} \index{backslash character} % \begin{verbatim} -month_names = ['Januari', 'Februari', 'Maart', \ - 'April', 'Mei', 'Juni', \ - 'Juli', 'Augustus', 'September', \ - 'Oktober', 'November', 'December'] +if 1900 < year < 2100 and 1 <= month <= 12 \ + and 1 <= day <= 31 and 0 <= hour < 24 \ + and 0 <= minute < 60 and 0 <= second < 60: # Looks like a valid date + return 1 \end{verbatim} +A line ending in a backslash cannot carry a comment; a backslash does +not continue a comment (but it does continue a string literal, see +below). + +\subsection{Implicit line joining} + +Expressions in parentheses, square brackets or curly braces can be +split over more than one physical line without using backslashes. +For example: + +\begin{verbatim} +month_names = ['Januari', 'Februari', 'Maart', # These are the + 'April', 'Mei', 'Juni', # Dutch names + 'Juli', 'Augustus', 'September', # for the months + 'Oktober', 'November', 'December'] # of the year +\end{verbatim} + +Implicitly continued lines can carry comments. The indentation of the +continuation lines is not important. Blank continuation lines are +allowed. 
+ \subsection{Blank lines} A logical line that contains only spaces, tabs, and possibly a @@ -123,7 +145,7 @@ The following example shows various indentation errors: (Actually, the first three errors are detected by the parser; only the last error is found by the lexical analyzer --- the indentation of -\verb\return r\ does not match a level popped off the stack.) +\verb@return r@ does not match a level popped off the stack.) \section{Other tokens} @@ -159,26 +181,15 @@ identifiers. They must be spelled exactly as written here: \index{reserved word} \begin{verbatim} -and del for in print -break elif from is raise -class else global not return -continue except if or try -def finally import pass while +access del from lambda return +and elif global not try +break else if or while +class except import pass +continue finally in print +def for is raise \end{verbatim} -% # This Python program sorts and formats the above table -% import string -% l = [] -% try: -% while 1: -% l = l + string.split(raw_input()) -% except EOFError: -% pass -% l.sort() -% for i in range((len(l)+4)/5): -% for j in range(i, len(l), 5): -% print string.ljust(l[j], 10), -% print +% When adding keywords, pipe it through keywords.py for reformatting \section{Literals} \label{literals} @@ -192,17 +203,24 @@ String literals are described by the following lexical definitions: \index{string literal} \begin{verbatim} -stringliteral: "'" stringitem* "'" -stringitem: stringchar | escapeseq -stringchar: -escapeseq: "'" +stringliteral: shortstring | longstring +shortstring: "'" shortstringitem* "'" | '"' shortstringitem* '"' +longstring: "'''" longstringitem* "'''" | '"""' longstringitem* '"""' +shortstringitem: shortstringchar | escapeseq +shortstringchar: +longstringchar: +escapeseq: "\" \end{verbatim} \index{ASCII} -String literals cannot span physical line boundaries. Escape -sequences in strings are actually interpreted according to rules -similar to those used by Standard C. 
The recognized escape sequences -are: +In ``long strings'' (strings surrounded by sets of three quotes), +unescaped newlines and quotes are allowed (and are retained), except +that three unescaped quotes in a row terminate the string. (A +``quote'' is the character used to open the string, i.e. either +\verb/'/ or \verb/"/.) + +Escape sequences in strings are interpreted according to rules similar +to those used by Standard C. The recognized escape sequences are: \index{physical line} \index{escape sequence} \index{Standard C} @@ -211,8 +229,10 @@ are: \begin{center} \begin{tabular}{|l|l|} \hline +\verb/\/{\em newline} & Ignored \\ \verb/\\/ & Backslash (\verb/\/) \\ \verb/\'/ & Single quote (\verb/'/) \\ +\verb/\"/ & Double quote (\verb/"/) \\ \verb/\a/ & ASCII Bell (BEL) \\ \verb/\b/ & ASCII Backspace (BS) \\ %\verb/\E/ & ASCII Escape (ESC) \\ @@ -309,8 +329,8 @@ Some examples of floating point literals: \end{verbatim} Note that numeric literals do not include a sign; a phrase like -\verb\-1\ is actually an expression composed of the operator -\verb\-\ and the literal \verb\1\. +\verb@-1@ is actually an expression composed of the operator +\verb@-@ and the literal \verb@1@. \section{Operators} @@ -323,7 +343,7 @@ The following tokens are operators: < == > <= <> != >= \end{verbatim} -The comparison operators \verb\<>\ and \verb\!=\ are alternate +The comparison operators \verb@<>@ and \verb@!=@ are alternate spellings of the same operator. \section{Delimiters} diff --git a/Doc/ref/ref3.tex b/Doc/ref/ref3.tex index 41ce234..5f9a0d8 100644 --- a/Doc/ref/ref3.tex +++ b/Doc/ref/ref3.tex @@ -44,7 +44,7 @@ Some objects contain references to ``external'' resources such as open files or windows. It is understood that these resources are freed when the object is garbage-collected, but since garbage collection is not guaranteed to happen, such objects also provide an explicit way to -release the external resource, usually a \verb\close\ method. 
+release the external resource, usually a \verb@close@ method. Programs are strongly recommended to always explicitly close such objects. @@ -69,8 +69,8 @@ objects this is not allowed. E.g. after a = 1; b = 1; c = []; d = [] \end{verbatim} -\verb\a\ and \verb\b\ may or may not refer to the same object with the -value one, depending on the implementation, but \verb\c\ and \verb\d\ +\verb@a@ and \verb@b@ may or may not refer to the same object with the +value one, depending on the implementation, but \verb@c@ and \verb@d@ are guaranteed to refer to two different, unique, newly created empty lists. @@ -90,9 +90,9 @@ Some of the type descriptions below contain a paragraph listing `special attributes'. These are attributes that provide access to the implementation and are not intended for general use. Their definition may change in the future. There are also some `generic' special -attributes, not listed with the individual objects: \verb\__methods__\ +attributes, not listed with the individual objects: \verb@__methods__@ is a list of the method names of a built-in object, if it has any; -\verb\__members__\ is a list of the data attribute names of a built-in +\verb@__members__@ is a list of the data attribute names of a built-in object, if it has any. \index{attribute} \indexii{special}{attribute} @@ -104,7 +104,7 @@ object, if it has any. \item[None] This type has a single value. There is a single object with this value. -This object is accessed through the built-in name \verb\None\. +This object is accessed through the built-in name \verb@None@. It is returned from functions that don't explicitly return an object. \ttindex{None} \obindex{None@{\tt None}} @@ -134,7 +134,7 @@ These represent numbers in the range $-2^{31}$ through $2^{31}-1$. (The range may be larger on machines with a larger natural word size, but not smaller.) When the result of an operation falls outside this range, the -exception \verb\OverflowError\ is raised. 
+exception \verb@OverflowError@ is raised. For the purpose of shift and mask operations, integers are assumed to have a binary, 2's complement notation using 32 or more bits, and hiding no bits from the user (i.e., all $2^{32}$ different bit @@ -172,17 +172,17 @@ C implementation for the accepted range and handling of overflow. \item[Sequences] These represent finite ordered sets indexed by natural numbers. -The built-in function \verb\len()\ returns the number of elements +The built-in function \verb@len()@ returns the number of elements of a sequence. When this number is $n$, the index set contains -the numbers $0, 1, \ldots, n-1$. Element \verb\i\ of sequence -\verb\a\ is selected by \verb\a[i]\. +the numbers $0, 1, \ldots, n-1$. Element \verb@i@ of sequence +\verb@a@ is selected by \verb@a[i]@. \obindex{seqence} \bifuncindex{len} \index{index operation} \index{item selection} \index{subscription} -Sequences also support slicing: \verb\a[i:j]\ selects all elements +Sequences also support slicing: \verb@a[i:j]@ selects all elements with index $k$ such that $i <= k < j$. When used as an expression, a slice is a sequence of the same type --- this implies that the index set is renumbered so that it starts at 0 again. @@ -209,7 +209,7 @@ The following types are immutable sequences: The elements of a string are characters. There is no separate character type; a character is represented by a string of one element. Characters represent (at least) 8-bit bytes. The built-in -functions \verb\chr()\ and \verb\ord()\ convert between characters +functions \verb@chr()@ and \verb@ord()@ convert between characters and nonnegative integers representing the byte values. Bytes with the values 0-127 represent the corresponding ASCII values. The string data type is also used to represent arrays of bytes, e.g. @@ -223,7 +223,7 @@ to hold data read from a file. 
(On systems whose native character set is not ASCII, strings may use EBCDIC in their internal representation, provided the functions -\verb\chr()\ and \verb\ord()\ implement a mapping between ASCII and +\verb@chr()@ and \verb@ord()@ implement a mapping between ASCII and EBCDIC, and string comparison preserves the ASCII order. Or perhaps someone can propose a better rule?) \index{ASCII} @@ -250,7 +250,7 @@ parentheses. \item[Mutable sequences] Mutable sequences can be changed after they are created. The subscription and slicing notations can be used as the target of -assignment and \verb\del\ (delete) statements. +assignment and \verb@del@ (delete) statements. \obindex{mutable sequece} \obindex{mutable} \indexii{assignment}{statement} @@ -276,10 +276,10 @@ or 1.) \item[Mapping types] These represent finite sets of objects indexed by arbitrary index sets. -The subscript notation \verb\a[k]\ selects the element indexed -by \verb\k\ from the mapping \verb\a\; this can be used in -expressions and as the target of assignments or \verb\del\ statements. -The built-in function \verb\len()\ returns the number of elements +The subscript notation \verb@a[k]@ selects the element indexed +by \verb@k@ from the mapping \verb@a@; this can be used in +expressions and as the target of assignments or \verb@del@ statements. +The built-in function \verb@len()@ returns the number of elements in a mapping. \bifuncindex{len} \index{subscription} @@ -299,7 +299,7 @@ Numeric types used for keys obey the normal rules for numeric comparison: if two numbers compare equal (e.g. 1 and 1.0) then they can be used interchangeably to index the same dictionary entry. -Dictionaries are mutable; they are created by the \verb\{...}\ +Dictionaries are mutable; they are created by the \verb@{...}@ notation (see section \ref{dict}). \obindex{dictionary} \obindex{mutable} @@ -308,7 +308,7 @@ notation (see section \ref{dict}). 
\item[Callable types] These are the types to which the function call (invocation) operation, -written as \verb\function(argument, argument, ...)\, can be applied: +written as \verb@function(argument, argument, ...)@, can be applied: \indexii{function}{call} \index{invocation} \indexii{function}{argument} @@ -325,8 +325,8 @@ parameter list. \obindex{function} \obindex{user-defined function} -Special read-only attributes: \verb\func_code\ is the code object -representing the compiled function body, and \verb\func_globals\ is (a +Special read-only attributes: \verb@func_code@ is the code object +representing the compiled function body, and \verb@func_globals@ is (a reference to) the dictionary that holds the function's global variables --- it implements the global name space of the module in which the function was defined. @@ -346,14 +346,14 @@ shifted one to the right. \indexii{user-defined}{method} \index{object closure} -Special read-only attributes: \verb\im_self\ is the class instance -object, \verb\im_func\ is the function object. +Special read-only attributes: \verb@im_self@ is the class instance +object, \verb@im_func@ is the function object. \ttindex{im_func} \ttindex{im_self} \item[Built-in functions] A built-in function object is a wrapper around a C function. Examples -of built-in functions are \verb\len\ and \verb\math.sin\. There +of built-in functions are \verb@len@ and \verb@math.sin@. There are no special attributes. The number and type of the arguments are determined by the C function. \obindex{built-in function} @@ -363,18 +363,20 @@ determined by the C function. \item[Built-in methods] This is really a different disguise of a built-in function, this time containing an object passed to the C function as an implicit extra -argument. An example of a built-in method is \verb\list.append\ if -\verb\list\ is a list object. +argument. An example of a built-in method is \verb@list.append@ if +\verb@list@ is a list object. 
\obindex{built-in method} \obindex{method} \indexii{built-in}{method} \item[Classes] Class objects are described below. When a class object is called as a -parameterless function, a new class instance (also described below) is -created and returned. The class's initialization function is not -called --- this is the responsibility of the caller. It is illegal to -call a class object with one or more arguments. +function, a new class instance (also described below) is created and +returned. This implies a call to the class's \verb@__init__@ method +if it has one. Any arguments are passed on to the \verb@__init__@ +method -- if there is no \verb@__init__@ method, the class must be called +without arguments. +\ttindex{__init__} \obindex{class} \obindex{class instance} \obindex{instance} @@ -383,10 +385,10 @@ call a class object with one or more arguments. \end{description} \item[Modules] -Modules are imported by the \verb\import\ statement (see section +Modules are imported by the \verb@import@ statement (see section \ref{import}). A module object is a container for a module's name space, which is a dictionary (the same dictionary as referenced by the -\verb\func_globals\ attribute of functions defined in the module). +\verb@func_globals@ attribute of functions defined in the module). Module attribute references are translated to lookups in this dictionary. A module object does not contain the code object used to initialize the module (since it isn't needed once the initialization @@ -396,8 +398,8 @@ is done). Attribute assignment update the module's name space dictionary. -Special read-only attributes: \verb\__dict__\ yields the module's name -space as a dictionary object; \verb\__name__\ yields the module's name +Special read-only attributes: \verb@__dict__@ yields the module's name +space as a dictionary object; \verb@__name__@ yields the module's name as a string object. 
\ttindex{__dict__} \ttindex{__name__} @@ -423,12 +425,12 @@ Class attribute assignments update the class's dictionary, never the dictionary of a base class. \indexiii{class}{attribute}{assignment} -A class can be called as a parameterless function to yield a class -instance (see above). +A class can be called as a function to yield a class instance (see +above). \indexii{class object}{call} -Special read-only attributes: \verb\__dict__\ yields the dictionary -containing the class's name space; \verb\__bases__\ yields a tuple +Special read-only attributes: \verb@__dict__@ yields the dictionary +containing the class's name space; \verb@__bases__@ yields a tuple (possibly empty or a singleton) containing the base classes, in the order of their occurrence in the base class list. \ttindex{__dict__} @@ -436,7 +438,7 @@ order of their occurrence in the base class list. \item[Class instances] A class instance is created by calling a class object as a -parameterless function. A class instance has a dictionary in which +function. A class instance has a dictionary in which attribute references are searched. When an attribute is not found there, and the instance's class has an attribute by that name, and that class attribute is a user-defined function (and in no other @@ -457,17 +459,17 @@ section \ref{specialnames}. \obindex{sequence} \obindex{mapping} -Special read-only attributes: \verb\__dict__\ yields the attribute -dictionary; \verb\__class__\ yields the instance's class. +Special read-only attributes: \verb@__dict__@ yields the attribute +dictionary; \verb@__class__@ yields the instance's class. \ttindex{__dict__} \ttindex{__class__} \item[Files] A file object represents an open file. (It is a wrapper around a C {\tt stdio} file pointer.) File objects are created by the -\verb\open()\ built-in function, and also by \verb\posix.popen()\ and -the \verb\makefile\ method of socket objects. 
\verb\sys.stdin\, -\verb\sys.stdout\ and \verb\sys.stderr\ are file objects corresponding +\verb@open()@ built-in function, and also by \verb@posix.popen()@ and +the \verb@makefile@ method of socket objects. \verb@sys.stdin@, +\verb@sys.stdout@ and \verb@sys.stderr@ are file objects corresponding the the interpreter's standard input, output and error streams. See the Python Library Reference for methods of file objects and other details. @@ -500,12 +502,12 @@ was defined) which a code object contains no context. There is no way to execute a bare code object. \obindex{code} -Special read-only attributes: \verb\co_code\ is a string representing -the sequence of instructions; \verb\co_consts\ is a list of literals -used by the code; \verb\co_names\ is a list of names (strings) used by -the code; \verb\co_filename\ is the filename from which the code was +Special read-only attributes: \verb@co_code@ is a string representing +the sequence of instructions; \verb@co_consts@ is a list of literals +used by the code; \verb@co_names@ is a list of names (strings) used by +the code; \verb@co_filename@ is the filename from which the code was compiled. (To find out the line numbers, you would have to decode the -instructions; the standard library module \verb\dis\ contains an +instructions; the standard library module \verb@dis@ contains an example of how to do this.) \ttindex{co_code} \ttindex{co_consts} @@ -517,12 +519,12 @@ Frame objects represent execution frames. They may occur in traceback objects (see below). 
\obindex{frame} -Special read-only attributes: \verb\f_back\ is to the previous -stack frame (towards the caller), or \verb\None\ if this is the bottom -stack frame; \verb\f_code\ is the code object being executed in this -frame; \verb\f_globals\ is the dictionary used to look up global -variables; \verb\f_locals\ is used for local variables; -\verb\f_lineno\ gives the line number and \verb\f_lasti\ gives the +Special read-only attributes: \verb@f_back@ is to the previous +stack frame (towards the caller), or \verb@None@ if this is the bottom +stack frame; \verb@f_code@ is the code object being executed in this +frame; \verb@f_globals@ is the dictionary used to look up global +variables; \verb@f_locals@ is used for local variables; +\verb@f_lineno@ gives the line number and \verb@f_lasti@ gives the precise instruction (this is an index into the instruction string of the code object). \ttindex{f_back} @@ -539,11 +541,11 @@ for an exception handler unwinds the execution stack, at each unwound level a traceback object is inserted in front of the current traceback. When an exception handler is entered (see also section \ref{try}), the stack trace is -made available to the program as \verb\sys.exc_traceback\. When the +made available to the program as \verb@sys.exc_traceback@. When the program contains no suitable handler, the stack trace is written (nicely formatted) to the standard error stream; if the interpreter is interactive, it is also made available to the user as -\verb\sys.last_traceback\. +\verb@sys.last_traceback@. 
\obindex{traceback} \indexii{stack}{trace} \indexii{exception}{handler} @@ -553,15 +555,15 @@ interactive, it is also made available to the user as \ttindex{sys.exc_traceback} \ttindex{sys.last_traceback} -Special read-only attributes: \verb\tb_next\ is the next level in the +Special read-only attributes: \verb@tb_next@ is the next level in the stack trace (towards the frame where the exception occurred), or -\verb\None\ if there is no next level; \verb\tb_frame\ points to the -execution frame of the current level; \verb\tb_lineno\ gives the line -number where the exception occurred; \verb\tb_lasti\ indicates the +\verb@None@ if there is no next level; \verb@tb_frame@ points to the +execution frame of the current level; \verb@tb_lineno@ gives the line +number where the exception occurred; \verb@tb_lasti@ indicates the precise instruction. The line number and last instruction in the traceback may differ from the line number of its frame object if the -exception occurred in a \verb\try\ statement with no matching -\verb\except\ clause or with a \verb\finally\ clause. +exception occurred in a \verb@try@ statement with no matching +\verb@except@ clause or with a \verb@finally@ clause. \ttindex{tb_next} \ttindex{tb_frame} \ttindex{tb_lineno} @@ -578,17 +580,19 @@ exception occurred in a \verb\try\ statement with no matching A class can implement certain operations that are invoked by special syntax (such as subscription or arithmetic operations) by defining methods with special names. For instance, if a class defines a -method named \verb\__getitem__\, and \verb\x\ is an instance of this -class, then \verb\x[i]\ is equivalent to \verb\x.__getitem__(i)\. -(The reverse is not true --- if \verb\x\ is a list object, -\verb\x.__getitem__(i)\ is not equivalent to \verb\x[i]\.) +method named \verb@__getitem__@, and \verb@x@ is an instance of this +class, then \verb@x[i]@ is equivalent to \verb@x.__getitem__(i)@. 
+(The reverse is not true --- if \verb@x@ is a list object, +\verb@x.__getitem__(i)@ is not equivalent to \verb@x[i]@.) -Except for \verb\__repr__\, \verb\__str__\ and \verb\__cmp__\, +Except for \verb@__repr__@, \verb@__str__@ and \verb@__cmp__@, attempts to execute an operation raise an exception when no appropriate method is defined. -For \verb\__repr__\ and \verb\__cmp__\, the traditional -interpretations are used in this case. -For \verb\__str__\, the \verb\__repr__\ method is used. +For \verb@__repr__@, the default is to return a string describing the +object's class and address. +For \verb@__cmp__@, the default is to compare instances based on their +address. +For \verb@__str__@, the default is to use \verb@__repr__@. \subsection{Special methods for any type} @@ -614,17 +618,17 @@ reference is deleted. Also note that it is not guaranteed that the interpreter exits. \item[\tt __repr__(self)] -Called by the \verb\repr()\ built-in function and by conversions +Called by the \verb@repr()@ built-in function and by conversions (reverse quotes) to compute the string representation of an object. \item[\tt __str__(self)] -Called by the \verb\str()\ built-in function and by the \verb\print\ +Called by the \verb@str()@ built-in function and by the \verb@print@ statement compute the string representation of an object. \item[\tt __cmp__(self, other)] Called by all comparison operations. Should return -1 if -\verb\self < other\, 0 if \verb\self == other\, +1 if -\verb\self > other\. If no \code{__cmp__} operation is defined, class +\verb@self < other@, 0 if \verb@self == other@, +1 if +\verb@self > other@. If no \code{__cmp__} operation is defined, class instances are compared by object identity (``address''). (Implementation note: due to limitations in the interpreter, exceptions raised by comparisons are ignored, and the objects will be @@ -654,23 +658,23 @@ key's hash value is a constant. 
\begin{description} \item[\tt __len__(self)] -Called to implement the built-in function \verb\len()\. Should return -the length of the object, an integer \verb\>=\ 0. Also, an object -whose \verb\__len__()\ method returns 0 is considered to be false in a +Called to implement the built-in function \verb@len()@. Should return +the length of the object, an integer \verb@>=@ 0. Also, an object +whose \verb@__len__()@ method returns 0 is considered to be false in a Boolean context. \item[\tt __getitem__(self, key)] -Called to implement evaluation of \verb\self[key]\. Note that the +Called to implement evaluation of \verb@self[key]@. Note that the special interpretation of negative keys (if the class wishes to -emulate a sequence type) is up to the \verb\__getitem__\ method. +emulate a sequence type) is up to the \verb@__getitem__@ method. \item[\tt __setitem__(self, key, value)] -Called to implement assignment to \verb\self[key]\. Same note as for -\verb\__getitem__\. +Called to implement assignment to \verb@self[key]@. Same note as for +\verb@__getitem__@. \item[\tt __delitem__(self, key)] -Called to implement deletion of \verb\self[key]\. Same note as for -\verb\__getitem__\. +Called to implement deletion of \verb@self[key]@. Same note as for +\verb@__getitem__@. \end{description} @@ -680,19 +684,19 @@ Called to implement deletion of \verb\self[key]\. Same note as for \begin{description} \item[\tt __getslice__(self, i, j)] -Called to implement evaluation of \verb\self[i:j]\. Note that missing -\verb\i\ or \verb\j\ are replaced by 0 or \verb\len(self)\, -respectively, and \verb\len(self)\ has been added (once) to originally -negative \verb\i\ or \verb\j\ by the time this function is called -(unlike for \verb\__getitem__\). +Called to implement evaluation of \verb@self[i:j]@. 
Note that missing +\verb@i@ or \verb@j@ are replaced by 0 or \verb@len(self)@, +respectively, and \verb@len(self)@ has been added (once) to originally +negative \verb@i@ or \verb@j@ by the time this function is called +(unlike for \verb@__getitem__@). \item[\tt __setslice__(self, i, j, sequence)] -Called to implement assignment to \verb\self[i:j]\. Same notes as for -\verb\__getslice__\. +Called to implement assignment to \verb@self[i:j]@. Same notes as for +\verb@__getslice__@. \item[\tt __delslice__(self, i, j)] -Called to implement deletion of \verb\self[i:j]\. Same notes as for -\verb\__getslice__\. +Called to implement deletion of \verb@self[i:j]@. Same notes as for +\verb@__getslice__@. \end{description} @@ -713,20 +717,20 @@ Called to implement deletion of \verb\self[i:j]\. Same notes as for \item[\tt __and__(self, other)]\itemjoin \item[\tt __xor__(self, other)]\itemjoin \item[\tt __or__(self, other)]\itembreak -Called to implement the binary arithmetic operations (\verb\+\, -\verb\-\, \verb\*\, \verb\/\, \verb\%\, \verb\divmod()\, \verb\pow()\, -\verb\<<\, \verb\>>\, \verb\&\, \verb\^\, \verb\|\). +Called to implement the binary arithmetic operations (\verb@+@, +\verb@-@, \verb@*@, \verb@/@, \verb@%@, \verb@divmod()@, \verb@pow()@, +\verb@<<@, \verb@>>@, \verb@&@, \verb@^@, \verb@|@). \item[\tt __neg__(self)]\itemjoin \item[\tt __pos__(self)]\itemjoin \item[\tt __abs__(self)]\itemjoin \item[\tt __invert__(self)]\itembreak -Called to implement the unary arithmetic operations (\verb\-\, \verb\+\, -\verb\abs()\ and \verb\~\). +Called to implement the unary arithmetic operations (\verb@-@, \verb@+@, +\verb@abs()@ and \verb@~@). \item[\tt __nonzero__(self)] Called to implement boolean testing; should return 0 or 1. An -alternative name for this method is \verb\__len__\. +alternative name for this method is \verb@__len__@. \item[\tt __coerce__(self, other)] Called to implement ``mixed-mode'' numeric arithmetic. 
Should either @@ -737,11 +741,11 @@ interpreter will also ask the other object to attempt a coercion (but sometimes, if the implementation of the other type cannot be changed, it is useful to do the conversion to the other type here). -Note that this method is not called to coerce the arguments to \verb\+\ -and \verb\*\, because these are also used to implement sequence +Note that this method is not called to coerce the arguments to \verb@+@ +and \verb@*@, because these are also used to implement sequence concatenation and repetition, respectively. Also note that, for the -same reason, in \verb\n*x\, where \verb\n\ is a built-in number and -\verb\x\ is an instance, a call to \verb\x.__mul__(n)\ is made.% +same reason, in \verb@n*x@, where \verb@n@ is a built-in number and +\verb@x@ is an instance, a call to \verb@x.__mul__(n)@ is made.% \footnote{The interpreter should really distinguish between user-defined classes implementing sequences, mappings or numbers, but currently it doesn't --- hence this strange exception.} @@ -749,12 +753,12 @@ currently it doesn't --- hence this strange exception.} \item[\tt __int__(self)]\itemjoin \item[\tt __long__(self)]\itemjoin \item[\tt __float__(self)]\itembreak -Called to implement the built-in functions \verb\int()\, \verb\long()\ -and \verb\float()\. Should return a value of the appropriate type. +Called to implement the built-in functions \verb@int()@, \verb@long()@ +and \verb@float()@. Should return a value of the appropriate type. \item[\tt __oct__(self)]\itemjoin \item[\tt __hex__(self)]\itembreak -Called to implement the built-in functions \verb\oct()\ and -\verb\hex()\. Should return a string value. +Called to implement the built-in functions \verb@oct()@ and +\verb@hex()@. Should return a string value. \end{description} diff --git a/Doc/ref/ref4.tex b/Doc/ref/ref4.tex index 62db120..c14fada 100644 --- a/Doc/ref/ref4.tex +++ b/Doc/ref/ref4.tex @@ -20,9 +20,9 @@ The following are code blocks: A module is a code block. 
A function body is a code block. A class definition is a code block. Each command typed interactively is a separate code block; a script file is a code block. The string argument passed to the built-in function -\verb\eval\ and to the \verb\exec\ statement are code blocks. +\verb@eval@ and to the \verb@exec@ statement are code blocks. And finally, the -expression read and evaluated by the built-in function \verb\input\ is +expression read and evaluated by the built-in function \verb@input@ is a code block. A code block is executed in an execution frame. An {\em execution @@ -46,7 +46,7 @@ Name spaces are functionally equivalent to dictionaries. The {\em local name space} of an execution frame determines the default place where names are defined and searched. The {\em global name -space} determines the place where names listed in \verb\global\ +space} determines the place where names listed in \verb@global@ statements are defined and searched, and where names that are not explicitly bound in the current code block are searched. \indexii{local}{name space} @@ -55,25 +55,35 @@ explicitly bound in the current code block are searched. Whether a name is local or global in a code block is determined by static inspection of the source text for the code block: in the -absence of \verb\global\ statements, a name that is bound anywhere in +absence of \verb@global@ statements, a name that is bound anywhere in the code block is local in the entire code block; all other names are -considered global. The \verb\global\ statement forces global +considered global. The \verb@global@ statement forces global interpretation of selected names throughout the code block. 
The -following constructs bind names: formal parameters, \verb\import\ +following constructs bind names: formal parameters, \verb@import@ statements, class and function definitions (these bind the class or function name), and targets that are identifiers if occurring in an -assignment, \verb\for\ loop header, or \verb\except\ clause header. -(A target occurring in a \verb\del\ statement does not bind a name.) +assignment, \verb@for@ loop header, or \verb@except@ clause header. + +A target occurring in a \verb@del@ statement is also considered bound +for this purpose (though the actual semantics are to ``unbind'' the +name). When a global name is not found in the global name space, it is searched in the list of ``built-in'' names (which is actually the -global name space of the module \verb\__builtin__\). When a name is not -found at all, the \verb\NameError\ exception is raised. +global name space of the module \verb@__builtin__@). When a name is not +found at all, the \verb@NameError@ exception is raised.% +\footnote{If the code block contains an \verb@exec@ statement or the +construct \verb@from ... import *@, the semantics of names not +explicitly mentioned in a \verb@global@ statement change subtly: name +lookup first searches the local name space, then the global one, then +the built-in one.} The following table lists the meaning of the local and global name space for various types of code blocks. The name space for a particular module is automatically created when the module is first -referenced. +referenced. Note that in almost all cases, the global name space is +the name space of the containing module -- scopes in Python do not +nest! \begin{center} \begin{tabular}{|l|l|l|l|} @@ -81,15 +91,18 @@ referenced. Code block type & Global name space & Local name space & Notes \\ \hline Module & n.s. for this module & same as global & \\ -Script & n.s. for \verb\__main__\ & same as global & \\ -Interactive command & n.s. 
for \verb\__main__\ & same as global & \\ +Script & n.s. for \verb@__main__@ & same as global & \\ +Interactive command & n.s. for \verb@__main__@ & same as global & \\ Class definition & global n.s. of containing block & new n.s. & \\ Function body & global n.s. of containing block & new n.s. & \\ -String passed to \verb\exec\ or \verb\eval\ +String passed to \verb@exec@ statement + & global n.s. of containing block + & local n.s. of containing block & (1) \\ +String passed to \verb@eval()@ & global n.s. of caller & local n.s. of caller & (1) \\ -File read by \verb\execfile\ +File read by \verb@execfile()@ & global n.s. of caller & local n.s. of caller & (1) \\ -Expression read by \verb\input\ +Expression read by \verb@input@ & global n.s. of caller & local n.s. of caller & \\ \hline \end{tabular} @@ -101,7 +114,7 @@ Notes: \item[n.s.] means {\em name space} -\item[(1)] The global and local name space for these functions can be +\item[(1)] The global and local name space for these can be overridden with optional extra arguments. \end{description} @@ -123,8 +136,8 @@ where the error occurred. The Python interpreter raises an exception when it detects an run-time error (such as division by zero). A Python program can also -explicitly raise an exception with the \verb\raise\ statement. -Exception handlers are specified with the \verb\try...except\ +explicitly raise an exception with the \verb@raise@ statement. +Exception handlers are specified with the \verb@try...except@ statement. Python uses the ``termination'' model of error handling: an exception @@ -139,10 +152,10 @@ execution of the program, or returns to its interactive main loop. Exceptions are identified by string objects. Two different string objects with the same value identify different exceptions. 
-When an exception is raised, an object (maybe \verb\None\) is passed +When an exception is raised, an object (maybe \verb@None@) is passed as the exception's ``parameter''; this object does not affect the selection of an exception handler, but is passed to the selected exception handler as additional information. -See also the description of the \verb\try\ and \verb\raise\ +See also the description of the \verb@try@ and \verb@raise@ statements. diff --git a/Doc/ref/ref5.tex b/Doc/ref/ref5.tex index 55f523f..3e60931 100644 --- a/Doc/ref/ref5.tex +++ b/Doc/ref/ref5.tex @@ -12,14 +12,14 @@ may be used wherever an expression is required by enclosing it in parentheses. The only places where expressions are used in the syntax instead of conditions is in expression statements and on the right-hand side of assignment statements; this catches some nasty bugs -like accidentally writing \verb\x == 1\ instead of \verb\x = 1\. +like accidentally writing \verb@x == 1@ instead of \verb@x = 1@. \indexii{assignment}{statement} The comma plays several roles in Python's syntax. It is usually an operator with a lower precedence than all others, but occasionally serves other purposes as well; e.g. it separates function arguments, is used in list and dictionary constructors, and has special semantics -in \verb\print\ statements. +in \verb@print@ statements. \index{comma} When (one alternative of) a syntax rule has the form @@ -28,8 +28,8 @@ When (one alternative of) a syntax rule has the form name: othername \end{verbatim} -and no semantics are given, the semantics of this form of \verb\name\ -are the same as for \verb\othername\. +and no semantics are given, the semantics of this form of \verb@name@ +are the same as for \verb@othername@. \index{syntax} \section{Arithmetic conversions} @@ -38,7 +38,7 @@ are the same as for \verb\othername\. 
When a description of an arithmetic operator below uses the phrase ``the numeric arguments are converted to a common type'', this both means that if either argument is not a number, a -\verb\TypeError\ exception is raised, and that otherwise +\verb@TypeError@ exception is raised, and that otherwise the following conversions are applied: \exindex{TypeError} \indexii{floating point}{number} @@ -71,11 +71,13 @@ enclosure: parenth_form | list_display | dict_display | string_conversion \index{identifier} An identifier occurring as an atom is a reference to a local, global -or built-in name binding. If a name can be assigned to anywhere in a -code block, and is not mentioned in a \verb\global\ statement in that -code block, it refers to a local name throughout that code block. -Otherwise, it refers to a global name if one exists, else to a -built-in name. +or built-in name binding. If a name is assigned to anywhere in a code +block (even in unreachable code), and is not mentioned in a +\verb@global@ statement in that code block, then it refers to a local +name throughout that code block. When it is not assigned to anywhere +in the block, or when it is assigned to but also explicitly listed in +a \verb@global@ statement, it refers to a global name if one exists, +else to a built-in name (and this binding may dynamically change). \indexii{name}{binding} \index{code block} \stindex{global} @@ -84,7 +86,7 @@ built-in name. When the name is bound to an object, evaluation of the atom yields that object. When a name is not bound, an attempt to evaluate it -raises a \verb\NameError\ exception. +raises a \verb@NameError@ exception. \exindex{NameError} \subsection{Literals} @@ -197,10 +199,10 @@ A string conversion evaluates the contained condition list and converts the resulting object into a string according to rules specific to its type. 
-If the object is a string, a number, \verb\None\, or a tuple, list or +If the object is a string, a number, \verb@None@, or a tuple, list or dictionary containing only objects whose type is one of these, the resulting string is a valid Python expression which can be passed to -the built-in function \verb\eval()\ to yield an expression with the +the built-in function \verb@eval()@ to yield an expression with the same value (or an approximation, if floating point numbers are involved). @@ -234,7 +236,7 @@ attributeref: primary "." identifier The primary must evaluate to an object of a type that supports attribute references, e.g. a module or a list. This object is then asked to produce the attribute whose name is the identifier. If this -attribute is not available, the exception \verb\AttributeError\ is +attribute is not available, the exception \verb@AttributeError@ is raised. Otherwise, the type and value of the object produced is determined by the object. Multiple evaluations of the same attribute reference may yield different objects. @@ -266,7 +268,7 @@ the value in the mapping that corresponds to that key. If it is a sequence, the condition must evaluate to a plain integer. If this value is negative, the length of the sequence is added to it -(so that, e.g. \verb\x[-1]\ selects the last item of \verb\x\.) +(so that, e.g. \verb@x[-1]@ selects the last item of \verb@x@.) The resulting value must be a nonnegative integer smaller than the number of items in the sequence, and the subscription selects the item whose index is that value (counting from zero). @@ -318,7 +320,7 @@ objects, and methods of class instances are callable). If it is a class, the argument list must be empty; otherwise, the arguments are evaluated. -A call always returns some value, possibly \verb\None\, unless it +A call always returns some value, possibly \verb@None@, unless it raises an exception. How this value is computed depends on the type of the callable object. 
If it is: @@ -328,7 +330,7 @@ of the callable object. If it is: executed, passing it the argument list. The first thing the code block will do is bind the formal parameters to the arguments; this is described in section \ref{function}. When the code block executes a -\verb\return\ statement, this specifies the return value of the +\verb@return@ statement, this specifies the return value of the function call. \indexii{function}{call} \indexiii{user-defined}{function}{call} @@ -371,22 +373,22 @@ All unary arithmetic (and bit-wise) operations have the same priority: u_expr: primary | "-" u_expr | "+" u_expr | "~" u_expr \end{verbatim} -The unary \verb\"-"\ (minus) operator yields the negation of its +The unary \verb@"-"@ (minus) operator yields the negation of its numeric argument. \index{negation} \index{minus} -The unary \verb\"+"\ (plus) operator yields its numeric argument +The unary \verb@"+"@ (plus) operator yields its numeric argument unchanged. \index{plus} -The unary \verb\"~"\ (invert) operator yields the bit-wise inversion +The unary \verb@"~"@ (invert) operator yields the bit-wise inversion of its plain or long integer argument. The bit-wise inversion of -\verb\x\ is defined as \verb\-(x+1)\. +\verb@x@ is defined as \verb@-(x+1)@. \index{inversion} In all three cases, if the argument does not have the proper type, -a \verb\TypeError\ exception is raised. +a \verb@TypeError@ exception is raised. \exindex{TypeError} \section{Binary arithmetic operations} @@ -404,7 +406,7 @@ m_expr: u_expr | m_expr "*" u_expr a_expr: m_expr | aexpr "+" m_expr | aexpr "-" m_expr \end{verbatim} -The \verb\"*"\ (multiplication) operator yields the product of its +The \verb@"*"@ (multiplication) operator yields the product of its arguments. The arguments must either both be numbers, or one argument must be a plain integer and the other must be a sequence. In the former case, the numbers are converted to a common type and then @@ -412,40 +414,40 @@ multiplied together. 
In the latter case, sequence repetition is performed; a negative repetition factor yields an empty sequence. \index{multiplication} -The \verb\"/"\ (division) operator yields the quotient of its +The \verb@"/"@ (division) operator yields the quotient of its arguments. The numeric arguments are first converted to a common type. Plain or long integer division yields an integer of the same type; the result is that of mathematical division with the `floor' function applied to the result. Division by zero raises the -\verb\ZeroDivisionError\ exception. +\verb@ZeroDivisionError@ exception. \exindex{ZeroDivisionError} \index{division} -The \verb\"%"\ (modulo) operator yields the remainder from the +The \verb@"%"@ (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type. A zero right argument raises -the \verb\ZeroDivisionError\ exception. The arguments may be floating -point numbers, e.g. \verb\3.14 % 0.7\ equals \verb\0.34\. The modulo +the \verb@ZeroDivisionError@ exception. The arguments may be floating +point numbers, e.g. \verb@3.14 % 0.7@ equals \verb@0.34@. The modulo operator always yields a result with the same sign as its second operand (or zero); the absolute value of the result is strictly smaller than the second operand. \index{modulo} The integer division and modulo operators are connected by the -following identity: \verb\x == (x/y)*y + (x%y)\. Integer division and -modulo are also connected with the built-in function \verb\divmod()\: -\verb\divmod(x, y) == (x/y, x%y)\. These identities don't hold for +following identity: \verb@x == (x/y)*y + (x%y)@. Integer division and +modulo are also connected with the built-in function \verb@divmod()@: +\verb@divmod(x, y) == (x/y, x%y)@. These identities don't hold for floating point numbers; there a similar identity holds where -\verb\x/y\ is replaced by \verb\floor(x/y)\). +\verb@x/y@ is replaced by \verb@floor(x/y)@. 
-The \verb\"+"\ (addition) operator yields the sum of its arguments. +The \verb@"+"@ (addition) operator yields the sum of its arguments. The arguments must either both be numbers, or both sequences of the same type. In the former case, the numbers are converted to a common type and then added together. In the latter case, the sequences are concatenated. \index{addition} -The \verb\"-"\ (subtraction) operator yields the difference of its +The \verb@"-"@ (subtraction) operator yields the difference of its arguments. The numeric arguments are first converted to a common type. \index{subtraction} @@ -470,7 +472,7 @@ shift by $n$ bits is defined as multiplication with $2^n$; for plain integers there is no overflow check so this drops bits and flip the sign if the result is not less than $2^{31}$ in absolute value. -Negative shift counts raise a \verb\ValueError\ exception. +Negative shift counts raise a \verb@ValueError@ exception. \exindex{ValueError} \section{Binary bit-wise operations} @@ -484,18 +486,18 @@ xor_expr: and_expr | xor_expr "^" and_expr or_expr: xor_expr | or_expr "|" xor_expr \end{verbatim} -The \verb\"&"\ operator yields the bitwise AND of its arguments, which +The \verb@"&"@ operator yields the bitwise AND of its arguments, which must be plain or long integers. The arguments are converted to a common type. \indexii{bit-wise}{and} -The \verb\"^"\ operator yields the bitwise XOR (exclusive OR) of its +The \verb@"^"@ operator yields the bitwise XOR (exclusive OR) of its arguments, which must be plain or long integers. The arguments are converted to a common type. \indexii{bit-wise}{xor} \indexii{exclusive}{or} -The \verb\"|"\ operator yields the bitwise (inclusive) OR of its +The \verb@"|"@ operator yields the bitwise (inclusive) OR of its arguments, which must be plain or long integers. The arguments are converted to a common type. \indexii{bit-wise}{or} @@ -507,7 +509,7 @@ converted to a common type. 
Contrary to C, all comparison operations in Python have the same priority, which is lower than that of any arithmetic, shifting or bitwise operation. Also contrary to C, expressions like -\verb\a < b < c\ have the interpretation that is conventional in +\verb@a < b < c@ have the interpretation that is conventional in mathematics: \index{C} @@ -519,23 +521,23 @@ comp_operator: "<"|">"|"=="|">="|"<="|"<>"|"!="|"is" ["not"]|["not"] "in" Comparisons yield integer values: 1 for true, 0 for false. Comparisons can be chained arbitrarily, e.g. $x < y <= z$ is -equivalent to $x < y$ \verb\and\ $y <= z$, except that $y$ is +equivalent to $x < y$ \verb@and@ $y <= z$, except that $y$ is evaluated only once (but in both cases $z$ is not evaluated at all when $x < y$ is found to be false). \indexii{chaining}{comparisons} \catcode`\_=8 Formally, $e_0 op_1 e_1 op_2 e_2 ...e_{n-1} op_n e_n$ is equivalent to -$e_0 op_1 e_1$ \verb\and\ $e_1 op_2 e_2$ \verb\and\ ... \verb\and\ +$e_0 op_1 e_1$ \verb@and@ $e_1 op_2 e_2$ \verb@and@ ... \verb@and@ $e_{n-1} op_n e_n$, except that each expression is evaluated at most once. Note that $e_0 op_1 e_1 op_2 e_2$ does not imply any kind of comparison between $e_0$ and $e_2$, e.g. $x < y > z$ is perfectly legal. \catcode`\_=12 -The forms \verb\<>\ and \verb\!=\ are equivalent; for consistency with -C, \verb\!=\ is preferred; where \verb\!=\ is mentioned below -\verb\<>\ is also implied. +The forms \verb@<>@ and \verb@!=@ are equivalent; for consistency with +C, \verb@!=@ is preferred; where \verb@!=@ is mentioned below +\verb@<>@ is also implied. The operators {\tt "<", ">", "==", ">=", "<="}, and {\tt "!="} compare the values of two objects. The objects needn't have the same type. @@ -544,8 +546,8 @@ objects of different types {\em always} compare unequal, and are ordered consistently but arbitrarily. 
(This unusual definition of comparison is done to simplify the -definition of operations like sorting and the \verb\in\ and \verb\not -in\ operators.) +definition of operations like sorting and the \verb@in@ and +\verb@not in@ operators.) Comparison of objects of the same type depends on the type: @@ -556,7 +558,7 @@ Numbers are compared arithmetically. \item Strings are compared lexicographically using the numeric equivalents -(the result of the built-in function \verb\ord\) of their characters. +(the result of the built-in function \verb@ord@) of their characters. \item Tuples and lists are compared lexicographically using comparison of @@ -579,11 +581,11 @@ execution of a program. \end{itemize} -The operators \verb\in\ and \verb\not in\ test for sequence -membership: if $y$ is a sequence, $x ~\verb\in\~ y$ is true if and +The operators \verb@in@ and \verb@not in@ test for sequence +membership: if $y$ is a sequence, $x ~\verb@in@~ y$ is true if and only if there exists an index $i$ such that $x = y[i]$. -$x ~\verb\not in\~ y$ yields the inverse truth value. The exception -\verb\TypeError\ is raised when $y$ is not a sequence, or when $y$ is +$x ~\verb@not in@~ y$ yields the inverse truth value. The exception +\verb@TypeError@ is raised when $y$ is not a sequence, or when $y$ is a string and $x$ is not a string of length one.% \footnote{The latter restriction is sometimes a nuisance.} \opindex{in} @@ -591,9 +593,9 @@ a string and $x$ is not a string of length one.% \indexii{membership}{test} \obindex{sequence} -The operators \verb\is\ and \verb\is not\ test for object identity: -$x ~\verb\is\~ y$ is true if and only if $x$ and $y$ are the same -object. $x ~\verb\is not\~ y$ yields the inverse truth value. +The operators \verb@is@ and \verb@is not@ test for object identity: +$x ~\verb@is@~ y$ is true if and only if $x$ and $y$ are the same +object. $x ~\verb@is not@~ y$ yields the inverse truth value. 
\opindex{is} \opindex{is not} \indexii{identity}{test} @@ -613,38 +615,39 @@ lambda_form: "lambda" [parameter_list]: condition In the context of Boolean operations, and also when conditions are used by control flow statements, the following values are interpreted -as false: \verb\None\, numeric zero of all types, empty sequences +as false: \verb@None@, numeric zero of all types, empty sequences (strings, tuples and lists), and empty mappings (dictionaries). All other values are interpreted as true. -The operator \verb\not\ yields 1 if its argument is false, 0 otherwise. +The operator \verb@not@ yields 1 if its argument is false, 0 otherwise. \opindex{not} -The condition $x ~\verb\and\~ y$ first evaluates $x$; if $x$ is false, +The condition $x ~\verb@and@~ y$ first evaluates $x$; if $x$ is false, its value is returned; otherwise, $y$ is evaluated and the resulting value is returned. \opindex{and} -The condition $x ~\verb\or\~ y$ first evaluates $x$; if $x$ is true, +The condition $x ~\verb@or@~ y$ first evaluates $x$; if $x$ is true, its value is returned; otherwise, $y$ is evaluated and the resulting value is returned. \opindex{or} -(Note that \verb\and\ and \verb\or\ do not restrict the value and type +(Note that \verb@and@ and \verb@or@ do not restrict the value and type they return to 0 and 1, but rather return the last evaluated argument. -This is sometimes useful, e.g. if \verb\s\ is a string that should be +This is sometimes useful, e.g. if \verb@s@ is a string that should be replaced by a default value if it is empty, the expression -\verb\s or 'foo'\ yields the desired value. Because \verb\not\ has to +\verb@s or 'foo'@ yields the desired value. Because \verb@not@ has to invent a value anyway, it does not bother to return a value of the -same type as its argument, so e.g. \verb\not 'foo'\ yields \verb\0\, -not \verb\''\.) +same type as its argument, so e.g. \verb@not 'foo'@ yields \verb@0@, +not \verb@''@.) 
Lambda forms (lambda expressions) have the same syntactic position as conditions. They are a shorthand to create anonymous functions; the -expression \verb\lambda\ {\em arguments}\verb\:\ {\em condition} +expression {\em {\tt lambda} arguments{\tt :} condition} yields a function object that behaves virtually identical to one -defined with \verb\def\ {\em name}\verb\(\{\em arguments}\verb\) : -return\ {\em condition}. See section \ref{function} for the syntax of +defined with +{\em {\tt def} name {\tt (}arguments{\tt ): return} condition}. +See section \ref{function} for the syntax of parameter lists. Note that functions created with lambda forms cannot contain statements. \label{lambda} @@ -686,4 +689,4 @@ tuple, but rather yields the value of that expression (condition). \indexii{trailing}{comma} (To create an empty tuple, use an empty pair of parentheses: -\verb\()\.) +\verb@()@.) diff --git a/Doc/ref1.tex b/Doc/ref1.tex index b373e36..169c244 100644 --- a/Doc/ref1.tex +++ b/Doc/ref1.tex @@ -43,22 +43,22 @@ name: lc_letter (lc_letter | "_")* lc_letter: "a"..."z" \end{verbatim} -The first line says that a \verb\name\ is an \verb\lc_letter\ followed by -a sequence of zero or more \verb\lc_letter\s and underscores. An -\verb\lc_letter\ in turn is any of the single characters `a' through `z'. +The first line says that a \verb@name@ is an \verb@lc_letter@ followed by +a sequence of zero or more \verb@lc_letter@s and underscores. An +\verb@lc_letter@ in turn is any of the single characters `a' through `z'. (This rule is actually adhered to for the names defined in lexical and grammar rules in this document.) Each rule begins with a name (which is the name defined by the rule) -and a colon. A vertical bar (\verb\|\) is used to separate +and a colon. A vertical bar (\verb@|@) is used to separate alternatives; it is the least binding operator in this notation. 
A -star (\verb\*\) means zero or more repetitions of the preceding item; -likewise, a plus (\verb\+\) means one or more repetitions, and a -phrase enclosed in square brackets (\verb\[ ]\) means zero or one +star (\verb@*@) means zero or more repetitions of the preceding item; +likewise, a plus (\verb@+@) means one or more repetitions, and a +phrase enclosed in square brackets (\verb@[ ]@) means zero or one occurrences (in other words, the enclosed phrase is optional). The -\verb\*\ and \verb\+\ operators bind as tightly as possible; +\verb@*@ and \verb@+@ operators bind as tightly as possible; parentheses are used for grouping. Literal strings are enclosed in -double quotes. White space is only meaningful to separate tokens. +quotes. White space is only meaningful to separate tokens. Rules are normally contained on a single line; rules with many alternatives may be formatted alternatively with each line after the first beginning with a vertical bar. @@ -66,7 +66,7 @@ first beginning with a vertical bar. In lexical definitions (as the example above), two more conventions are used: Two literal characters separated by three dots mean a choice of any single character in the given (inclusive) range of ASCII -characters. A phrase between angular brackets (\verb\<...>\) gives an +characters. A phrase between angular brackets (\verb@<...>@) gives an informal description of the symbol defined; e.g. this could be used to describe the notion of `control character' if needed. \index{lexical definitions} diff --git a/Doc/ref2.tex b/Doc/ref2.tex index 250bd2e..67f22f8 100644 --- a/Doc/ref2.tex +++ b/Doc/ref2.tex @@ -19,7 +19,7 @@ syntax (e.g. between statements in compound statements). \subsection{Comments} -A comment starts with a hash character (\verb\#\) that is not part of +A comment starts with a hash character (\verb@#@) that is not part of a string literal, and ends at the end of the physical line. A comment always signifies the end of the logical line. 
Comments are ignored by the syntax. @@ -28,7 +28,7 @@ the syntax. \index{physical line} \index{hash character} -\subsection{Line joining} +\subsection{Explicit line joining} Two or more physical lines may be joined into logical lines using backslash characters (\verb/\/), as follows: when a physical line ends @@ -37,15 +37,37 @@ joined with the following forming a single logical line, deleting the backslash and the following end-of-line character. For example: \index{physical line} \index{line joining} +\index{line continuation} \index{backslash character} % \begin{verbatim} -month_names = ['Januari', 'Februari', 'Maart', \ - 'April', 'Mei', 'Juni', \ - 'Juli', 'Augustus', 'September', \ - 'Oktober', 'November', 'December'] +if 1900 < year < 2100 and 1 <= month <= 12 \ + and 1 <= day <= 31 and 0 <= hour < 24 \ + and 0 <= minute < 60 and 0 <= second < 60: # Looks like a valid date + return 1 \end{verbatim} +A line ending in a backslash cannot carry a comment; a backslash does +not continue a comment (but it does continue a string literal, see +below). + +\subsection{Implicit line joining} + +Expressions in parentheses, square brackets or curly braces can be +split over more than one physical line without using backslashes. +For example: + +\begin{verbatim} +month_names = ['Januari', 'Februari', 'Maart', # These are the + 'April', 'Mei', 'Juni', # Dutch names + 'Juli', 'Augustus', 'September', # for the months + 'Oktober', 'November', 'December'] # of the year +\end{verbatim} + +Implicitly continued lines can carry comments. The indentation of the +continuation lines is not important. Blank continuation lines are +allowed. 
+ \subsection{Blank lines} A logical line that contains only spaces, tabs, and possibly a @@ -123,7 +145,7 @@ The following example shows various indentation errors: (Actually, the first three errors are detected by the parser; only the last error is found by the lexical analyzer --- the indentation of -\verb\return r\ does not match a level popped off the stack.) +\verb@return r@ does not match a level popped off the stack.) \section{Other tokens} @@ -159,26 +181,15 @@ identifiers. They must be spelled exactly as written here: \index{reserved word} \begin{verbatim} -and del for in print -break elif from is raise -class else global not return -continue except if or try -def finally import pass while +access del from lambda return +and elif global not try +break else if or while +class except import pass +continue finally in print +def for is raise \end{verbatim} -% # This Python program sorts and formats the above table -% import string -% l = [] -% try: -% while 1: -% l = l + string.split(raw_input()) -% except EOFError: -% pass -% l.sort() -% for i in range((len(l)+4)/5): -% for j in range(i, len(l), 5): -% print string.ljust(l[j], 10), -% print +% When adding keywords, pipe it through keywords.py for reformatting \section{Literals} \label{literals} @@ -192,17 +203,24 @@ String literals are described by the following lexical definitions: \index{string literal} \begin{verbatim} -stringliteral: "'" stringitem* "'" -stringitem: stringchar | escapeseq -stringchar: -escapeseq: "'" +stringliteral: shortstring | longstring +shortstring: "'" shortstringitem* "'" | '"' shortstringitem* '"' +longstring: "'''" longstringitem* "'''" | '"""' longstringitem* '"""' +shortstringitem: shortstringchar | escapeseq +shortstringchar: +longstringchar: +escapeseq: "\" \end{verbatim} \index{ASCII} -String literals cannot span physical line boundaries. Escape -sequences in strings are actually interpreted according to rules -similar to those used by Standard C. 
The recognized escape sequences -are: +In ``long strings'' (strings surrounded by sets of three quotes), +unescaped newlines and quotes are allowed (and are retained), except +that three unescaped quotes in a row terminate the string. (A +``quote'' is the character used to open the string, i.e. either +\verb/'/ or \verb/"/.) + +Escape sequences in strings are interpreted according to rules similar +to those used by Standard C. The recognized escape sequences are: \index{physical line} \index{escape sequence} \index{Standard C} @@ -211,8 +229,10 @@ are: \begin{center} \begin{tabular}{|l|l|} \hline +\verb/\/{\em newline} & Ignored \\ \verb/\\/ & Backslash (\verb/\/) \\ \verb/\'/ & Single quote (\verb/'/) \\ +\verb/\"/ & Double quote (\verb/"/) \\ \verb/\a/ & ASCII Bell (BEL) \\ \verb/\b/ & ASCII Backspace (BS) \\ %\verb/\E/ & ASCII Escape (ESC) \\ @@ -309,8 +329,8 @@ Some examples of floating point literals: \end{verbatim} Note that numeric literals do not include a sign; a phrase like -\verb\-1\ is actually an expression composed of the operator -\verb\-\ and the literal \verb\1\. +\verb@-1@ is actually an expression composed of the operator +\verb@-@ and the literal \verb@1@. \section{Operators} @@ -323,7 +343,7 @@ The following tokens are operators: < == > <= <> != >= \end{verbatim} -The comparison operators \verb\<>\ and \verb\!=\ are alternate +The comparison operators \verb@<>@ and \verb@!=@ are alternate spellings of the same operator. \section{Delimiters} diff --git a/Doc/ref3.tex b/Doc/ref3.tex index 41ce234..5f9a0d8 100644 --- a/Doc/ref3.tex +++ b/Doc/ref3.tex @@ -44,7 +44,7 @@ Some objects contain references to ``external'' resources such as open files or windows. It is understood that these resources are freed when the object is garbage-collected, but since garbage collection is not guaranteed to happen, such objects also provide an explicit way to -release the external resource, usually a \verb\close\ method. 
+release the external resource, usually a \verb@close@ method. Programs are strongly recommended to always explicitly close such objects. @@ -69,8 +69,8 @@ objects this is not allowed. E.g. after a = 1; b = 1; c = []; d = [] \end{verbatim} -\verb\a\ and \verb\b\ may or may not refer to the same object with the -value one, depending on the implementation, but \verb\c\ and \verb\d\ +\verb@a@ and \verb@b@ may or may not refer to the same object with the +value one, depending on the implementation, but \verb@c@ and \verb@d@ are guaranteed to refer to two different, unique, newly created empty lists. @@ -90,9 +90,9 @@ Some of the type descriptions below contain a paragraph listing `special attributes'. These are attributes that provide access to the implementation and are not intended for general use. Their definition may change in the future. There are also some `generic' special -attributes, not listed with the individual objects: \verb\__methods__\ +attributes, not listed with the individual objects: \verb@__methods__@ is a list of the method names of a built-in object, if it has any; -\verb\__members__\ is a list of the data attribute names of a built-in +\verb@__members__@ is a list of the data attribute names of a built-in object, if it has any. \index{attribute} \indexii{special}{attribute} @@ -104,7 +104,7 @@ object, if it has any. \item[None] This type has a single value. There is a single object with this value. -This object is accessed through the built-in name \verb\None\. +This object is accessed through the built-in name \verb@None@. It is returned from functions that don't explicitly return an object. \ttindex{None} \obindex{None@{\tt None}} @@ -134,7 +134,7 @@ These represent numbers in the range $-2^{31}$ through $2^{31}-1$. (The range may be larger on machines with a larger natural word size, but not smaller.) When the result of an operation falls outside this range, the -exception \verb\OverflowError\ is raised. 
+exception \verb@OverflowError@ is raised. For the purpose of shift and mask operations, integers are assumed to have a binary, 2's complement notation using 32 or more bits, and hiding no bits from the user (i.e., all $2^{32}$ different bit @@ -172,17 +172,17 @@ C implementation for the accepted range and handling of overflow. \item[Sequences] These represent finite ordered sets indexed by natural numbers. -The built-in function \verb\len()\ returns the number of elements +The built-in function \verb@len()@ returns the number of elements of a sequence. When this number is $n$, the index set contains -the numbers $0, 1, \ldots, n-1$. Element \verb\i\ of sequence -\verb\a\ is selected by \verb\a[i]\. +the numbers $0, 1, \ldots, n-1$. Element \verb@i@ of sequence +\verb@a@ is selected by \verb@a[i]@. \obindex{seqence} \bifuncindex{len} \index{index operation} \index{item selection} \index{subscription} -Sequences also support slicing: \verb\a[i:j]\ selects all elements +Sequences also support slicing: \verb@a[i:j]@ selects all elements with index $k$ such that $i <= k < j$. When used as an expression, a slice is a sequence of the same type --- this implies that the index set is renumbered so that it starts at 0 again. @@ -209,7 +209,7 @@ The following types are immutable sequences: The elements of a string are characters. There is no separate character type; a character is represented by a string of one element. Characters represent (at least) 8-bit bytes. The built-in -functions \verb\chr()\ and \verb\ord()\ convert between characters +functions \verb@chr()@ and \verb@ord()@ convert between characters and nonnegative integers representing the byte values. Bytes with the values 0-127 represent the corresponding ASCII values. The string data type is also used to represent arrays of bytes, e.g. @@ -223,7 +223,7 @@ to hold data read from a file. 
(On systems whose native character set is not ASCII, strings may use EBCDIC in their internal representation, provided the functions -\verb\chr()\ and \verb\ord()\ implement a mapping between ASCII and +\verb@chr()@ and \verb@ord()@ implement a mapping between ASCII and EBCDIC, and string comparison preserves the ASCII order. Or perhaps someone can propose a better rule?) \index{ASCII} @@ -250,7 +250,7 @@ parentheses. \item[Mutable sequences] Mutable sequences can be changed after they are created. The subscription and slicing notations can be used as the target of -assignment and \verb\del\ (delete) statements. +assignment and \verb@del@ (delete) statements. \obindex{mutable sequece} \obindex{mutable} \indexii{assignment}{statement} @@ -276,10 +276,10 @@ or 1.) \item[Mapping types] These represent finite sets of objects indexed by arbitrary index sets. -The subscript notation \verb\a[k]\ selects the element indexed -by \verb\k\ from the mapping \verb\a\; this can be used in -expressions and as the target of assignments or \verb\del\ statements. -The built-in function \verb\len()\ returns the number of elements +The subscript notation \verb@a[k]@ selects the element indexed +by \verb@k@ from the mapping \verb@a@; this can be used in +expressions and as the target of assignments or \verb@del@ statements. +The built-in function \verb@len()@ returns the number of elements in a mapping. \bifuncindex{len} \index{subscription} @@ -299,7 +299,7 @@ Numeric types used for keys obey the normal rules for numeric comparison: if two numbers compare equal (e.g. 1 and 1.0) then they can be used interchangeably to index the same dictionary entry. -Dictionaries are mutable; they are created by the \verb\{...}\ +Dictionaries are mutable; they are created by the \verb@{...}@ notation (see section \ref{dict}). \obindex{dictionary} \obindex{mutable} @@ -308,7 +308,7 @@ notation (see section \ref{dict}). 
\item[Callable types] These are the types to which the function call (invocation) operation, -written as \verb\function(argument, argument, ...)\, can be applied: +written as \verb@function(argument, argument, ...)@, can be applied: \indexii{function}{call} \index{invocation} \indexii{function}{argument} @@ -325,8 +325,8 @@ parameter list. \obindex{function} \obindex{user-defined function} -Special read-only attributes: \verb\func_code\ is the code object -representing the compiled function body, and \verb\func_globals\ is (a +Special read-only attributes: \verb@func_code@ is the code object +representing the compiled function body, and \verb@func_globals@ is (a reference to) the dictionary that holds the function's global variables --- it implements the global name space of the module in which the function was defined. @@ -346,14 +346,14 @@ shifted one to the right. \indexii{user-defined}{method} \index{object closure} -Special read-only attributes: \verb\im_self\ is the class instance -object, \verb\im_func\ is the function object. +Special read-only attributes: \verb@im_self@ is the class instance +object, \verb@im_func@ is the function object. \ttindex{im_func} \ttindex{im_self} \item[Built-in functions] A built-in function object is a wrapper around a C function. Examples -of built-in functions are \verb\len\ and \verb\math.sin\. There +of built-in functions are \verb@len@ and \verb@math.sin@. There are no special attributes. The number and type of the arguments are determined by the C function. \obindex{built-in function} @@ -363,18 +363,20 @@ determined by the C function. \item[Built-in methods] This is really a different disguise of a built-in function, this time containing an object passed to the C function as an implicit extra -argument. An example of a built-in method is \verb\list.append\ if -\verb\list\ is a list object. +argument. An example of a built-in method is \verb@list.append@ if +\verb@list@ is a list object. 
\obindex{built-in method} \obindex{method} \indexii{built-in}{method} \item[Classes] Class objects are described below. When a class object is called as a -parameterless function, a new class instance (also described below) is -created and returned. The class's initialization function is not -called --- this is the responsibility of the caller. It is illegal to -call a class object with one or more arguments. +function, a new class instance (also described below) is created and +returned. This implies a call to the class's \verb@__init__@ method +if it has one. Any arguments are passed on to the \verb@__init__@ +method -- if there is no \verb@__init__@ method, the class must be called +without arguments. +\ttindex{__init__} \obindex{class} \obindex{class instance} \obindex{instance} @@ -383,10 +385,10 @@ call a class object with one or more arguments. \end{description} \item[Modules] -Modules are imported by the \verb\import\ statement (see section +Modules are imported by the \verb@import@ statement (see section \ref{import}). A module object is a container for a module's name space, which is a dictionary (the same dictionary as referenced by the -\verb\func_globals\ attribute of functions defined in the module). +\verb@func_globals@ attribute of functions defined in the module). Module attribute references are translated to lookups in this dictionary. A module object does not contain the code object used to initialize the module (since it isn't needed once the initialization @@ -396,8 +398,8 @@ is done). Attribute assignment update the module's name space dictionary. -Special read-only attributes: \verb\__dict__\ yields the module's name -space as a dictionary object; \verb\__name__\ yields the module's name +Special read-only attributes: \verb@__dict__@ yields the module's name +space as a dictionary object; \verb@__name__@ yields the module's name as a string object. 
\ttindex{__dict__} \ttindex{__name__} @@ -423,12 +425,12 @@ Class attribute assignments update the class's dictionary, never the dictionary of a base class. \indexiii{class}{attribute}{assignment} -A class can be called as a parameterless function to yield a class -instance (see above). +A class can be called as a function to yield a class instance (see +above). \indexii{class object}{call} -Special read-only attributes: \verb\__dict__\ yields the dictionary -containing the class's name space; \verb\__bases__\ yields a tuple +Special read-only attributes: \verb@__dict__@ yields the dictionary +containing the class's name space; \verb@__bases__@ yields a tuple (possibly empty or a singleton) containing the base classes, in the order of their occurrence in the base class list. \ttindex{__dict__} @@ -436,7 +438,7 @@ order of their occurrence in the base class list. \item[Class instances] A class instance is created by calling a class object as a -parameterless function. A class instance has a dictionary in which +function. A class instance has a dictionary in which attribute references are searched. When an attribute is not found there, and the instance's class has an attribute by that name, and that class attribute is a user-defined function (and in no other @@ -457,17 +459,17 @@ section \ref{specialnames}. \obindex{sequence} \obindex{mapping} -Special read-only attributes: \verb\__dict__\ yields the attribute -dictionary; \verb\__class__\ yields the instance's class. +Special read-only attributes: \verb@__dict__@ yields the attribute +dictionary; \verb@__class__@ yields the instance's class. \ttindex{__dict__} \ttindex{__class__} \item[Files] A file object represents an open file. (It is a wrapper around a C {\tt stdio} file pointer.) File objects are created by the -\verb\open()\ built-in function, and also by \verb\posix.popen()\ and -the \verb\makefile\ method of socket objects. 
\verb\sys.stdin\, -\verb\sys.stdout\ and \verb\sys.stderr\ are file objects corresponding +\verb@open()@ built-in function, and also by \verb@posix.popen()@ and +the \verb@makefile@ method of socket objects. \verb@sys.stdin@, +\verb@sys.stdout@ and \verb@sys.stderr@ are file objects corresponding the the interpreter's standard input, output and error streams. See the Python Library Reference for methods of file objects and other details. @@ -500,12 +502,12 @@ was defined) which a code object contains no context. There is no way to execute a bare code object. \obindex{code} -Special read-only attributes: \verb\co_code\ is a string representing -the sequence of instructions; \verb\co_consts\ is a list of literals -used by the code; \verb\co_names\ is a list of names (strings) used by -the code; \verb\co_filename\ is the filename from which the code was +Special read-only attributes: \verb@co_code@ is a string representing +the sequence of instructions; \verb@co_consts@ is a list of literals +used by the code; \verb@co_names@ is a list of names (strings) used by +the code; \verb@co_filename@ is the filename from which the code was compiled. (To find out the line numbers, you would have to decode the -instructions; the standard library module \verb\dis\ contains an +instructions; the standard library module \verb@dis@ contains an example of how to do this.) \ttindex{co_code} \ttindex{co_consts} @@ -517,12 +519,12 @@ Frame objects represent execution frames. They may occur in traceback objects (see below). 
\obindex{frame} -Special read-only attributes: \verb\f_back\ is to the previous -stack frame (towards the caller), or \verb\None\ if this is the bottom -stack frame; \verb\f_code\ is the code object being executed in this -frame; \verb\f_globals\ is the dictionary used to look up global -variables; \verb\f_locals\ is used for local variables; -\verb\f_lineno\ gives the line number and \verb\f_lasti\ gives the +Special read-only attributes: \verb@f_back@ is to the previous +stack frame (towards the caller), or \verb@None@ if this is the bottom +stack frame; \verb@f_code@ is the code object being executed in this +frame; \verb@f_globals@ is the dictionary used to look up global +variables; \verb@f_locals@ is used for local variables; +\verb@f_lineno@ gives the line number and \verb@f_lasti@ gives the precise instruction (this is an index into the instruction string of the code object). \ttindex{f_back} @@ -539,11 +541,11 @@ for an exception handler unwinds the execution stack, at each unwound level a traceback object is inserted in front of the current traceback. When an exception handler is entered (see also section \ref{try}), the stack trace is -made available to the program as \verb\sys.exc_traceback\. When the +made available to the program as \verb@sys.exc_traceback@. When the program contains no suitable handler, the stack trace is written (nicely formatted) to the standard error stream; if the interpreter is interactive, it is also made available to the user as -\verb\sys.last_traceback\. +\verb@sys.last_traceback@. 
\obindex{traceback} \indexii{stack}{trace} \indexii{exception}{handler} @@ -553,15 +555,15 @@ interactive, it is also made available to the user as \ttindex{sys.exc_traceback} \ttindex{sys.last_traceback} -Special read-only attributes: \verb\tb_next\ is the next level in the +Special read-only attributes: \verb@tb_next@ is the next level in the stack trace (towards the frame where the exception occurred), or -\verb\None\ if there is no next level; \verb\tb_frame\ points to the -execution frame of the current level; \verb\tb_lineno\ gives the line -number where the exception occurred; \verb\tb_lasti\ indicates the +\verb@None@ if there is no next level; \verb@tb_frame@ points to the +execution frame of the current level; \verb@tb_lineno@ gives the line +number where the exception occurred; \verb@tb_lasti@ indicates the precise instruction. The line number and last instruction in the traceback may differ from the line number of its frame object if the -exception occurred in a \verb\try\ statement with no matching -\verb\except\ clause or with a \verb\finally\ clause. +exception occurred in a \verb@try@ statement with no matching +\verb@except@ clause or with a \verb@finally@ clause. \ttindex{tb_next} \ttindex{tb_frame} \ttindex{tb_lineno} @@ -578,17 +580,19 @@ exception occurred in a \verb\try\ statement with no matching A class can implement certain operations that are invoked by special syntax (such as subscription or arithmetic operations) by defining methods with special names. For instance, if a class defines a -method named \verb\__getitem__\, and \verb\x\ is an instance of this -class, then \verb\x[i]\ is equivalent to \verb\x.__getitem__(i)\. -(The reverse is not true --- if \verb\x\ is a list object, -\verb\x.__getitem__(i)\ is not equivalent to \verb\x[i]\.) +method named \verb@__getitem__@, and \verb@x@ is an instance of this +class, then \verb@x[i]@ is equivalent to \verb@x.__getitem__(i)@. 
+(The reverse is not true --- if \verb@x@ is a list object, +\verb@x.__getitem__(i)@ is not equivalent to \verb@x[i]@.) -Except for \verb\__repr__\, \verb\__str__\ and \verb\__cmp__\, +Except for \verb@__repr__@, \verb@__str__@ and \verb@__cmp__@, attempts to execute an operation raise an exception when no appropriate method is defined. -For \verb\__repr__\ and \verb\__cmp__\, the traditional -interpretations are used in this case. -For \verb\__str__\, the \verb\__repr__\ method is used. +For \verb@__repr__@, the default is to return a string describing the +object's class and address. +For \verb@__cmp__@, the default is to compare instances based on their +address. +For \verb@__str__@, the default is to use \verb@__repr__@. \subsection{Special methods for any type} @@ -614,17 +618,17 @@ reference is deleted. Also note that it is not guaranteed that the interpreter exits. \item[\tt __repr__(self)] -Called by the \verb\repr()\ built-in function and by conversions +Called by the \verb@repr()@ built-in function and by conversions (reverse quotes) to compute the string representation of an object. \item[\tt __str__(self)] -Called by the \verb\str()\ built-in function and by the \verb\print\ +Called by the \verb@str()@ built-in function and by the \verb@print@ statement compute the string representation of an object. \item[\tt __cmp__(self, other)] Called by all comparison operations. Should return -1 if -\verb\self < other\, 0 if \verb\self == other\, +1 if -\verb\self > other\. If no \code{__cmp__} operation is defined, class +\verb@self < other@, 0 if \verb@self == other@, +1 if +\verb@self > other@. If no \code{__cmp__} operation is defined, class instances are compared by object identity (``address''). (Implementation note: due to limitations in the interpreter, exceptions raised by comparisons are ignored, and the objects will be @@ -654,23 +658,23 @@ key's hash value is a constant. 
\begin{description} \item[\tt __len__(self)] -Called to implement the built-in function \verb\len()\. Should return -the length of the object, an integer \verb\>=\ 0. Also, an object -whose \verb\__len__()\ method returns 0 is considered to be false in a +Called to implement the built-in function \verb@len()@. Should return +the length of the object, an integer \verb@>=@ 0. Also, an object +whose \verb@__len__()@ method returns 0 is considered to be false in a Boolean context. \item[\tt __getitem__(self, key)] -Called to implement evaluation of \verb\self[key]\. Note that the +Called to implement evaluation of \verb@self[key]@. Note that the special interpretation of negative keys (if the class wishes to -emulate a sequence type) is up to the \verb\__getitem__\ method. +emulate a sequence type) is up to the \verb@__getitem__@ method. \item[\tt __setitem__(self, key, value)] -Called to implement assignment to \verb\self[key]\. Same note as for -\verb\__getitem__\. +Called to implement assignment to \verb@self[key]@. Same note as for +\verb@__getitem__@. \item[\tt __delitem__(self, key)] -Called to implement deletion of \verb\self[key]\. Same note as for -\verb\__getitem__\. +Called to implement deletion of \verb@self[key]@. Same note as for +\verb@__getitem__@. \end{description} @@ -680,19 +684,19 @@ Called to implement deletion of \verb\self[key]\. Same note as for \begin{description} \item[\tt __getslice__(self, i, j)] -Called to implement evaluation of \verb\self[i:j]\. Note that missing -\verb\i\ or \verb\j\ are replaced by 0 or \verb\len(self)\, -respectively, and \verb\len(self)\ has been added (once) to originally -negative \verb\i\ or \verb\j\ by the time this function is called -(unlike for \verb\__getitem__\). +Called to implement evaluation of \verb@self[i:j]@. 
Note that missing +\verb@i@ or \verb@j@ are replaced by 0 or \verb@len(self)@, +respectively, and \verb@len(self)@ has been added (once) to originally +negative \verb@i@ or \verb@j@ by the time this function is called +(unlike for \verb@__getitem__@). \item[\tt __setslice__(self, i, j, sequence)] -Called to implement assignment to \verb\self[i:j]\. Same notes as for -\verb\__getslice__\. +Called to implement assignment to \verb@self[i:j]@. Same notes as for +\verb@__getslice__@. \item[\tt __delslice__(self, i, j)] -Called to implement deletion of \verb\self[i:j]\. Same notes as for -\verb\__getslice__\. +Called to implement deletion of \verb@self[i:j]@. Same notes as for +\verb@__getslice__@. \end{description} @@ -713,20 +717,20 @@ Called to implement deletion of \verb\self[i:j]\. Same notes as for \item[\tt __and__(self, other)]\itemjoin \item[\tt __xor__(self, other)]\itemjoin \item[\tt __or__(self, other)]\itembreak -Called to implement the binary arithmetic operations (\verb\+\, -\verb\-\, \verb\*\, \verb\/\, \verb\%\, \verb\divmod()\, \verb\pow()\, -\verb\<<\, \verb\>>\, \verb\&\, \verb\^\, \verb\|\). +Called to implement the binary arithmetic operations (\verb@+@, +\verb@-@, \verb@*@, \verb@/@, \verb@%@, \verb@divmod()@, \verb@pow()@, +\verb@<<@, \verb@>>@, \verb@&@, \verb@^@, \verb@|@). \item[\tt __neg__(self)]\itemjoin \item[\tt __pos__(self)]\itemjoin \item[\tt __abs__(self)]\itemjoin \item[\tt __invert__(self)]\itembreak -Called to implement the unary arithmetic operations (\verb\-\, \verb\+\, -\verb\abs()\ and \verb\~\). +Called to implement the unary arithmetic operations (\verb@-@, \verb@+@, +\verb@abs()@ and \verb@~@). \item[\tt __nonzero__(self)] Called to implement boolean testing; should return 0 or 1. An -alternative name for this method is \verb\__len__\. +alternative name for this method is \verb@__len__@. \item[\tt __coerce__(self, other)] Called to implement ``mixed-mode'' numeric arithmetic. 
Should either @@ -737,11 +741,11 @@ interpreter will also ask the other object to attempt a coercion (but sometimes, if the implementation of the other type cannot be changed, it is useful to do the conversion to the other type here). -Note that this method is not called to coerce the arguments to \verb\+\ -and \verb\*\, because these are also used to implement sequence +Note that this method is not called to coerce the arguments to \verb@+@ +and \verb@*@, because these are also used to implement sequence concatenation and repetition, respectively. Also note that, for the -same reason, in \verb\n*x\, where \verb\n\ is a built-in number and -\verb\x\ is an instance, a call to \verb\x.__mul__(n)\ is made.% +same reason, in \verb@n*x@, where \verb@n@ is a built-in number and +\verb@x@ is an instance, a call to \verb@x.__mul__(n)@ is made.% \footnote{The interpreter should really distinguish between user-defined classes implementing sequences, mappings or numbers, but currently it doesn't --- hence this strange exception.} @@ -749,12 +753,12 @@ currently it doesn't --- hence this strange exception.} \item[\tt __int__(self)]\itemjoin \item[\tt __long__(self)]\itemjoin \item[\tt __float__(self)]\itembreak -Called to implement the built-in functions \verb\int()\, \verb\long()\ -and \verb\float()\. Should return a value of the appropriate type. +Called to implement the built-in functions \verb@int()@, \verb@long()@ +and \verb@float()@. Should return a value of the appropriate type. \item[\tt __oct__(self)]\itemjoin \item[\tt __hex__(self)]\itembreak -Called to implement the built-in functions \verb\oct()\ and -\verb\hex()\. Should return a string value. +Called to implement the built-in functions \verb@oct()@ and +\verb@hex()@. Should return a string value. \end{description} diff --git a/Doc/ref4.tex b/Doc/ref4.tex index 62db120..c14fada 100644 --- a/Doc/ref4.tex +++ b/Doc/ref4.tex @@ -20,9 +20,9 @@ The following are code blocks: A module is a code block. 
A function body is a code block. A class definition is a code block. Each command typed interactively is a separate code block; a script file is a code block. The string argument passed to the built-in function -\verb\eval\ and to the \verb\exec\ statement are code blocks. +\verb@eval@ and to the \verb@exec@ statement are code blocks. And finally, the -expression read and evaluated by the built-in function \verb\input\ is +expression read and evaluated by the built-in function \verb@input@ is a code block. A code block is executed in an execution frame. An {\em execution @@ -46,7 +46,7 @@ Name spaces are functionally equivalent to dictionaries. The {\em local name space} of an execution frame determines the default place where names are defined and searched. The {\em global name -space} determines the place where names listed in \verb\global\ +space} determines the place where names listed in \verb@global@ statements are defined and searched, and where names that are not explicitly bound in the current code block are searched. \indexii{local}{name space} @@ -55,25 +55,35 @@ explicitly bound in the current code block are searched. Whether a name is local or global in a code block is determined by static inspection of the source text for the code block: in the -absence of \verb\global\ statements, a name that is bound anywhere in +absence of \verb@global@ statements, a name that is bound anywhere in the code block is local in the entire code block; all other names are -considered global. The \verb\global\ statement forces global +considered global. The \verb@global@ statement forces global interpretation of selected names throughout the code block. 
The -following constructs bind names: formal parameters, \verb\import\ +following constructs bind names: formal parameters, \verb@import@ statements, class and function definitions (these bind the class or function name), and targets that are identifiers if occurring in an -assignment, \verb\for\ loop header, or \verb\except\ clause header. -(A target occurring in a \verb\del\ statement does not bind a name.) +assignment, \verb@for@ loop header, or \verb@except@ clause header. + +A target occurring in a \verb@del@ statement is also considered bound +for this purpose (though the actual semantics are to ``unbind'' the +name). When a global name is not found in the global name space, it is searched in the list of ``built-in'' names (which is actually the -global name space of the module \verb\__builtin__\). When a name is not -found at all, the \verb\NameError\ exception is raised. +global name space of the module \verb@__builtin__@). When a name is not +found at all, the \verb@NameError@ exception is raised.% +\footnote{If the code block contains an \verb@exec@ statement or the +construct \verb@from ... import *@, the semantics of names not +explicitly mentioned in a \verb@global@ statement change subtly: name +lookup first searches the local name space, then the global one, then +the built-in one.} The following table lists the meaning of the local and global name space for various types of code blocks. The name space for a particular module is automatically created when the module is first -referenced. +referenced. Note that in almost all cases, the global name space is +the name space of the containing module -- scopes in Python do not +nest! \begin{center} \begin{tabular}{|l|l|l|l|} @@ -81,15 +91,18 @@ referenced. Code block type & Global name space & Local name space & Notes \\ \hline Module & n.s. for this module & same as global & \\ -Script & n.s. for \verb\__main__\ & same as global & \\ -Interactive command & n.s. 
for \verb\__main__\ & same as global & \\ +Script & n.s. for \verb@__main__@ & same as global & \\ +Interactive command & n.s. for \verb@__main__@ & same as global & \\ Class definition & global n.s. of containing block & new n.s. & \\ Function body & global n.s. of containing block & new n.s. & \\ -String passed to \verb\exec\ or \verb\eval\ +String passed to \verb@exec@ statement + & global n.s. of containing block + & local n.s. of containing block & (1) \\ +String passed to \verb@eval()@ & global n.s. of caller & local n.s. of caller & (1) \\ -File read by \verb\execfile\ +File read by \verb@execfile()@ & global n.s. of caller & local n.s. of caller & (1) \\ -Expression read by \verb\input\ +Expression read by \verb@input@ & global n.s. of caller & local n.s. of caller & \\ \hline \end{tabular} @@ -101,7 +114,7 @@ Notes: \item[n.s.] means {\em name space} -\item[(1)] The global and local name space for these functions can be +\item[(1)] The global and local name space for these can be overridden with optional extra arguments. \end{description} @@ -123,8 +136,8 @@ where the error occurred. The Python interpreter raises an exception when it detects an run-time error (such as division by zero). A Python program can also -explicitly raise an exception with the \verb\raise\ statement. -Exception handlers are specified with the \verb\try...except\ +explicitly raise an exception with the \verb@raise@ statement. +Exception handlers are specified with the \verb@try...except@ statement. Python uses the ``termination'' model of error handling: an exception @@ -139,10 +152,10 @@ execution of the program, or returns to its interactive main loop. Exceptions are identified by string objects. Two different string objects with the same value identify different exceptions. 
-When an exception is raised, an object (maybe \verb\None\) is passed +When an exception is raised, an object (maybe \verb@None@) is passed as the exception's ``parameter''; this object does not affect the selection of an exception handler, but is passed to the selected exception handler as additional information. -See also the description of the \verb\try\ and \verb\raise\ +See also the description of the \verb@try@ and \verb@raise@ statements. diff --git a/Doc/ref5.tex b/Doc/ref5.tex index 55f523f..3e60931 100644 --- a/Doc/ref5.tex +++ b/Doc/ref5.tex @@ -12,14 +12,14 @@ may be used wherever an expression is required by enclosing it in parentheses. The only places where expressions are used in the syntax instead of conditions is in expression statements and on the right-hand side of assignment statements; this catches some nasty bugs -like accidentally writing \verb\x == 1\ instead of \verb\x = 1\. +like accidentally writing \verb@x == 1@ instead of \verb@x = 1@. \indexii{assignment}{statement} The comma plays several roles in Python's syntax. It is usually an operator with a lower precedence than all others, but occasionally serves other purposes as well; e.g. it separates function arguments, is used in list and dictionary constructors, and has special semantics -in \verb\print\ statements. +in \verb@print@ statements. \index{comma} When (one alternative of) a syntax rule has the form @@ -28,8 +28,8 @@ When (one alternative of) a syntax rule has the form name: othername \end{verbatim} -and no semantics are given, the semantics of this form of \verb\name\ -are the same as for \verb\othername\. +and no semantics are given, the semantics of this form of \verb@name@ +are the same as for \verb@othername@. \index{syntax} \section{Arithmetic conversions} @@ -38,7 +38,7 @@ are the same as for \verb\othername\. 
When a description of an arithmetic operator below uses the phrase ``the numeric arguments are converted to a common type'', this both means that if either argument is not a number, a -\verb\TypeError\ exception is raised, and that otherwise +\verb@TypeError@ exception is raised, and that otherwise the following conversions are applied: \exindex{TypeError} \indexii{floating point}{number} @@ -71,11 +71,13 @@ enclosure: parenth_form | list_display | dict_display | string_conversion \index{identifier} An identifier occurring as an atom is a reference to a local, global -or built-in name binding. If a name can be assigned to anywhere in a -code block, and is not mentioned in a \verb\global\ statement in that -code block, it refers to a local name throughout that code block. -Otherwise, it refers to a global name if one exists, else to a -built-in name. +or built-in name binding. If a name is assigned to anywhere in a code +block (even in unreachable code), and is not mentioned in a +\verb@global@ statement in that code block, then it refers to a local +name throughout that code block. When it is not assigned to anywhere +in the block, or when it is assigned to but also explicitly listed in +a \verb@global@ statement, it refers to a global name if one exists, +else to a built-in name (and this binding may dynamically change). \indexii{name}{binding} \index{code block} \stindex{global} @@ -84,7 +86,7 @@ built-in name. When the name is bound to an object, evaluation of the atom yields that object. When a name is not bound, an attempt to evaluate it -raises a \verb\NameError\ exception. +raises a \verb@NameError@ exception. \exindex{NameError} \subsection{Literals} @@ -197,10 +199,10 @@ A string conversion evaluates the contained condition list and converts the resulting object into a string according to rules specific to its type. 
-If the object is a string, a number, \verb\None\, or a tuple, list or +If the object is a string, a number, \verb@None@, or a tuple, list or dictionary containing only objects whose type is one of these, the resulting string is a valid Python expression which can be passed to -the built-in function \verb\eval()\ to yield an expression with the +the built-in function \verb@eval()@ to yield an expression with the same value (or an approximation, if floating point numbers are involved). @@ -234,7 +236,7 @@ attributeref: primary "." identifier The primary must evaluate to an object of a type that supports attribute references, e.g. a module or a list. This object is then asked to produce the attribute whose name is the identifier. If this -attribute is not available, the exception \verb\AttributeError\ is +attribute is not available, the exception \verb@AttributeError@ is raised. Otherwise, the type and value of the object produced is determined by the object. Multiple evaluations of the same attribute reference may yield different objects. @@ -266,7 +268,7 @@ the value in the mapping that corresponds to that key. If it is a sequence, the condition must evaluate to a plain integer. If this value is negative, the length of the sequence is added to it -(so that, e.g. \verb\x[-1]\ selects the last item of \verb\x\.) +(so that, e.g. \verb@x[-1]@ selects the last item of \verb@x@.) The resulting value must be a nonnegative integer smaller than the number of items in the sequence, and the subscription selects the item whose index is that value (counting from zero). @@ -318,7 +320,7 @@ objects, and methods of class instances are callable). If it is a class, the argument list must be empty; otherwise, the arguments are evaluated. -A call always returns some value, possibly \verb\None\, unless it +A call always returns some value, possibly \verb@None@, unless it raises an exception. How this value is computed depends on the type of the callable object. 
If it is: @@ -328,7 +330,7 @@ of the callable object. If it is: executed, passing it the argument list. The first thing the code block will do is bind the formal parameters to the arguments; this is described in section \ref{function}. When the code block executes a -\verb\return\ statement, this specifies the return value of the +\verb@return@ statement, this specifies the return value of the function call. \indexii{function}{call} \indexiii{user-defined}{function}{call} @@ -371,22 +373,22 @@ All unary arithmetic (and bit-wise) operations have the same priority: u_expr: primary | "-" u_expr | "+" u_expr | "~" u_expr \end{verbatim} -The unary \verb\"-"\ (minus) operator yields the negation of its +The unary \verb@"-"@ (minus) operator yields the negation of its numeric argument. \index{negation} \index{minus} -The unary \verb\"+"\ (plus) operator yields its numeric argument +The unary \verb@"+"@ (plus) operator yields its numeric argument unchanged. \index{plus} -The unary \verb\"~"\ (invert) operator yields the bit-wise inversion +The unary \verb@"~"@ (invert) operator yields the bit-wise inversion of its plain or long integer argument. The bit-wise inversion of -\verb\x\ is defined as \verb\-(x+1)\. +\verb@x@ is defined as \verb@-(x+1)@. \index{inversion} In all three cases, if the argument does not have the proper type, -a \verb\TypeError\ exception is raised. +a \verb@TypeError@ exception is raised. \exindex{TypeError} \section{Binary arithmetic operations} @@ -404,7 +406,7 @@ m_expr: u_expr | m_expr "*" u_expr a_expr: m_expr | aexpr "+" m_expr | aexpr "-" m_expr \end{verbatim} -The \verb\"*"\ (multiplication) operator yields the product of its +The \verb@"*"@ (multiplication) operator yields the product of its arguments. The arguments must either both be numbers, or one argument must be a plain integer and the other must be a sequence. In the former case, the numbers are converted to a common type and then @@ -412,40 +414,40 @@ multiplied together. 
In the latter case, sequence repetition is performed; a negative repetition factor yields an empty sequence. \index{multiplication} -The \verb\"/"\ (division) operator yields the quotient of its +The \verb@"/"@ (division) operator yields the quotient of its arguments. The numeric arguments are first converted to a common type. Plain or long integer division yields an integer of the same type; the result is that of mathematical division with the `floor' function applied to the result. Division by zero raises the -\verb\ZeroDivisionError\ exception. +\verb@ZeroDivisionError@ exception. \exindex{ZeroDivisionError} \index{division} -The \verb\"%"\ (modulo) operator yields the remainder from the +The \verb@"%"@ (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type. A zero right argument raises -the \verb\ZeroDivisionError\ exception. The arguments may be floating -point numbers, e.g. \verb\3.14 % 0.7\ equals \verb\0.34\. The modulo +the \verb@ZeroDivisionError@ exception. The arguments may be floating +point numbers, e.g. \verb@3.14 % 0.7@ equals \verb@0.34@. The modulo operator always yields a result with the same sign as its second operand (or zero); the absolute value of the result is strictly smaller than the second operand. \index{modulo} The integer division and modulo operators are connected by the -following identity: \verb\x == (x/y)*y + (x%y)\. Integer division and -modulo are also connected with the built-in function \verb\divmod()\: -\verb\divmod(x, y) == (x/y, x%y)\. These identities don't hold for +following identity: \verb@x == (x/y)*y + (x%y)@. Integer division and +modulo are also connected with the built-in function \verb@divmod()@: +\verb@divmod(x, y) == (x/y, x%y)@. These identities don't hold for floating point numbers; there a similar identity holds where -\verb\x/y\ is replaced by \verb\floor(x/y)\). +\verb@x/y@ is replaced by \verb@floor(x/y)@). 
-The \verb\"+"\ (addition) operator yields the sum of its arguments. +The \verb@"+"@ (addition) operator yields the sum of its arguments. The arguments must either both be numbers, or both sequences of the same type. In the former case, the numbers are converted to a common type and then added together. In the latter case, the sequences are concatenated. \index{addition} -The \verb\"-"\ (subtraction) operator yields the difference of its +The \verb@"-"@ (subtraction) operator yields the difference of its arguments. The numeric arguments are first converted to a common type. \index{subtraction} @@ -470,7 +472,7 @@ shift by $n$ bits is defined as multiplication with $2^n$; for plain integers there is no overflow check so this drops bits and flip the sign if the result is not less than $2^{31}$ in absolute value. -Negative shift counts raise a \verb\ValueError\ exception. +Negative shift counts raise a \verb@ValueError@ exception. \exindex{ValueError} \section{Binary bit-wise operations} @@ -484,18 +486,18 @@ xor_expr: and_expr | xor_expr "^" and_expr or_expr: xor_expr | or_expr "|" xor_expr \end{verbatim} -The \verb\"&"\ operator yields the bitwise AND of its arguments, which +The \verb@"&"@ operator yields the bitwise AND of its arguments, which must be plain or long integers. The arguments are converted to a common type. \indexii{bit-wise}{and} -The \verb\"^"\ operator yields the bitwise XOR (exclusive OR) of its +The \verb@"^"@ operator yields the bitwise XOR (exclusive OR) of its arguments, which must be plain or long integers. The arguments are converted to a common type. \indexii{bit-wise}{xor} \indexii{exclusive}{or} -The \verb\"|"\ operator yields the bitwise (inclusive) OR of its +The \verb@"|"@ operator yields the bitwise (inclusive) OR of its arguments, which must be plain or long integers. The arguments are converted to a common type. \indexii{bit-wise}{or} @@ -507,7 +509,7 @@ converted to a common type. 
Contrary to C, all comparison operations in Python have the same priority, which is lower than that of any arithmetic, shifting or bitwise operation. Also contrary to C, expressions like -\verb\a < b < c\ have the interpretation that is conventional in +\verb@a < b < c@ have the interpretation that is conventional in mathematics: \index{C} @@ -519,23 +521,23 @@ comp_operator: "<"|">"|"=="|">="|"<="|"<>"|"!="|"is" ["not"]|["not"] "in" Comparisons yield integer values: 1 for true, 0 for false. Comparisons can be chained arbitrarily, e.g. $x < y <= z$ is -equivalent to $x < y$ \verb\and\ $y <= z$, except that $y$ is +equivalent to $x < y$ \verb@and@ $y <= z$, except that $y$ is evaluated only once (but in both cases $z$ is not evaluated at all when $x < y$ is found to be false). \indexii{chaining}{comparisons} \catcode`\_=8 Formally, $e_0 op_1 e_1 op_2 e_2 ...e_{n-1} op_n e_n$ is equivalent to -$e_0 op_1 e_1$ \verb\and\ $e_1 op_2 e_2$ \verb\and\ ... \verb\and\ +$e_0 op_1 e_1$ \verb@and@ $e_1 op_2 e_2$ \verb@and@ ... \verb@and@ $e_{n-1} op_n e_n$, except that each expression is evaluated at most once. Note that $e_0 op_1 e_1 op_2 e_2$ does not imply any kind of comparison between $e_0$ and $e_2$, e.g. $x < y > z$ is perfectly legal. \catcode`\_=12 -The forms \verb\<>\ and \verb\!=\ are equivalent; for consistency with -C, \verb\!=\ is preferred; where \verb\!=\ is mentioned below -\verb\<>\ is also implied. +The forms \verb@<>@ and \verb@!=@ are equivalent; for consistency with +C, \verb@!=@ is preferred; where \verb@!=@ is mentioned below +\verb@<>@ is also implied. The operators {\tt "<", ">", "==", ">=", "<="}, and {\tt "!="} compare the values of two objects. The objects needn't have the same type. @@ -544,8 +546,8 @@ objects of different types {\em always} compare unequal, and are ordered consistently but arbitrarily. 
(This unusual definition of comparison is done to simplify the -definition of operations like sorting and the \verb\in\ and \verb\not -in\ operators.) +definition of operations like sorting and the \verb@in@ and +\verb@not in@ operators.) Comparison of objects of the same type depends on the type: @@ -556,7 +558,7 @@ Numbers are compared arithmetically. \item Strings are compared lexicographically using the numeric equivalents -(the result of the built-in function \verb\ord\) of their characters. +(the result of the built-in function \verb@ord@) of their characters. \item Tuples and lists are compared lexicographically using comparison of @@ -579,11 +581,11 @@ execution of a program. \end{itemize} -The operators \verb\in\ and \verb\not in\ test for sequence -membership: if $y$ is a sequence, $x ~\verb\in\~ y$ is true if and +The operators \verb@in@ and \verb@not in@ test for sequence +membership: if $y$ is a sequence, $x ~\verb@in@~ y$ is true if and only if there exists an index $i$ such that $x = y[i]$. -$x ~\verb\not in\~ y$ yields the inverse truth value. The exception -\verb\TypeError\ is raised when $y$ is not a sequence, or when $y$ is +$x ~\verb@not in@~ y$ yields the inverse truth value. The exception +\verb@TypeError@ is raised when $y$ is not a sequence, or when $y$ is a string and $x$ is not a string of length one.% \footnote{The latter restriction is sometimes a nuisance.} \opindex{in} @@ -591,9 +593,9 @@ a string and $x$ is not a string of length one.% \indexii{membership}{test} \obindex{sequence} -The operators \verb\is\ and \verb\is not\ test for object identity: -$x ~\verb\is\~ y$ is true if and only if $x$ and $y$ are the same -object. $x ~\verb\is not\~ y$ yields the inverse truth value. +The operators \verb@is@ and \verb@is not@ test for object identity: +$x ~\verb@is@~ y$ is true if and only if $x$ and $y$ are the same +object. $x ~\verb@is not@~ y$ yields the inverse truth value. 
\opindex{is} \opindex{is not} \indexii{identity}{test} @@ -613,38 +615,39 @@ lambda_form: "lambda" [parameter_list]: condition In the context of Boolean operations, and also when conditions are used by control flow statements, the following values are interpreted -as false: \verb\None\, numeric zero of all types, empty sequences +as false: \verb@None@, numeric zero of all types, empty sequences (strings, tuples and lists), and empty mappings (dictionaries). All other values are interpreted as true. -The operator \verb\not\ yields 1 if its argument is false, 0 otherwise. +The operator \verb@not@ yields 1 if its argument is false, 0 otherwise. \opindex{not} -The condition $x ~\verb\and\~ y$ first evaluates $x$; if $x$ is false, +The condition $x ~\verb@and@~ y$ first evaluates $x$; if $x$ is false, its value is returned; otherwise, $y$ is evaluated and the resulting value is returned. \opindex{and} -The condition $x ~\verb\or\~ y$ first evaluates $x$; if $x$ is true, +The condition $x ~\verb@or@~ y$ first evaluates $x$; if $x$ is true, its value is returned; otherwise, $y$ is evaluated and the resulting value is returned. \opindex{or} -(Note that \verb\and\ and \verb\or\ do not restrict the value and type +(Note that \verb@and@ and \verb@or@ do not restrict the value and type they return to 0 and 1, but rather return the last evaluated argument. -This is sometimes useful, e.g. if \verb\s\ is a string that should be +This is sometimes useful, e.g. if \verb@s@ is a string that should be replaced by a default value if it is empty, the expression -\verb\s or 'foo'\ yields the desired value. Because \verb\not\ has to +\verb@s or 'foo'@ yields the desired value. Because \verb@not@ has to invent a value anyway, it does not bother to return a value of the -same type as its argument, so e.g. \verb\not 'foo'\ yields \verb\0\, -not \verb\''\.) +same type as its argument, so e.g. \verb@not 'foo'@ yields \verb@0@, +not \verb@''@.) 
Lambda forms (lambda expressions) have the same syntactic position as conditions. They are a shorthand to create anonymous functions; the -expression \verb\lambda\ {\em arguments}\verb\:\ {\em condition} +expression {\em {\tt lambda} arguments{\tt :} condition} yields a function object that behaves virtually identical to one -defined with \verb\def\ {\em name}\verb\(\{\em arguments}\verb\) : -return\ {\em condition}. See section \ref{function} for the syntax of +defined with +{\em {\tt def} name {\tt (}arguments{\tt ): return} condition}. +See section \ref{function} for the syntax of parameter lists. Note that functions created with lambda forms cannot contain statements. \label{lambda} @@ -686,4 +689,4 @@ tuple, but rather yields the value of that expression (condition). \indexii{trailing}{comma} (To create an empty tuple, use an empty pair of parentheses: -\verb\()\.) +\verb@()@.) diff --git a/Doc/texipre.dat b/Doc/texipre.dat index c531077..f8cc166 100644 --- a/Doc/texipre.dat +++ b/Doc/texipre.dat @@ -14,7 +14,7 @@ the language, see the Python Tutorial. The Python Reference Manual gives a more formal definition of the language. (These manuals are not yet available in INFO or Texinfo format.) -Copyright (C) 1991, 1992, 1993 by Stichting Mathematisch Centrum, +Copyright (C) 1991, 1992, 1993, 1994 by Stichting Mathematisch Centrum, Amsterdam, The Netherlands. All Rights Reserved @@ -43,7 +43,7 @@ OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. @c The following two commands start the copyright page. @page @vskip 0pt plus 1filll -Copyright @copyright{} 1991, 1992, 1993 by Stichting Mathematisch Centrum, +Copyright @copyright{} 1991, 1992, 1993, 1994 by Stichting Mathematisch Centrum, Amsterdam, The Netherlands. @center All Rights Reserved @@ -77,7 +77,7 @@ the language, see the @cite{Python Tutorial}. The @cite{Python Reference Manual} gives a more formal definition of the language. (These manuals are not yet available in INFO or Texinfo format.) 
-This version corresponds roughly to Python version 1.0 (yet to be released). +This version corresponds to Python version 1.0.2. @end ifinfo diff --git a/Doc/tools/fix.el b/Doc/tools/fix.el index 25086e4..f36d6f0 100644 --- a/Doc/tools/fix.el +++ b/Doc/tools/fix.el @@ -1,6 +1,5 @@ ; load the new texinfo package (2.xx) if not installed by default -; (setq load-path -; (cons "/ufs/jh/lib/emacs/texinfo-2.14" load-path)) -(find-file "lib.texi") +; (setq load-path (cons "/ufs/guido/lib/emacs/texinfo-2.14" load-path)) +(find-file "@lib.texi") (texinfo-all-menus-update t) (texinfo-all-menus-update t) diff --git a/Doc/tools/fix_hack b/Doc/tools/fix_hack index 8c97729..8dad111 100755 --- a/Doc/tools/fix_hack +++ b/Doc/tools/fix_hack @@ -1 +1,2 @@ +#!/bin/sh sed -e 's/{\\ptt[ ]*\\char[ ]*'"'"'137}/_/g' <"$1" > "@$1" && mv "@$1" $1 diff --git a/Doc/tut.tex b/Doc/tut.tex index 5327488..5353d56 100644 --- a/Doc/tut.tex +++ b/Doc/tut.tex @@ -1,17 +1,15 @@ \documentstyle[twoside,11pt,myformat]{report} -\title{\bf - Python Tutorial -} - +\title{Python Tutorial} + \author{ - Guido van Rossum \\ - Dept. CST, CWI, P.O. Box 94079 \\ - 1090 GB Amsterdam, The Netherlands \\ - E-mail: {\tt guido@cwi.nl} + Guido van Rossum \\ + Dept. CST, CWI, P.O. Box 94079 \\ + 1090 GB Amsterdam, The Netherlands \\ + E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! \begin{document} @@ -65,7 +63,7 @@ a more formal definition of the language. 
If you ever wrote a large shell script, you probably know this feeling: you'd love to add yet another feature, but it's already so slow, and so big, and so complicated; or the feature involves a system -call or other funcion that is only accessible from C \ldots Usually +call or other function that is only accessible from C \ldots Usually the problem at hand isn't serious enough to warrant rewriting the script in C; perhaps because the problem requires variable-length strings or other data types (like sorted lists of file names) that are @@ -137,9 +135,10 @@ trying out the examples shown later. The rest of the tutorial introduces various features of the Python language and system though examples, beginning with simple expressions, statements and data types, through functions and modules, -and finally touching upon advanced concepts like exceptions. +and finally touching upon advanced concepts like exceptions +and user-defined classes. -When you're through with the turtorial (or just getting bored), you +When you're through with the tutorial (or just getting bored), you should read the Library Reference, which gives complete (though terse) reference material about built-in and standard types, functions and modules that can save you a lot of time when writing Python programs. @@ -219,8 +218,8 @@ and a copyright notice before printing the first prompt, e.g.: \bcode\begin{verbatim} python -Python 0.9.9 (Apr 2 1993). -Copyright 1990, 1991, 1992, 1993 Stichting Mathematisch Centrum, Amsterdam +Python 1.0.3 (Jul 14 1994) +Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam >>> \end{verbatim}\ecode @@ -244,7 +243,7 @@ Typing the interrupt character (usually Control-C or DEL) to the primary or secondary prompt cancels the input and returns to the primary prompt.% \footnote{ - A problem with the GNU Readline package may prevent this. + A problem with the GNU Readline package may prevent this. 
} Typing an interrupt while a command is executing raises the {\tt KeyboardInterrupt} exception, which may be handled by a {\tt try} @@ -301,7 +300,7 @@ When you use Python interactively, it is frequently handy to have some standard commands executed every time the interpreter is started. You can do this by setting an environment variable named {\tt PYTHONSTARTUP} to the name of a file containing your start-up -commands. This is similar to the {\tt /profile} feature of the UNIX +commands. This is similar to the {\tt .profile} feature of the UNIX shells. This file is only read in interactive sessions, not when Python reads @@ -423,9 +422,9 @@ the example, you must type everything after the prompt, when the prompt appears; lines that do not begin with a prompt are output from the interpreter.% \footnote{ - I'd prefer to use different fonts to distinguish input - from output, but the amount of LaTeX hacking that would require - is currently beyond my ability. + I'd prefer to use different fonts to distinguish input + from output, but the amount of LaTeX hacking that would require + is currently beyond my ability. } Note that a secondary prompt on a line by itself in an example means you must type a blank line; this is used to end a multi-line command. @@ -444,15 +443,20 @@ work just like in most other languages (e.g., Pascal or C); parentheses can be used for grouping. For example: \bcode\begin{verbatim} ->>> # This is a comment >>> 2+2 4 ->>> +>>> # This is a comment +... 2+2 +4 +>>> 2+2 # and a comment on the same line as code +4 >>> (50-5*6)/4 5 ->>> # Division truncates towards zero: ->>> 7/3 +>>> # Integer division returns the floor: +... 7/3 2 +>>> 7/-3 +-3 >>> \end{verbatim}\ecode % @@ -470,9 +474,14 @@ variable. 
The value of an assignment is not written: A value can be assigned to several variables simultaneously: \bcode\begin{verbatim} ->>> # Zero x, y and z ->>> x = y = z = 0 ->>> +>>> x = y = z = 0 # Zero x, y and z +>>> x +0 +>>> y +0 +>>> z +0 +>>> \end{verbatim}\ecode % There is full support for floating point; operators with mixed type @@ -489,21 +498,30 @@ operands convert the integer operand to floating point: \subsection{Strings} Besides numbers, Python can also manipulate strings, enclosed in -single quotes: +single quotes or double quotes: \bcode\begin{verbatim} >>> 'foo bar' 'foo bar' >>> 'doesn\'t' -'doesn\'t' +"doesn't" +>>> "doesn't" +"doesn't" +>>> '"Yes," he said.' +'"Yes," he said.' +>>> "\"Yes,\" he said." +'"Yes," he said.' +>>> '"Isn\'t," she said.' +'"Isn\'t," she said.' >>> \end{verbatim}\ecode % Strings are written the same way as they are typed for input: inside -quotes and with quotes and other funny characters escaped by -backslashes, to show the precise value. (The {\tt print} statement, -described later, can be used to write strings without quotes or -escapes.) +quotes and with quotes and other funny characters escaped by backslashes, +to show the precise value. The string is enclosed in double quotes if +the string contains a single quote and no double quotes, else it's +enclosed in single quotes. (The {\tt print} statement, described later, +can be used to write strings without quotes or escapes.) Strings can be concatenated (glued together) with the {\tt +} operator, and repeated with {\tt *}: @@ -602,7 +620,9 @@ for single-element (non-slice) indices: >>> word[-100:] 'HelpA' >>> word[-10] # error -Unhandled exception: IndexError: string index out of range +Traceback (innermost last): + File "<stdin>", line 1 +IndexError: string index out of range >>> \end{verbatim}\ecode % @@ -687,15 +707,15 @@ of the list: \bcode\begin{verbatim} >>> # Replace some items: ->>> a[0:2] = [1, 12] +...
a[0:2] = [1, 12] >>> a [1, 12, 123, 1234] >>> # Remove some: ->>> a[0:2] = [] +... a[0:2] = [] >>> a [123, 1234] >>> # Insert some: ->>> a[1:1] = ['bletch', 'xyzzy'] +... a[1:1] = ['bletch', 'xyzzy'] >>> a [123, 'bletch', 'xyzzy', 1234] >>> a[:0] = a # Insert (a copy of) itself at the beginning @@ -743,8 +763,8 @@ subsequence of the {\em Fibonacci} series as follows: \bcode\begin{verbatim} >>> # Fibonacci series: ->>> # the sum of two elements defines the next ->>> a, b = 0, 1 +... # the sum of two elements defines the next +... a, b = 0, 1 >>> while b < 10: ... print b ... a, b = b, a+b @@ -864,7 +884,7 @@ example (no pun intended): \bcode\begin{verbatim} >>> # Measure some strings: ->>> a = ['cat', 'window', 'defenestrate'] +... a = ['cat', 'window', 'defenestrate'] >>> for x in a: ... print x, len(x) ... @@ -992,7 +1012,7 @@ arbitrary boundary: ... a, b = b, a+b ... >>> # Now call the function we just defined: ->>> fib(2000) +... fib(2000) 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 >>> \end{verbatim}\ecode @@ -1006,20 +1026,21 @@ The {\em execution} of a function introduces a new symbol table used for the local variables of the function. More precisely, all variable assignments in a function store the value in the local symbol table; whereas - variable references first look in the local symbol table, then +variable references first look in the local symbol table, then in the global symbol table, and then in the table of built-in names. Thus, global variables cannot be directly assigned to from within a -function, although they may be referenced. +function (unless named in a {\tt global} statement), although +they may be referenced. 
The actual parameters (arguments) to a function call are introduced in the local symbol table of the called function when it is called; thus, arguments are passed using {\em call\ by\ value}.% \footnote{ - Actually, {\em call by object reference} would be a better - description, since if a mutable object is passed, the caller - will see any changes the callee makes to it (e.g., items - inserted into a list). + Actually, {\em call by object reference} would be a better + description, since if a mutable object is passed, the caller + will see any changes the callee makes to it (e.g., items + inserted into a list). } When a function calls another function, a new local symbol table is created for that call. @@ -1081,7 +1102,7 @@ This example, as usual, demonstrates some new Python features: \item The {\tt return} statement returns with a value from a function. {\tt return} without an expression argument is used to return from the middle -of a procedure (falling off the end also returns from a proceduce), in +of a procedure (falling off the end also returns from a procedure), in which case the {\tt None} value is returned. \item @@ -1092,8 +1113,8 @@ object (this may be an expression), and {\tt methodname} is the name of a method that is defined by the object's type. Different types define different methods. Methods of different types may have the same name without causing ambiguity. (It is possible to define your -own object types and methods, using {\em classes}. This is an -advanced feature that is not discussed in this tutorial.) +own object types and methods, using {\em classes}, as discussed later +in this tutorial.) The method {\tt append} shown in the example, is defined for list objects; it adds a new element at the end of the list. In this example @@ -1137,12 +1158,17 @@ Sort the items of the list, in place. \item[{\tt reverse()}] Reverse the elements of the list, in place. +\item[{\tt count(x)}] +Return the number of times {\tt x} appears in the list. 
+ \end{description} An example that uses all list methods: \bcode\begin{verbatim} >>> a = [66.6, 333, 333, 1, 1234.5] +>>> print a.count(333), a.count(66.6), a.count('x') +2 1 0 >>> a.insert(2, -1) >>> a.append(333) >>> a @@ -1194,7 +1220,7 @@ later. \section{Tuples and Sequences} We saw that lists and strings have many common properties, e.g., -indexinging and slicing operations. They are two examples of {\em +indexing and slicing operations. They are two examples of {\em sequence} data types. Since Python is an evolving language, other sequence data types may be added. There is also another standard sequence data type: the {\em tuple}. @@ -1209,7 +1235,7 @@ instance: >>> t (12345, 54321, 'hello!') >>> # Tuples may be nested: ->>> u = t, (1, 2, 3, 4, 5) +... u = t, (1, 2, 3, 4, 5) >>> u ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5)) >>> @@ -1227,7 +1253,7 @@ simulate much of the same effect with slicing and concatenation, though). A special problem is the construction of tuples containing 0 or 1 -items: the syntax has some extra quirks to accomodate these. Empty +items: the syntax has some extra quirks to accommodate these. Empty tuples are constructed by an empty pair of parentheses; a tuple with one item is constructed by following a value with a comma (it is not sufficient to enclose a single value in parentheses). @@ -1277,7 +1303,9 @@ Another useful data type built into Python is the {\em dictionary}. Dictionaries are sometimes found in other languages as ``associative memories'' or ``associative arrays''. Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by {\em keys}, -which are strings. It is best to think of a dictionary as an unordered set of +which are strings (the use of non-string values as keys +is supported, but beyond the scope of this tutorial). +It is best to think of a dictionary as an unordered set of {\em key:value} pairs, with the requirement that the keys are unique (within one dictionary). 
A pair of braces creates an empty dictionary: \verb/{}/. @@ -1291,7 +1319,7 @@ a key:value pair with {\tt del}. If you store using a key that is already in use, the old value associated with that key is forgotten. It is an error to extract a -value using a non-existant key. +value using a non-existent key. The {\tt keys()} method of a dictionary object returns a list of all the keys used in the dictionary, in random order (if you want it sorted, @@ -1351,14 +1379,17 @@ shortcut operator, when used as a general value and not as a Boolean, is the last evaluated argument. It is possible to assign the result of a comparison or other Boolean -expression to a variable, but you must enclose the entire Boolean -expression in parentheses. This is necessary because otherwise an -assignment like \verb/a = b = c/ would be ambiguous: does it assign the -value of {\tt c} to {\tt a} and {\tt b}, or does it compare {\tt b} to -{\tt c} and assign the outcome (0 or 1) to {\tt a}? As it is, the first -meaning is what you get, and to get the latter you have to write -\verb/a = (b = c)/. (In Python, unlike C, assignment cannot occur -inside expressions.) +expression to a variable. For example, + +\bcode\begin{verbatim} +>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance' +>>> non_null = string1 or string2 or string3 +>>> non_null +'Trondheim' +>>> +\end{verbatim}\ecode +% +Note that in Python, unlike C, assignment cannot occur inside expressions. \section{Comparing Sequences and Other Types} @@ -1368,7 +1399,7 @@ first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, -the lexiographical comparison is carried out recursively. If all +the lexicographical comparison is carried out recursively. 
If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial subsequence of the other, the shorted sequence is the smaller one. Lexicographical ordering for @@ -1391,9 +1422,9 @@ Thus, a list is always smaller than a string, a string is always smaller than a tuple, etc. Mixed numeric types are compared according to their numeric value, so 0 equals 0.0, etc.% \footnote{ - The rules for comparing objects of different types should - not be relied upon; they may change in a future version of - the language. + The rules for comparing objects of different types should + not be relied upon; they may change in a future version of + the language. } @@ -1418,9 +1449,11 @@ executed at the top level and in calculator mode). A module is a file containing Python definitions and statements. The -file name is the module name with the suffix {\tt .py} appended. For -instance, use your favorite text editor to create a file called {\tt -fibo.py} in the current directory with the following contents: +file name is the module name with the suffix {\tt .py} appended. Within +a module, the module's name (as a string) is available as the value of +the global variable {\tt __name__}. For instance, use your favorite text +editor to create a file called {\tt fibo.py} in the current directory +with the following contents: \bcode\begin{verbatim} # Fibonacci numbers module @@ -1460,6 +1493,8 @@ Using the module name you can access the functions: 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 >>> fibo.fib2(100) [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89] +>>> fibo.__name__ +'fibo' >>> \end{verbatim}\ecode % @@ -1481,9 +1516,9 @@ They are executed only the {\em first} time the module is imported somewhere.% \footnote{ - In fact function definitions are also `statements' that are - `executed'; the execution enters the function name in the - module's global symbol table. 
+ In fact function definitions are also `statements' that are + `executed'; the execution enters the function name in the + module's global symbol table. } Each module has its own private symbol table, which is used as the @@ -1586,9 +1621,11 @@ defines. It returns a sorted list of strings: \bcode\begin{verbatim} >>> import fibo, sys >>> dir(fibo) -['fib', 'fib2'] +['__name__', 'fib', 'fib2'] >>> dir(sys) -['argv', 'exit', 'modules', 'path', 'ps1', 'ps2', 'stderr', 'stdin', 'stdout'] +['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit', +'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace', +'stderr', 'stdin', 'stdout', 'version'] >>> \end{verbatim}\ecode % @@ -1599,7 +1636,7 @@ Without arguments, {\tt dir()} lists the names you have defined currently: >>> import fibo, sys >>> fib = fibo.fib >>> dir() -['a', 'fib', 'fibo', 'sys'] +['__name__', 'a', 'fib', 'fibo', 'sys'] >>> \end{verbatim}\ecode % @@ -1612,14 +1649,15 @@ If you want a list of those, they are defined in the standard module \bcode\begin{verbatim} >>> import __builtin__ >>> dir(__builtin__) -['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError', 'I -mportError', 'IndexError', 'KeyError', 'KeyboardInterrupt', 'MemoryError', ' -NameError', 'None', 'OverflowError', 'RuntimeError', 'SyntaxError', 'SystemE -rror', 'SystemExit', 'TypeError', 'ValueError', 'ZeroDivisionError', 'abs', -'apply', 'chr', 'cmp', 'coerce', 'compile', 'dir', 'divmod', 'eval', 'execfi -le', 'float', 'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'le -n', 'long', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input', - 'reload', 'repr', 'round', 'setattr', 'str', 'type'] +['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError', +'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt', +'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError', +'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError', +'ZeroDivisionError', 
'__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce', +'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float', +'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long', +'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input', +'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange'] >>> \end{verbatim}\ecode @@ -1638,6 +1676,10 @@ Python is to do all the string handling yourself; using string slicing and concatenation operations you can create any lay-out you can imagine. The standard module {\tt string} contains some useful operations for padding strings to a given column width; these will be discussed shortly. +Finally, the \code{\%} operator (modulo) with a string left argument +interprets this string as a C sprintf format string to be applied to the +right argument, and returns the string resulting from this formatting +operation. One question remains, of course: how do you convert values to strings? Luckily, Python has a way to convert any value to a string: just write @@ -1650,22 +1692,22 @@ the value between reverse quotes (\verb/``/). Some examples: >>> print s The value of x is 31.4, and y is 40000... >>> # Reverse quotes work on other types besides numbers: ->>> p = [x, y] +... p = [x, y] >>> ps = `p` >>> ps '[31.4, 40000]' >>> # Converting a string adds string quotes and backslashes: ->>> hello = 'hello, world\n' +... hello = 'hello, world\n' >>> hellos = `hello` >>> print hellos 'hello, world\012' >>> # The argument of reverse quotes may be a tuple: ->>> `x, y, ('foo', 'bar')` -'(31.4, 40000, (\'foo\', \'bar\'))' +... `x, y, ('foo', 'bar')` +"(31.4, 40000, ('foo', 'bar'))" >>> \end{verbatim}\ecode % -Here is how you write a table of squares and cubes: +Here are two ways to write a table of squares and cubes: \bcode\begin{verbatim} >>> import string @@ -1684,6 +1726,19 @@ Here is how you write a table of squares and cubes: 8 64 512 9 81 729 10 100 1000 +>>> for x in range(1,11): +... 
print '%2d %3d %4d' % (x, x*x, x*x*x) +... + 1 1 1 + 2 4 8 + 3 9 27 + 4 16 64 + 5 25 125 + 6 36 216 + 7 49 343 + 8 64 512 + 9 81 729 +10 100 1000 >>> \end{verbatim}\ecode % @@ -1702,11 +1757,7 @@ a slice operation, as in {\tt string.ljust(x,~n)[0:n]}.) There is another function, {\tt string.zfill}, which pads a numeric string on the left with zeros. It understands about plus and minus -signs:% -\footnote{ - Better facilities for formatting floating point numbers are - lacking at this moment. -} +signs: \bcode\begin{verbatim} >>> string.zfill('12', 5) @@ -1733,10 +1784,10 @@ kind of complaint you get while you are still learning Python: \bcode\begin{verbatim} >>> while 1 print 'Hello world' -Parsing error: file <stdin>, line 1: -while 1 print 'Hello world' - ^ -Unhandled exception: run-time error: syntax error + File "<stdin>", line 1 + while 1 print 'Hello world' + ^ +SyntaxError: invalid syntax >>> \end{verbatim}\ecode % @@ -1764,11 +1815,11 @@ Traceback (innermost last): File "<stdin>", line 1 ZeroDivisionError: integer division or modulo >>> 4 + foo*3 -Stack backtrace (innermost last): +Traceback (innermost last): File "<stdin>", line 1 NameError: foo >>> '2' + 2 -Stack backtrace (innermost last): +Traceback (innermost last): File "<stdin>", line 1 TypeError: illegal argument type for built-in operation >>> @@ -1910,7 +1961,7 @@ For example: \bcode\begin{verbatim} >>> raise NameError, 'HiThere' -Stack backtrace (innermost last): +Traceback (innermost last): File "<stdin>", line 1 NameError: HiThere >>> @@ -1932,10 +1983,10 @@ For example: ... except my_exc, val: ... print 'My exception occurred, value:', val ... -My exception occured, value: 4 +My exception occurred, value: 4 >>> raise my_exc, 1 -Stack backtrace (innermost last): - File "<stdin>", line 7 +Traceback (innermost last): + File "<stdin>", line 1 my_exc: 1 >>> \end{verbatim}\ecode @@ -1956,7 +2007,7 @@ For example: ... print 'Goodbye, world!' ... Goodbye, world!
-Stack backtrace (innermost last): +Traceback (innermost last): File "<stdin>", line 2 KeyboardInterrupt >>> @@ -1964,7 +2015,7 @@ KeyboardInterrupt % A {\tt finally} clause is executed whether or not an exception has occurred in the {\tt try} clause. When an exception has occurred, it -is re-raised after the {\tt finally} clauses is executed. The +is re-raised after the {\tt finally} clause is executed. The {\tt finally} clause is also executed ``on the way out'' when the {\tt try} statement is left via a {\tt break} or {\tt return} statement. @@ -1988,7 +2039,7 @@ same name. Objects can contain an arbitrary amount of private data. In C++ terminology, all class members (including the data members) are {\em public}, and all member functions are {\em virtual}. There are -no special constructors or desctructors. As in Modula-3, there are no +no special constructors or destructors. As in Modula-3, there are no shorthands for referencing the object's members from its methods: the method function is declared with an explicit first argument representing the object, which is provided implicitly by the call. As @@ -1996,9 +2047,9 @@ in Smalltalk, classes themselves are objects, albeit in the wider sense of the word: in Python, all data types are objects. This provides semantics for importing and renaming. But, just like in C++ or Modula-3, built-in types cannot be used as base classes for -extension by the user. Also, like in Modula-3 but unlike in C++, the +extension by the user. Also, like in C++ but unlike in Modula-3, most built-in operators with special syntax (arithmetic operators, -subscriptong etc.) cannot be redefined for class members. +subscripting etc.) can be redefined for class members. \section{A word about terminology} @@ -2022,7 +2073,7 @@ can be bound to the same object. This is known as aliasing in other languages.
This is usually not appreciated on a first glance at Python, and can be safely ignored when dealing with immutable basic types (numbers, strings, tuples). However, aliasing has an -(intended!) effect on the semantics of Python code involving mutable +(intended!) effect on the semantics of Python code involving mutable objects such as lists, dictionaries, and most types representing entities outside the program (files, windows, etc.). This is usually used to the benefit of the program, since aliases behave like pointers @@ -2065,13 +2116,13 @@ names in modules are attribute references: in the expression be a straightforward mapping between the module's attributes and the global names defined in the module: they share the same name space!% \footnote{ - Except for one thing. Module objects have a secret read-only - attribute called {\tt __dict__} which returns the dictionary - used to implement the module's name space; the name - {\tt __dict__} is an attribute but not a global name. - Obviously, using this violates the abstraction of name space - implementation, and should be restricted to things like - post-mortem debuggers... + Except for one thing. Module objects have a secret read-only + attribute called {\tt __dict__} which returns the dictionary + used to implement the module's name space; the name + {\tt __dict__} is an attribute but not a global name. + Obviously, using this violates the abstraction of name space + implementation, and should be restricted to things like + post-mortem debuggers... } Attributes may be read-only or writable. In the latter case, @@ -2314,7 +2365,7 @@ avoid accidental name conflicts, which may cause hard-to-find bugs in large programs, it is wise to use some kind of convention that minimizes the chance of conflicts, e.g., capitalize method names, prefix data attribute names with a small unique string (perhaps just -an undescore), or use verbs for methods and nouns for data attributes. 
+an underscore), or use verbs for methods and nouns for data attributes. Data attributes may be referenced by methods as well as by ordinary @@ -2392,8 +2443,9 @@ Methods may call other methods by using method attributes of the The instantiation operation (``calling'' a class object) creates an empty object. Many classes like to create objects in a known initial -state. There is no special syntax to enforce this, but a convention -works almost as well: add a method named \verb\init\ to the class, +state. In early versions of Python, there was no special syntax to +enforce this (see below), but a convention was widely used: +add a method named \verb\init\ to the class, which initializes the instance (by assigning to some important data attributes) and returns the instance itself. For example, class \verb\Bag\ above could have the following method: @@ -2411,13 +2463,39 @@ statement, as follows: x = Bag().init() \end{verbatim} -Of course, the \verb\init\ method may have arguments for greater -flexibility. +In later versions of Python, a special method named \verb\__init__\ may be +defined instead: + +\begin{verbatim} + def __init__(self): + self.empty() +\end{verbatim} + +When a class defines an \verb\__init__\ method, class instantiation +automatically invokes \verb\__init__\ for the newly-created class +instance. So in the \verb\Bag\ example, a new and initialized instance +can be obtained by: -Warning: a common mistake is to forget the \verb\return self\ at the -end of an init method! +\begin{verbatim} + x = Bag() +\end{verbatim} +Of course, the \verb\__init__\ method may have arguments for greater +flexibility. In that case, arguments given to the class instantiation +operator are passed on to \verb\__init__\. For example, +\bcode\begin{verbatim} +>>> class Complex: +... def __init__(self, realpart, imagpart): +... self.r = realpart +... self.i = imagpart +... 
+>>> x = Complex(3.0,-4.5) +>>> x.r, x.i +(3.0, -4.5) +>>> +\end{verbatim}\ecode +% Methods may reference global names in the same way as ordinary functions. The global scope associated with a method is the module containing the class definition. (The class itself is never used as a @@ -2484,7 +2562,7 @@ the base class is defined or imported directly in the global scope.) \subsection{Multiple inheritance} -Poython supports a limited form of multiple inheritance as well. A +Python supports a limited form of multiple inheritance as well. A class definition with multiple base classes looks as follows: \begin{verbatim} @@ -2559,7 +2637,422 @@ object of which the method is an instance, and \verb\m.im_func\ is the function object corresponding to the method. -XXX Mention bw compat hacks. +\chapter{Recent Additions} + +Python is an evolving language. Since this tutorial was last +thoroughly revised, several new features have been added to the +language. While ideally I should revise the tutorial to incorporate +them in the mainline of the text, lack of time currently requires me +to a more modest approach. In this chapter I will briefly list the +most important improvements to the language and how you can use them +to your benefit. + +\section{The Last Printed Expression} + +In interactive mode, the last printed expression is assigned to the +variable \code\_\. This means that when you are using Python as a +desk calculator, it is somewhat easier to continue calculations, for +example: + +\begin{verbatim} + >>> tax = 17.5 / 100 + >>> price = 3.50 + >>> price * tax + 0.6125 + >>> price + _ + 4.1125 + >>> round(_, 2) + 4.11 + >>> +\end{verbatim} + +\section{String Literals} + +\subsection{Double Quotes} + +Python can now also use double quotes to surround string literals, +e.g. \verb\"this doesn't hurt a bit"\. + +\subsection{Continuation Of String Literals} + +String literals can span multiple lines by escaping newlines with +backslashes, e.g. 
+ +\begin{verbatim} + hello = "This is a rather long string containing\n\ + several lines of text just as you would do in C.\n\ + Note that whitespace at the beginning of the line is\ + significant.\n" + print hello +\end{verbatim} + +which would print the following: +\begin{verbatim} + This is a rather long string containing + several lines of text just as you would do in C. + Note that whitespace at the beginning of the line is significant. +\end{verbatim} + +\subsection{Triple-quoted strings} + +In some cases, when you need to include really long strings (e.g. +containing several paragraphs of informational text), it is annoying +that you have to terminate each line with \verb@\n\@, especially if +you would like to reformat the text occasionally with a powerful text +editor like Emacs. For such situations, ``triple-quoted'' strings can +be used, e.g. + +\begin{verbatim} + hello = """ + + This string is bounded by triple double quotes (3 times "). + Newlines in the string are retained, though \ + it is still possible\nto use all normal escape sequences. + + Whitespace at the beginning of a line is + significant. If you need to include three opening quotes + you have to escape at least one of them, e.g. \""". + + This string ends in a newline. + """ +\end{verbatim} + +Note that there is no semantic difference between strings quoted with +single quotes (\verb/'/) or double quotes (\verb\"\). + +\subsection{String Literal Juxtaposition} + +One final twist: you can juxtapose multiple string literals. Two or +more adjacent string literals (but not arbitrary expressions!) +separated only by whitespace will be concatenated (without intervening +whitespace) into a single string object at compile time. 
This makes +it possible to continue a long string on the next line without +sacrificing indentation or performance, unlike the use of the string +concatenation operator \verb\+\ or the continuation of the literal +itself on the next line (since leading whitespace is significant +inside all types of string literals). Note that this feature, like +all string features except triple-quoted strings, is borrowed from +Standard C. + +\section{The Formatting Operator} + +\subsection{Basic Usage} + +The chapter on output formatting is really out of date: there is now +an almost complete interface to C-style printf formats. This is done +by overloading the modulo operator (\verb\%\) for a left operand +which is a string, e.g. + +\begin{verbatim} + >>> import math + >>> print 'The value of PI is approximately %5.3f.' % math.pi + The value of PI is approximately 3.142. + >>> +\end{verbatim} + +If there is more than one format in the string you pass a tuple as +right operand, e.g. + +\begin{verbatim} + >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678} + >>> for name, phone in table.items(): + ... print '%-10s ==> %10d' % (name, phone) + ... + Jack ==> 4098 + Dcab ==> 8637678 + Sjoerd ==> 4127 + >>> +\end{verbatim} + +Most formats work exactly as in C and require that you pass the proper +type (however, if you don't you get an exception, not a core dump). +The \verb\%s\ format is more relaxed: if the corresponding argument is +not a string object, it is converted to string using the \verb\str()\ +built-in function. Using \verb\*\ to pass the width or precision in +as a separate (integer) argument is supported. The C formats +\verb\%n\ and \verb\%p\ are not supported. + +\subsection{Referencing Variables By Name} + +If you have a really long format string that you don't want to split +up, it would be nice if you could reference the variables to be +formatted by name instead of by position. 
This can be done by using +an extension of C formats using the form \verb\%(name)format\, e.g. + +\begin{verbatim} + >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678} + >>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table + Jack: 4098; Sjoerd: 4127; Dcab: 8637678 + >>> +\end{verbatim} + +This is particularly useful in combination with the new built-in +\verb\vars()\ function, which returns a dictionary containing all +local variables. + +\section{Optional Function Arguments} + +It is now possible to define functions with a variable number of +arguments. There are two forms, which can be combined. + +\subsection{Default Argument Values} + +The most useful form is to specify a default value for one or more +arguments. This creates a function that can be called with fewer +arguments than it is defined, e.g. + +\begin{verbatim} + def ask_ok(prompt, retries = 4, complaint = 'Yes or no, please!'): + while 1: + ok = raw_input(prompt) + if ok in ('y', 'ye', 'yes'): return 1 + if ok in ('n', 'no', 'nop', 'nope'): return 0 + retries = retries - 1 + if retries < 0: raise IOError, 'refusenik user' + print complaint +\end{verbatim} + +This function can be called either like this: +\verb\ask_ok('Do you really want to quit?')\ or like this: +\verb\ask_ok('OK to overwrite the file?', 2)\. + +The default values are evaluated at the point of function definition +in the {\em defining} scope, so that e.g. + +\begin{verbatim} + i = 5 + def f(arg = i): print arg + i = 6 + f() +\end{verbatim} + +will print \verb\5\. + +\subsection{Arbitrary Argument Lists} + +It is also possible to specify that a function can be called with an +arbitrary number of arguments. These arguments will be wrapped up in +a tuple. Before the variable number of arguments, zero or more normal +arguments may occur, e.g. + +\begin{verbatim} + def fprintf(file, format, *args): + file.write(format % args) +\end{verbatim} + +This feature may be combined with the previous, e.g. 
+ +\begin{verbatim} + def but_is_it_useful(required, optional = None, *remains): + print "I don't know" +\end{verbatim} + +\section{Lambda And Functional Programming Tools} + +\subsection{Lambda Forms} + +On popular demand, a few features commonly found in functional +programming languages and Lisp have been added to Python. With the +\verb\lambda\ keyword, small anonymous functions can be created. +Here's a function that returns the sum of its two arguments: +\verb\lambda a, b: a+b\. Lambda forms can be used wherever function +objects are required. They are syntactically restricted to a single +expression. Semantically, they are just syntactic sugar for a normal +function definition. Like nested function definitions, lambda forms +cannot reference variables from the containing scope, but this can be +overcome through the judicious use of default argument values, e.g. + +\begin{verbatim} + def make_incrementor(n): + return lambda(x, incr=n): x+incr +\end{verbatim} + +\subsection{Map, Reduce and Filter} + +Three new built-in functions on sequences are good candidate to pass +lambda forms. + +\subsubsection{Map.} + +\verb\map(function, sequence)\ calls \verb\function(item)\ for each of +the sequence's items and returns a list of the return values. For +example, to compute some cubes: + +\begin{verbatim} + >>> map(lambda x: x*x*x, range(1, 11)) + [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000] + >>> +\end{verbatim} + +More than one sequence may be passed; the function must then have as +many arguments as there are sequences and is called with the +corresponding item from each sequence (or \verb\None\ if some sequence +is shorter than another). If \verb\None\ is passed for the function, +a function returning its argument(s) is substituted. + +Combining these two special cases, we see that +\verb\map(None, list1, list2)\ is a convenient way of turning a pair +of lists into a list of pairs. 
For example: + +\begin{verbatim} + >>> seq = range(8) + >>> map(None, seq, map(lambda x: x*x, seq)) + [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)] + >>> +\end{verbatim} + +\subsubsection{Filter.} + +\verb\filter(function, sequence)\ returns a sequence (of the same +type, if possible) consisting of those items from the sequence for +which \verb\function(item)\ is true. For example, to compute some +primes: + +\begin{verbatim} + >>> filter(lambda x: x%2 != 0 and x%3 != 0, range(2, 25)) + [5, 7, 11, 13, 17, 19, 23] + >>> +\end{verbatim} + +\subsubsection{Reduce.} + +\verb\reduce(function, sequence)\ returns a single value constructed +by calling the (binary) function on the first two items of the +sequence, then on the result and the next item, and so on. For +example, to compute the sum of the numbers 1 through 10: + +\begin{verbatim} + >>> reduce(lambda x, y: x+y, range(1, 11)) + 55 + >>> +\end{verbatim} + +If there's only one item in the sequence, its value is returned; if +the sequence is empty, an exception is raised. + +A third argument can be passed to indicate the starting value. In this +case the starting value is returned for an empty sequence, and the +function is first applied to the starting value and the first sequence +item, then to the result and the next item, and so on. For example, + +\begin{verbatim} + >>> def sum(seq): + ... return reduce(lambda x, y: x+y, seq, 0) + ... + >>> sum(range(1, 11)) + 55 + >>> sum([]) + 0 + >>> +\end{verbatim} + +\section{Continuation Lines Without Backslashes} +While the general mechanism for continuation of a source line on the +next physical line remains to place a backslash on the end of the +line, expressions inside matched parentheses (or square brackets, or +curly braces) can now also be continued without using a backslash. +This is particularly useful for calls to functions with many +arguments, and for initializations of large tables. 
+ +For example: + +\begin{verbatim} + month_names = ['Januari', 'Februari', 'Maart', + 'April', 'Mei', 'Juni', + 'Juli', 'Augustus', 'September', + 'Oktober', 'November', 'December'] +\end{verbatim} + +and + +\begin{verbatim} + CopyInternalHyperLinks(self.context.hyperlinks, + copy.context.hyperlinks, + uidremap) +\end{verbatim} + +\section{Regular Expressions} + +While C's printf-style output formats, transformed into Python, are +adequate for most output formatting jobs, C's scanf-style input +formats are not very powerful. Instead of scanf-style input, Python +offers Emacs-style regular expressions as a powerful input and +scanning mechanism. Read the corresponding section in the Library +Reference for a full description. + +\section{Generalized Dictionaries} + +The keys of dictionaries are no longer restricted to strings -- they +can be numbers, tuples, or (certain) class instances. (Lists and +dictionaries are not acceptable as dictionary keys, in order to avoid +problems when the object used as a key is modified.) + +Dictionaries have two new methods: \verb\d.values()\ returns a list of +the dictionary's values, and \verb\d.items()\ returns a list of the +dictionary's (key, value) pairs. Like \verb\d.keys()\, these +operations are slow for large dictionaries. Examples: + +\begin{verbatim} + >>> d = {100: 'honderd', 1000: 'duizend', 10: 'tien'} + >>> d.keys() + [100, 10, 1000] + >>> d.values() + ['honderd', 'tien', 'duizend'] + >>> d.items() + [(100, 'honderd'), (10, 'tien'), (1000, 'duizend')] + >>> +\end{verbatim} + +\section{Miscellaneous New Built-in Functions} + +The function \verb\vars()\ returns a dictionary containing the current +local variables. With a module as argument, it returns that module's +global variables. The old function \verb\dir(x)\ returns +\verb\vars(x).keys()\. + +The function \verb\round(x)\ returns a floating point number rounded +to the nearest integer (but still expressed as a floating point +number). E.g. 
\verb\round(3.4) == 3.0\ and \verb\round(3.5) == 4.0\. +With a second argument it rounds to the specified number of digits, +e.g. \verb\round(math.pi, 4) == 3.1416\ or even +\verb\round(123.4, -2) == 100.0\. + +The function \verb\hash(x)\ returns a hash value for an object. +All object types acceptable as dictionary keys have a hash value (and +it is this hash value that the dictionary implementation uses). + +The function \verb\id(x)\ return a unique identifier for an object. +For two objects x and y, \verb\id(x) == id(y)\ if and only if +\verb\x is y\. (In fact the object's address is used.) + +The function \verb\hasattr(x, name)\ returns whether an object has an +attribute with the given name (a string value). The function +\verb\getattr(x, name)\ returns the object's attribute with the given +name. The function \verb\setattr(x, name, value)\ assigns a value to +an object's attribute with the given name. These three functions are +useful if the attribute names are not known beforehand. Note that +\verb\getattr(x, 'foo')\ is equivalent to \verb\x.foo\, and +\verb\setattr(x, 'foo', y)\ is equivalent to \verb\x.foo = y\. By +definition, \verb\hasattr(x, name)\ returns true if and only if +\verb\getattr(x, name)\ returns without raising an exception. + +\section{Else Clause For Try Statement} + +The \verb\try...except\ statement now has an optional \verb\else\ +clause, which must follow all \verb\except\ clauses. It is useful to +place code that must be executed if the \verb\try\ clause does not +raise an exception. 
For example: + +\begin{verbatim} + for arg in sys.argv: + try: + f = open(arg, 'r') + except IOError: + print 'cannot open', arg + else: + print arg, 'has', len(f.readlines()), 'lines' + f.close() +\end{verbatim} \end{document} diff --git a/Doc/tut/tut.tex b/Doc/tut/tut.tex index 5327488..5353d56 100644 --- a/Doc/tut/tut.tex +++ b/Doc/tut/tut.tex @@ -1,17 +1,15 @@ \documentstyle[twoside,11pt,myformat]{report} -\title{\bf - Python Tutorial -} - +\title{Python Tutorial} + \author{ - Guido van Rossum \\ - Dept. CST, CWI, P.O. Box 94079 \\ - 1090 GB Amsterdam, The Netherlands \\ - E-mail: {\tt guido@cwi.nl} + Guido van Rossum \\ + Dept. CST, CWI, P.O. Box 94079 \\ + 1090 GB Amsterdam, The Netherlands \\ + E-mail: {\tt guido@cwi.nl} } -\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release! +\date{14 Jul 1994 \\ Release 1.0.3} % XXX update before release! \begin{document} @@ -65,7 +63,7 @@ a more formal definition of the language. If you ever wrote a large shell script, you probably know this feeling: you'd love to add yet another feature, but it's already so slow, and so big, and so complicated; or the feature involves a system -call or other funcion that is only accessible from C \ldots Usually +call or other function that is only accessible from C \ldots Usually the problem at hand isn't serious enough to warrant rewriting the script in C; perhaps because the problem requires variable-length strings or other data types (like sorted lists of file names) that are @@ -137,9 +135,10 @@ trying out the examples shown later. The rest of the tutorial introduces various features of the Python language and system though examples, beginning with simple expressions, statements and data types, through functions and modules, -and finally touching upon advanced concepts like exceptions. +and finally touching upon advanced concepts like exceptions +and user-defined classes. 
-When you're through with the turtorial (or just getting bored), you +When you're through with the tutorial (or just getting bored), you should read the Library Reference, which gives complete (though terse) reference material about built-in and standard types, functions and modules that can save you a lot of time when writing Python programs. @@ -219,8 +218,8 @@ and a copyright notice before printing the first prompt, e.g.: \bcode\begin{verbatim} python -Python 0.9.9 (Apr 2 1993). -Copyright 1990, 1991, 1992, 1993 Stichting Mathematisch Centrum, Amsterdam +Python 1.0.3 (Jul 14 1994) +Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam >>> \end{verbatim}\ecode @@ -244,7 +243,7 @@ Typing the interrupt character (usually Control-C or DEL) to the primary or secondary prompt cancels the input and returns to the primary prompt.% \footnote{ - A problem with the GNU Readline package may prevent this. + A problem with the GNU Readline package may prevent this. } Typing an interrupt while a command is executing raises the {\tt KeyboardInterrupt} exception, which may be handled by a {\tt try} @@ -301,7 +300,7 @@ When you use Python interactively, it is frequently handy to have some standard commands executed every time the interpreter is started. You can do this by setting an environment variable named {\tt PYTHONSTARTUP} to the name of a file containing your start-up -commands. This is similar to the {\tt /profile} feature of the UNIX +commands. This is similar to the {\tt .profile} feature of the UNIX shells. This file is only read in interactive sessions, not when Python reads @@ -423,9 +422,9 @@ the example, you must type everything after the prompt, when the prompt appears; lines that do not begin with a prompt are output from the interpreter.% \footnote{ - I'd prefer to use different fonts to distinguish input - from output, but the amount of LaTeX hacking that would require - is currently beyond my ability. 
+ I'd prefer to use different fonts to distinguish input + from output, but the amount of LaTeX hacking that would require + is currently beyond my ability. } Note that a secondary prompt on a line by itself in an example means you must type a blank line; this is used to end a multi-line command. @@ -444,15 +443,20 @@ work just like in most other languages (e.g., Pascal or C); parentheses can be used for grouping. For example: \bcode\begin{verbatim} ->>> # This is a comment >>> 2+2 4 ->>> +>>> # This is a comment +... 2+2 +4 +>>> 2+2 # and a comment on the same line as code +4 >>> (50-5*6)/4 5 ->>> # Division truncates towards zero: ->>> 7/3 +>>> # Integer division returns the floor: +... 7/3 2 +>>> 7/-3 +-3 >>> \end{verbatim}\ecode % @@ -470,9 +474,14 @@ variable. The value of an assignment is not written: A value can be assigned to several variables simultaneously: \bcode\begin{verbatim} ->>> # Zero x, y and z ->>> x = y = z = 0 ->>> +>>> x = y = z = 0 # Zero x, y and z +>>> x +0 +>>> y +0 +>>> z +0 +>>> \end{verbatim}\ecode % There is full support for floating point; operators with mixed type @@ -489,21 +498,30 @@ operands convert the integer operand to floating point: \subsection{Strings} Besides numbers, Python can also manipulate strings, enclosed in -single quotes: +single quotes or double quotes: \bcode\begin{verbatim} >>> 'foo bar' 'foo bar' >>> 'doesn\'t' -'doesn\'t' +"doesn't" +>>> "doesn't" +"doesn't" +>>> '"Yes," he said.' +'"Yes," he said.' +>>> "\"Yes,\" he said." +'"Yes," he said.' +>>> '"Isn\'t," she said.' +'"Isn\'t," she said.' >>> \end{verbatim}\ecode % Strings are written the same way as they are typed for input: inside -quotes and with quotes and other funny characters escaped by -backslashes, to show the precise value. (The {\tt print} statement, -described later, can be used to write strings without quotes or -escapes.) +quotes and with quotes and other funny characters escaped by backslashes, +to show the precise value. 
The string is enclosed in double quotes if +the string contains a single quote and no double quotes, else it's +enclosed in single quotes. (The {\tt print} statement, described later, +can be used to write strings without quotes or escapes.) Strings can be concatenated (glued together) with the {\tt +} operator, and repeated with {\tt *}: @@ -602,7 +620,9 @@ for single-element (non-slice) indices: >>> word[-100:] 'HelpA' >>> word[-10] # error -Unhandled exception: IndexError: string index out of range +Traceback (innermost last): + File "", line 1 +IndexError: string index out of range >>> \end{verbatim}\ecode % @@ -687,15 +707,15 @@ of the list: \bcode\begin{verbatim} >>> # Replace some items: ->>> a[0:2] = [1, 12] +... a[0:2] = [1, 12] >>> a [1, 12, 123, 1234] >>> # Remove some: ->>> a[0:2] = [] +... a[0:2] = [] >>> a [123, 1234] >>> # Insert some: ->>> a[1:1] = ['bletch', 'xyzzy'] +... a[1:1] = ['bletch', 'xyzzy'] >>> a [123, 'bletch', 'xyzzy', 1234] >>> a[:0] = a # Insert (a copy of) itself at the beginning @@ -743,8 +763,8 @@ subsequence of the {\em Fibonacci} series as follows: \bcode\begin{verbatim} >>> # Fibonacci series: ->>> # the sum of two elements defines the next ->>> a, b = 0, 1 +... # the sum of two elements defines the next +... a, b = 0, 1 >>> while b < 10: ... print b ... a, b = b, a+b @@ -864,7 +884,7 @@ example (no pun intended): \bcode\begin{verbatim} >>> # Measure some strings: ->>> a = ['cat', 'window', 'defenestrate'] +... a = ['cat', 'window', 'defenestrate'] >>> for x in a: ... print x, len(x) ... @@ -992,7 +1012,7 @@ arbitrary boundary: ... a, b = b, a+b ... >>> # Now call the function we just defined: ->>> fib(2000) +... fib(2000) 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 >>> \end{verbatim}\ecode @@ -1006,20 +1026,21 @@ The {\em execution} of a function introduces a new symbol table used for the local variables of the function. 
More precisely, all variable assignments in a function store the value in the local symbol table; whereas - variable references first look in the local symbol table, then +variable references first look in the local symbol table, then in the global symbol table, and then in the table of built-in names. Thus, global variables cannot be directly assigned to from within a -function, although they may be referenced. +function (unless named in a {\tt global} statement), although +they may be referenced. The actual parameters (arguments) to a function call are introduced in the local symbol table of the called function when it is called; thus, arguments are passed using {\em call\ by\ value}.% \footnote{ - Actually, {\em call by object reference} would be a better - description, since if a mutable object is passed, the caller - will see any changes the callee makes to it (e.g., items - inserted into a list). + Actually, {\em call by object reference} would be a better + description, since if a mutable object is passed, the caller + will see any changes the callee makes to it (e.g., items + inserted into a list). } When a function calls another function, a new local symbol table is created for that call. @@ -1081,7 +1102,7 @@ This example, as usual, demonstrates some new Python features: \item The {\tt return} statement returns with a value from a function. {\tt return} without an expression argument is used to return from the middle -of a procedure (falling off the end also returns from a proceduce), in +of a procedure (falling off the end also returns from a procedure), in which case the {\tt None} value is returned. \item @@ -1092,8 +1113,8 @@ object (this may be an expression), and {\tt methodname} is the name of a method that is defined by the object's type. Different types define different methods. Methods of different types may have the same name without causing ambiguity. (It is possible to define your -own object types and methods, using {\em classes}. 
This is an -advanced feature that is not discussed in this tutorial.) +own object types and methods, using {\em classes}, as discussed later +in this tutorial.) The method {\tt append} shown in the example, is defined for list objects; it adds a new element at the end of the list. In this example @@ -1137,12 +1158,17 @@ Sort the items of the list, in place. \item[{\tt reverse()}] Reverse the elements of the list, in place. +\item[{\tt count(x)}] +Return the number of times {\tt x} appears in the list. + \end{description} An example that uses all list methods: \bcode\begin{verbatim} >>> a = [66.6, 333, 333, 1, 1234.5] +>>> print a.count(333), a.count(66.6), a.count('x') +2 1 0 >>> a.insert(2, -1) >>> a.append(333) >>> a @@ -1194,7 +1220,7 @@ later. \section{Tuples and Sequences} We saw that lists and strings have many common properties, e.g., -indexinging and slicing operations. They are two examples of {\em +indexing and slicing operations. They are two examples of {\em sequence} data types. Since Python is an evolving language, other sequence data types may be added. There is also another standard sequence data type: the {\em tuple}. @@ -1209,7 +1235,7 @@ instance: >>> t (12345, 54321, 'hello!') >>> # Tuples may be nested: ->>> u = t, (1, 2, 3, 4, 5) +... u = t, (1, 2, 3, 4, 5) >>> u ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5)) >>> @@ -1227,7 +1253,7 @@ simulate much of the same effect with slicing and concatenation, though). A special problem is the construction of tuples containing 0 or 1 -items: the syntax has some extra quirks to accomodate these. Empty +items: the syntax has some extra quirks to accommodate these. Empty tuples are constructed by an empty pair of parentheses; a tuple with one item is constructed by following a value with a comma (it is not sufficient to enclose a single value in parentheses). @@ -1277,7 +1303,9 @@ Another useful data type built into Python is the {\em dictionary}. 
Dictionaries are sometimes found in other languages as ``associative memories'' or ``associative arrays''. Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by {\em keys}, -which are strings. It is best to think of a dictionary as an unordered set of +which are strings (the use of non-string values as keys +is supported, but beyond the scope of this tutorial). +It is best to think of a dictionary as an unordered set of {\em key:value} pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: \verb/{}/. @@ -1291,7 +1319,7 @@ a key:value pair with {\tt del}. If you store using a key that is already in use, the old value associated with that key is forgotten. It is an error to extract a -value using a non-existant key. +value using a non-existent key. The {\tt keys()} method of a dictionary object returns a list of all the keys used in the dictionary, in random order (if you want it sorted, @@ -1351,14 +1379,17 @@ shortcut operator, when used as a general value and not as a Boolean, is the last evaluated argument. It is possible to assign the result of a comparison or other Boolean -expression to a variable, but you must enclose the entire Boolean -expression in parentheses. This is necessary because otherwise an -assignment like \verb/a = b = c/ would be ambiguous: does it assign the -value of {\tt c} to {\tt a} and {\tt b}, or does it compare {\tt b} to -{\tt c} and assign the outcome (0 or 1) to {\tt a}? As it is, the first -meaning is what you get, and to get the latter you have to write -\verb/a = (b = c)/. (In Python, unlike C, assignment cannot occur -inside expressions.) +expression to a variable. 
For example, + +\bcode\begin{verbatim} +>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance' +>>> non_null = string1 or string2 or string3 +>>> non_null +'Trondheim' +>>> +\end{verbatim}\ecode +% +Note that in Python, unlike C, assignment cannot occur inside expressions. \section{Comparing Sequences and Other Types} @@ -1368,7 +1399,7 @@ first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, -the lexiographical comparison is carried out recursively. If all +the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial subsequence of the other, the shorted sequence is the smaller one. Lexicographical ordering for @@ -1391,9 +1422,9 @@ Thus, a list is always smaller than a string, a string is always smaller than a tuple, etc. Mixed numeric types are compared according to their numeric value, so 0 equals 0.0, etc.% \footnote{ - The rules for comparing objects of different types should - not be relied upon; they may change in a future version of - the language. + The rules for comparing objects of different types should + not be relied upon; they may change in a future version of + the language. } @@ -1418,9 +1449,11 @@ executed at the top level and in calculator mode). A module is a file containing Python definitions and statements. The -file name is the module name with the suffix {\tt .py} appended. For -instance, use your favorite text editor to create a file called {\tt -fibo.py} in the current directory with the following contents: +file name is the module name with the suffix {\tt .py} appended. Within +a module, the module's name (as a string) is available as the value of +the global variable {\tt __name__}. 
For instance, use your favorite text +editor to create a file called {\tt fibo.py} in the current directory +with the following contents: \bcode\begin{verbatim} # Fibonacci numbers module @@ -1460,6 +1493,8 @@ Using the module name you can access the functions: 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 >>> fibo.fib2(100) [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89] +>>> fibo.__name__ +'fibo' >>> \end{verbatim}\ecode % @@ -1481,9 +1516,9 @@ They are executed only the {\em first} time the module is imported somewhere.% \footnote{ - In fact function definitions are also `statements' that are - `executed'; the execution enters the function name in the - module's global symbol table. + In fact function definitions are also `statements' that are + `executed'; the execution enters the function name in the + module's global symbol table. } Each module has its own private symbol table, which is used as the @@ -1586,9 +1621,11 @@ defines. It returns a sorted list of strings: \bcode\begin{verbatim} >>> import fibo, sys >>> dir(fibo) -['fib', 'fib2'] +['__name__', 'fib', 'fib2'] >>> dir(sys) -['argv', 'exit', 'modules', 'path', 'ps1', 'ps2', 'stderr', 'stdin', 'stdout'] +['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit', +'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace', +'stderr', 'stdin', 'stdout', 'version'] >>> \end{verbatim}\ecode % @@ -1599,7 +1636,7 @@ Without arguments, {\tt dir()} lists the names you have defined currently: >>> import fibo, sys >>> fib = fibo.fib >>> dir() -['a', 'fib', 'fibo', 'sys'] +['__name__', 'a', 'fib', 'fibo', 'sys'] >>> \end{verbatim}\ecode % @@ -1612,14 +1649,15 @@ If you want a list of those, they are defined in the standard module \bcode\begin{verbatim} >>> import __builtin__ >>> dir(__builtin__) -['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError', 'I -mportError', 'IndexError', 'KeyError', 'KeyboardInterrupt', 'MemoryError', ' -NameError', 'None', 'OverflowError', 'RuntimeError', 
'SyntaxError', 'SystemE -rror', 'SystemExit', 'TypeError', 'ValueError', 'ZeroDivisionError', 'abs', -'apply', 'chr', 'cmp', 'coerce', 'compile', 'dir', 'divmod', 'eval', 'execfi -le', 'float', 'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'le -n', 'long', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input', - 'reload', 'repr', 'round', 'setattr', 'str', 'type'] +['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError', +'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt', +'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError', +'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError', +'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce', +'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float', +'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long', +'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input', +'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange'] >>> \end{verbatim}\ecode @@ -1638,6 +1676,10 @@ Python is to do all the string handling yourself; using string slicing and concatenation operations you can create any lay-out you can imagine. The standard module {\tt string} contains some useful operations for padding strings to a given column width; these will be discussed shortly. +Finally, the \code{\%} operator (modulo) with a string left argument +interprets this string as a C sprintf format string to be applied to the +right argument, and returns the string resulting from this formatting +operation. One question remains, of course: how do you convert values to strings? Luckily, Python has a way to convert any value to a string: just write @@ -1650,22 +1692,22 @@ the value between reverse quotes (\verb/``/). Some examples: >>> print s The value of x is 31.4, and y is 40000... >>> # Reverse quotes work on other types besides numbers: ->>> p = [x, y] +... 
p = [x, y] >>> ps = `p` >>> ps '[31.4, 40000]' >>> # Converting a string adds string quotes and backslashes: ->>> hello = 'hello, world\n' +... hello = 'hello, world\n' >>> hellos = `hello` >>> print hellos 'hello, world\012' >>> # The argument of reverse quotes may be a tuple: ->>> `x, y, ('foo', 'bar')` -'(31.4, 40000, (\'foo\', \'bar\'))' +... `x, y, ('foo', 'bar')` +"(31.4, 40000, ('foo', 'bar'))" >>> \end{verbatim}\ecode % -Here is how you write a table of squares and cubes: +Here are two ways to write a table of squares and cubes: \bcode\begin{verbatim} >>> import string @@ -1684,6 +1726,19 @@ Here is how you write a table of squares and cubes: 8 64 512 9 81 729 10 100 1000 +>>> for x in range(1,11): +... print '%2d %3d %4d' % (x, x*x, x*x*x) +... + 1 1 1 + 2 4 8 + 3 9 27 + 4 16 64 + 5 25 125 + 6 36 216 + 7 49 343 + 8 64 512 + 9 81 729 +10 100 1000 >>> \end{verbatim}\ecode % @@ -1702,11 +1757,7 @@ a slice operation, as in {\tt string.ljust(x,~n)[0:n]}.) There is another function, {\tt string.zfill}, which pads a numeric string on the left with zeros. It understands about plus and minus -signs:% -\footnote{ - Better facilities for formatting floating point numbers are - lacking at this moment. 
-} +signs: \bcode\begin{verbatim} >>> string.zfill('12', 5) @@ -1733,10 +1784,10 @@ kind of complaint you get while you are still learning Python: \bcode\begin{verbatim} >>> while 1 print 'Hello world' -Parsing error: file , line 1: -while 1 print 'Hello world' - ^ -Unhandled exception: run-time error: syntax error + File "", line 1 + while 1 print 'Hello world' + ^ +SyntaxError: invalid syntax >>> \end{verbatim}\ecode % @@ -1764,11 +1815,11 @@ Traceback (innermost last): File "", line 1 ZeroDivisionError: integer division or modulo >>> 4 + foo*3 -Stack backtrace (innermost last): +Traceback (innermost last): File "", line 1 NameError: foo >>> '2' + 2 -Stack backtrace (innermost last): +Traceback (innermost last): File "", line 1 TypeError: illegal argument type for built-in operation >>> @@ -1910,7 +1961,7 @@ For example: \bcode\begin{verbatim} >>> raise NameError, 'HiThere' -Stack backtrace (innermost last): +Traceback (innermost last): File "", line 1 NameError: HiThere >>> @@ -1932,10 +1983,10 @@ For example: ... except my_exc, val: ... print 'My exception occurred, value:', val ... -My exception occured, value: 4 +My exception occurred, value: 4 >>> raise my_exc, 1 -Stack backtrace (innermost last): - File "", line 7 +Traceback (innermost last): + File "", line 1 my_exc: 1 >>> \end{verbatim}\ecode @@ -1956,7 +2007,7 @@ For example: ... print 'Goodbye, world!' ... Goodbye, world! -Stack backtrace (innermost last): +Traceback (innermost last): File "", line 2 KeyboardInterrupt >>> @@ -1964,7 +2015,7 @@ KeyboardInterrupt % A {\tt finally} clause is executed whether or not an exception has occurred in the {\tt try} clause. When an exception has occurred, it -is re-raised after the {\tt finally} clauses is executed. The +is re-raised after the {\tt finally} clause is executed. The {\tt finally} clause is also executed ``on the way out'' when the {\tt try} statement is left via a {\tt break} or {\tt return} statement. @@ -1988,7 +2039,7 @@ same name. 
Objects can contain an arbitrary amount of private data. In C++ terminology, all class members (including the data members) are {\em public}, and all member functions are {\em virtual}. There are -no special constructors or desctructors. As in Modula-3, there are no +no special constructors or destructors. As in Modula-3, there are no shorthands for referencing the object's members from its methods: the method function is declared with an explicit first argument representing the object, which is provided implicitly by the call. As @@ -1996,9 +2047,9 @@ in Smalltalk, classes themselves are objects, albeit in the wider sense of the word: in Python, all data types are objects. This provides semantics for importing and renaming. But, just like in C++ or Modula-3, built-in types cannot be used as base classes for -extension by the user. Also, like in Modula-3 but unlike in C++, the +extension by the user. Also, like in C++ but unlike in Modula-3, most built-in operators with special syntax (arithmetic operators, -subscriptong etc.) cannot be redefined for class members. +subscripting etc.) can be redefined for class members. \section{A word about terminology} @@ -2022,7 +2073,7 @@ can be bound to the same object. This is known as aliasing in other languages. This is usually not appreciated on a first glance at Python, and can be safely ignored when dealing with immutable basic types (numbers, strings, tuples). However, aliasing has an -(intended!) effect on the semantics of Python code involving mutable +(intended!) effect on the semantics of Python code involving mutable objects such as lists, dictionaries, and most types representing entities outside the program (files, windows, etc.). 
This is usually used to the benefit of the program, since aliases behave like pointers @@ -2065,13 +2116,13 @@ names in modules are attribute references: in the expression be a straightforward mapping between the module's attributes and the global names defined in the module: they share the same name space!% \footnote{ - Except for one thing. Module objects have a secret read-only - attribute called {\tt __dict__} which returns the dictionary - used to implement the module's name space; the name - {\tt __dict__} is an attribute but not a global name. - Obviously, using this violates the abstraction of name space - implementation, and should be restricted to things like - post-mortem debuggers... + Except for one thing. Module objects have a secret read-only + attribute called {\tt __dict__} which returns the dictionary + used to implement the module's name space; the name + {\tt __dict__} is an attribute but not a global name. + Obviously, using this violates the abstraction of name space + implementation, and should be restricted to things like + post-mortem debuggers... } Attributes may be read-only or writable. In the latter case, @@ -2314,7 +2365,7 @@ avoid accidental name conflicts, which may cause hard-to-find bugs in large programs, it is wise to use some kind of convention that minimizes the chance of conflicts, e.g., capitalize method names, prefix data attribute names with a small unique string (perhaps just -an undescore), or use verbs for methods and nouns for data attributes. +an underscore), or use verbs for methods and nouns for data attributes. Data attributes may be referenced by methods as well as by ordinary @@ -2392,8 +2443,9 @@ Methods may call other methods by using method attributes of the The instantiation operation (``calling'' a class object) creates an empty object. Many classes like to create objects in a known initial -state. 
There is no special syntax to enforce this, but a convention -works almost as well: add a method named \verb\init\ to the class, +state. In early versions of Python, there was no special syntax to +enforce this (see below), but a convention was widely used: +add a method named \verb\init\ to the class, which initializes the instance (by assigning to some important data attributes) and returns the instance itself. For example, class \verb\Bag\ above could have the following method: @@ -2411,13 +2463,39 @@ statement, as follows: x = Bag().init() \end{verbatim} -Of course, the \verb\init\ method may have arguments for greater -flexibility. +In later versions of Python, a special method named \verb\__init__\ may be +defined instead: + +\begin{verbatim} + def __init__(self): + self.empty() +\end{verbatim} + +When a class defines an \verb\__init__\ method, class instantiation +automatically invokes \verb\__init__\ for the newly-created class +instance. So in the \verb\Bag\ example, a new and initialized instance +can be obtained by: -Warning: a common mistake is to forget the \verb\return self\ at the -end of an init method! +\begin{verbatim} + x = Bag() +\end{verbatim} +Of course, the \verb\__init__\ method may have arguments for greater +flexibility. In that case, arguments given to the class instantiation +operator are passed on to \verb\__init__\. For example, +\bcode\begin{verbatim} +>>> class Complex: +... def __init__(self, realpart, imagpart): +... self.r = realpart +... self.i = imagpart +... +>>> x = Complex(3.0,-4.5) +>>> x.r, x.i +(3.0, -4.5) +>>> +\end{verbatim}\ecode +% Methods may reference global names in the same way as ordinary functions. The global scope associated with a method is the module containing the class definition. (The class itself is never used as a @@ -2484,7 +2562,7 @@ the base class is defined or imported directly in the global scope.) \subsection{Multiple inheritance} -Poython supports a limited form of multiple inheritance as well. 
A +Python supports a limited form of multiple inheritance as well. A class definition with multiple base classes looks as follows: \begin{verbatim} @@ -2559,7 +2637,422 @@ object of which the method is an instance, and \verb\m.im_func\ is the function object corresponding to the method. -XXX Mention bw compat hacks. +\chapter{Recent Additions} + +Python is an evolving language. Since this tutorial was last +thoroughly revised, several new features have been added to the +language. While ideally I should revise the tutorial to incorporate +them in the mainline of the text, lack of time currently requires me +to take a more modest approach. In this chapter I will briefly list the +most important improvements to the language and how you can use them +to your benefit. + +\section{The Last Printed Expression} + +In interactive mode, the last printed expression is assigned to the +variable \code\_\. This means that when you are using Python as a +desk calculator, it is somewhat easier to continue calculations, for +example: + +\begin{verbatim} + >>> tax = 17.5 / 100 + >>> price = 3.50 + >>> price * tax + 0.6125 + >>> price + _ + 4.1125 + >>> round(_, 2) + 4.11 + >>> +\end{verbatim} + +\section{String Literals} + +\subsection{Double Quotes} + +Python can now also use double quotes to surround string literals, +e.g. \verb\"this doesn't hurt a bit"\. + +\subsection{Continuation Of String Literals} + +String literals can span multiple lines by escaping newlines with +backslashes, e.g. + +\begin{verbatim} + hello = "This is a rather long string containing\n\ + several lines of text just as you would do in C.\n\ + Note that whitespace at the beginning of the line is\ + significant.\n" + print hello +\end{verbatim} + +which would print the following: +\begin{verbatim} + This is a rather long string containing + several lines of text just as you would do in C. + Note that whitespace at the beginning of the line is significant. 
+\end{verbatim} + +\subsection{Triple-quoted strings} + +In some cases, when you need to include really long strings (e.g. +containing several paragraphs of informational text), it is annoying +that you have to terminate each line with \verb@\n\@, especially if +you would like to reformat the text occasionally with a powerful text +editor like Emacs. For such situations, ``triple-quoted'' strings can +be used, e.g. + +\begin{verbatim} + hello = """ + + This string is bounded by triple double quotes (3 times "). + Newlines in the string are retained, though \ + it is still possible\nto use all normal escape sequences. + + Whitespace at the beginning of a line is + significant. If you need to include three opening quotes + you have to escape at least one of them, e.g. \""". + + This string ends in a newline. + """ +\end{verbatim} + +Note that there is no semantic difference between strings quoted with +single quotes (\verb/'/) or double quotes (\verb\"\). + +\subsection{String Literal Juxtaposition} + +One final twist: you can juxtapose multiple string literals. Two or +more adjacent string literals (but not arbitrary expressions!) +separated only by whitespace will be concatenated (without intervening +whitespace) into a single string object at compile time. This makes +it possible to continue a long string on the next line without +sacrificing indentation or performance, unlike the use of the string +concatenation operator \verb\+\ or the continuation of the literal +itself on the next line (since leading whitespace is significant +inside all types of string literals). Note that this feature, like +all string features except triple-quoted strings, is borrowed from +Standard C. + +\section{The Formatting Operator} + +\subsection{Basic Usage} + +The chapter on output formatting is really out of date: there is now +an almost complete interface to C-style printf formats. 
This is done +by overloading the modulo operator (\verb\%\) for a left operand +which is a string, e.g. + +\begin{verbatim} + >>> import math + >>> print 'The value of PI is approximately %5.3f.' % math.pi + The value of PI is approximately 3.142. + >>> +\end{verbatim} + +If there is more than one format in the string you pass a tuple as +right operand, e.g. + +\begin{verbatim} + >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678} + >>> for name, phone in table.items(): + ... print '%-10s ==> %10d' % (name, phone) + ... + Jack ==> 4098 + Dcab ==> 8637678 + Sjoerd ==> 4127 + >>> +\end{verbatim} + +Most formats work exactly as in C and require that you pass the proper +type (however, if you don't you get an exception, not a core dump). +The \verb\%s\ format is more relaxed: if the corresponding argument is +not a string object, it is converted to string using the \verb\str()\ +built-in function. Using \verb\*\ to pass the width or precision in +as a separate (integer) argument is supported. The C formats +\verb\%n\ and \verb\%p\ are not supported. + +\subsection{Referencing Variables By Name} + +If you have a really long format string that you don't want to split +up, it would be nice if you could reference the variables to be +formatted by name instead of by position. This can be done by using +an extension of C formats using the form \verb\%(name)format\, e.g. + +\begin{verbatim} + >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678} + >>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table + Jack: 4098; Sjoerd: 4127; Dcab: 8637678 + >>> +\end{verbatim} + +This is particularly useful in combination with the new built-in +\verb\vars()\ function, which returns a dictionary containing all +local variables. + +\section{Optional Function Arguments} + +It is now possible to define functions with a variable number of +arguments. There are two forms, which can be combined. 
+ +\subsection{Default Argument Values} + +The most useful form is to specify a default value for one or more +arguments. This creates a function that can be called with fewer +arguments than it is defined, e.g. + +\begin{verbatim} + def ask_ok(prompt, retries = 4, complaint = 'Yes or no, please!'): + while 1: + ok = raw_input(prompt) + if ok in ('y', 'ye', 'yes'): return 1 + if ok in ('n', 'no', 'nop', 'nope'): return 0 + retries = retries - 1 + if retries < 0: raise IOError, 'refusenik user' + print complaint +\end{verbatim} + +This function can be called either like this: +\verb\ask_ok('Do you really want to quit?')\ or like this: +\verb\ask_ok('OK to overwrite the file?', 2)\. + +The default values are evaluated at the point of function definition +in the {\em defining} scope, so that e.g. + +\begin{verbatim} + i = 5 + def f(arg = i): print arg + i = 6 + f() +\end{verbatim} + +will print \verb\5\. + +\subsection{Arbitrary Argument Lists} + +It is also possible to specify that a function can be called with an +arbitrary number of arguments. These arguments will be wrapped up in +a tuple. Before the variable number of arguments, zero or more normal +arguments may occur, e.g. + +\begin{verbatim} + def fprintf(file, format, *args): + file.write(format % args) +\end{verbatim} + +This feature may be combined with the previous, e.g. + +\begin{verbatim} + def but_is_it_useful(required, optional = None, *remains): + print "I don't know" +\end{verbatim} + +\section{Lambda And Functional Programming Tools} + +\subsection{Lambda Forms} + +On popular demand, a few features commonly found in functional +programming languages and Lisp have been added to Python. With the +\verb\lambda\ keyword, small anonymous functions can be created. +Here's a function that returns the sum of its two arguments: +\verb\lambda a, b: a+b\. Lambda forms can be used wherever function +objects are required. They are syntactically restricted to a single +expression. 
Semantically, they are just syntactic sugar for a normal +function definition. Like nested function definitions, lambda forms +cannot reference variables from the containing scope, but this can be +overcome through the judicious use of default argument values, e.g. + +\begin{verbatim} + def make_incrementor(n): + return lambda x, incr=n: x+incr +\end{verbatim} + +\subsection{Map, Reduce and Filter} + +Three new built-in functions on sequences are good candidates to pass +lambda forms to. + +\subsubsection{Map.} + +\verb\map(function, sequence)\ calls \verb\function(item)\ for each of +the sequence's items and returns a list of the return values. For +example, to compute some cubes: + +\begin{verbatim} + >>> map(lambda x: x*x*x, range(1, 11)) + [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000] + >>> +\end{verbatim} + +More than one sequence may be passed; the function must then have as +many arguments as there are sequences and is called with the +corresponding item from each sequence (or \verb\None\ if some sequence +is shorter than another). If \verb\None\ is passed for the function, +a function returning its argument(s) is substituted. + +Combining these two special cases, we see that +\verb\map(None, list1, list2)\ is a convenient way of turning a pair +of lists into a list of pairs. For example: + +\begin{verbatim} + >>> seq = range(8) + >>> map(None, seq, map(lambda x: x*x, seq)) + [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)] + >>> +\end{verbatim} + +\subsubsection{Filter.} + +\verb\filter(function, sequence)\ returns a sequence (of the same +type, if possible) consisting of those items from the sequence for +which \verb\function(item)\ is true. 
For example, to compute some +primes: + +\begin{verbatim} + >>> filter(lambda x: x%2 != 0 and x%3 != 0, range(2, 25)) + [5, 7, 11, 13, 17, 19, 23] + >>> +\end{verbatim} + +\subsubsection{Reduce.} + +\verb\reduce(function, sequence)\ returns a single value constructed +by calling the (binary) function on the first two items of the +sequence, then on the result and the next item, and so on. For +example, to compute the sum of the numbers 1 through 10: + +\begin{verbatim} + >>> reduce(lambda x, y: x+y, range(1, 11)) + 55 + >>> +\end{verbatim} + +If there's only one item in the sequence, its value is returned; if +the sequence is empty, an exception is raised. + +A third argument can be passed to indicate the starting value. In this +case the starting value is returned for an empty sequence, and the +function is first applied to the starting value and the first sequence +item, then to the result and the next item, and so on. For example, + +\begin{verbatim} + >>> def sum(seq): + ... return reduce(lambda x, y: x+y, seq, 0) + ... + >>> sum(range(1, 11)) + 55 + >>> sum([]) + 0 + >>> +\end{verbatim} + +\section{Continuation Lines Without Backslashes} +While the general mechanism for continuation of a source line on the +next physical line remains to place a backslash on the end of the +line, expressions inside matched parentheses (or square brackets, or +curly braces) can now also be continued without using a backslash. +This is particularly useful for calls to functions with many +arguments, and for initializations of large tables. 
+ +For example: + +\begin{verbatim} + month_names = ['Januari', 'Februari', 'Maart', + 'April', 'Mei', 'Juni', + 'Juli', 'Augustus', 'September', + 'Oktober', 'November', 'December'] +\end{verbatim} + +and + +\begin{verbatim} + CopyInternalHyperLinks(self.context.hyperlinks, + copy.context.hyperlinks, + uidremap) +\end{verbatim} + +\section{Regular Expressions} + +While C's printf-style output formats, transformed into Python, are +adequate for most output formatting jobs, C's scanf-style input +formats are not very powerful. Instead of scanf-style input, Python +offers Emacs-style regular expressions as a powerful input and +scanning mechanism. Read the corresponding section in the Library +Reference for a full description. + +\section{Generalized Dictionaries} + +The keys of dictionaries are no longer restricted to strings -- they +can be numbers, tuples, or (certain) class instances. (Lists and +dictionaries are not acceptable as dictionary keys, in order to avoid +problems when the object used as a key is modified.) + +Dictionaries have two new methods: \verb\d.values()\ returns a list of +the dictionary's values, and \verb\d.items()\ returns a list of the +dictionary's (key, value) pairs. Like \verb\d.keys()\, these +operations are slow for large dictionaries. Examples: + +\begin{verbatim} + >>> d = {100: 'honderd', 1000: 'duizend', 10: 'tien'} + >>> d.keys() + [100, 10, 1000] + >>> d.values() + ['honderd', 'tien', 'duizend'] + >>> d.items() + [(100, 'honderd'), (10, 'tien'), (1000, 'duizend')] + >>> +\end{verbatim} + +\section{Miscellaneous New Built-in Functions} + +The function \verb\vars()\ returns a dictionary containing the current +local variables. With a module as argument, it returns that module's +global variables. The old function \verb\dir(x)\ returns +\verb\vars(x).keys()\. + +The function \verb\round(x)\ returns a floating point number rounded +to the nearest integer (but still expressed as a floating point +number). E.g. 
\verb\round(3.4) == 3.0\ and \verb\round(3.5) == 4.0\. +With a second argument it rounds to the specified number of digits, +e.g. \verb\round(math.pi, 4) == 3.1416\ or even +\verb\round(123.4, -2) == 100.0\. + +The function \verb\hash(x)\ returns a hash value for an object. +All object types acceptable as dictionary keys have a hash value (and +it is this hash value that the dictionary implementation uses). + +The function \verb\id(x)\ returns a unique identifier for an object. +For two objects x and y, \verb\id(x) == id(y)\ if and only if +\verb\x is y\. (In fact the object's address is used.) + +The function \verb\hasattr(x, name)\ returns whether an object has an +attribute with the given name (a string value). The function +\verb\getattr(x, name)\ returns the object's attribute with the given +name. The function \verb\setattr(x, name, value)\ assigns a value to +an object's attribute with the given name. These three functions are +useful if the attribute names are not known beforehand. Note that +\verb\getattr(x, 'foo')\ is equivalent to \verb\x.foo\, and +\verb\setattr(x, 'foo', y)\ is equivalent to \verb\x.foo = y\. By +definition, \verb\hasattr(x, name)\ returns true if and only if +\verb\getattr(x, name)\ returns without raising an exception. + +\section{Else Clause For Try Statement} + +The \verb\try...except\ statement now has an optional \verb\else\ +clause, which must follow all \verb\except\ clauses. It is useful for +code that must be executed if the \verb\try\ clause does not +raise an exception. For example: + +\begin{verbatim} + for arg in sys.argv: + try: + f = open(arg, 'r') + except IOError: + print 'cannot open', arg + else: + print arg, 'has', len(f.readlines()), 'lines' + f.close() +\end{verbatim} \end{document} -- cgit v0.12