author | Georg Brandl <georg@python.org> | 2007-08-15 14:28:22 (GMT) |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2007-08-15 14:28:22 (GMT) |
commit | 116aa62bf54a39697e25f21d6cf6799f7faa1349 (patch) | |
tree | 8db5729518ed4ca88e26f1e26cc8695151ca3eb3 /Doc/library | |
parent | 739c01d47b9118d04e5722333f0e6b4d0c8bdd9e (diff) | |
Move the 3k reST doc tree in place.
Diffstat (limited to 'Doc/library')
286 files changed, 78473 insertions, 0 deletions
diff --git a/Doc/library/__builtin__.rst b/Doc/library/__builtin__.rst new file mode 100644 index 0000000..b3e1e11 --- /dev/null +++ b/Doc/library/__builtin__.rst @@ -0,0 +1,41 @@ + +:mod:`__builtin__` --- Built-in objects +======================================= + +.. module:: __builtin__ + :synopsis: The module that provides the built-in namespace. + + +This module provides direct access to all 'built-in' identifiers of Python; for +example, ``__builtin__.open`` is the full name for the built-in function +:func:`open`. See chapter :ref:`builtin`. + +This module is not normally accessed explicitly by most applications, but can be +useful in modules that provide objects with the same name as a built-in value, +but in which the built-in of that name is also needed. For example, in a module +that wants to implement an :func:`open` function that wraps the built-in +:func:`open`, this module can be used directly:: + + import __builtin__ + + def open(path): + f = __builtin__.open(path, 'r') + return UpperCaser(f) + + class UpperCaser: + '''Wrapper around a file that converts output to upper-case.''' + + def __init__(self, f): + self._f = f + + def read(self, count=-1): + return self._f.read(count).upper() + + # ... + +As an implementation detail, most modules have the name ``__builtins__`` (note +the ``'s'``) made available as part of their globals. The value of +``__builtins__`` is normally either this module or the value of this module's +:attr:`__dict__` attribute. Since this is an implementation detail, it may not +be used by alternate implementations of Python. + diff --git a/Doc/library/__future__.rst b/Doc/library/__future__.rst new file mode 100644 index 0000000..6bf2830 --- /dev/null +++ b/Doc/library/__future__.rst @@ -0,0 +1,61 @@ + +:mod:`__future__` --- Future statement definitions +================================================== + +.. 
module:: __future__ + :synopsis: Future statement definitions + + +:mod:`__future__` is a real module, and serves three purposes: + +* To avoid confusing existing tools that analyze import statements and expect to + find the modules they're importing. + +* To ensure that future_statements run under releases prior to 2.1 at least + yield runtime exceptions (the import of :mod:`__future__` will fail, because + there was no module of that name prior to 2.1). + +* To document when incompatible changes were introduced, and when they will be + --- or were --- made mandatory. This is a form of executable documentation, and + can be inspected programmatically via importing :mod:`__future__` and examining + its contents. + +Each statement in :file:`__future__.py` is of the form:: + + FeatureName = "_Feature(" OptionalRelease "," MandatoryRelease "," + CompilerFlag ")" + + +where, normally, *OptionalRelease* is less than *MandatoryRelease*, and both are +5-tuples of the same form as ``sys.version_info``:: + + (PY_MAJOR_VERSION, # the 2 in 2.1.0a3; an int + PY_MINOR_VERSION, # the 1; an int + PY_MICRO_VERSION, # the 0; an int + PY_RELEASE_LEVEL, # "alpha", "beta", "candidate" or "final"; string + PY_RELEASE_SERIAL # the 3; an int + ) + +*OptionalRelease* records the first release in which the feature was accepted. + +In the case of a *MandatoryRelease* that has not yet occurred, +*MandatoryRelease* predicts the release in which the feature will become part of +the language. + +Otherwise, *MandatoryRelease* records when the feature became part of the language; in +releases at or after that, modules no longer need a future statement to use the +feature in question, but may continue to use such imports. + +*MandatoryRelease* may also be ``None``, meaning that a planned feature got +dropped. + +Instances of class :class:`_Feature` have two corresponding methods, +:meth:`getOptionalRelease` and :meth:`getMandatoryRelease`. 
+ +*CompilerFlag* is the (bitfield) flag that should be passed in the fourth +argument to the builtin function :func:`compile` to enable the feature in +dynamically compiled code. This flag is stored in the :attr:`compiler_flag` +attribute on :class:`_Feature` instances. + +No feature description will ever be deleted from :mod:`__future__`. + diff --git a/Doc/library/__main__.rst b/Doc/library/__main__.rst new file mode 100644 index 0000000..a1d3c24 --- /dev/null +++ b/Doc/library/__main__.rst @@ -0,0 +1,17 @@ + +:mod:`__main__` --- Top-level script environment +================================================ + +.. module:: __main__ + :synopsis: The environment where the top-level script is run. + + +This module represents the (otherwise anonymous) scope in which the +interpreter's main program executes --- commands read either from standard +input, from a script file, or from an interactive prompt. It is this +environment in which the idiomatic "conditional script" stanza causes a script +to run:: + + if __name__ == "__main__": + main() + diff --git a/Doc/library/_ast.rst b/Doc/library/_ast.rst new file mode 100644 index 0000000..9b195be --- /dev/null +++ b/Doc/library/_ast.rst @@ -0,0 +1,59 @@ +.. _ast: + +Abstract Syntax Trees +===================== + +.. module:: _ast + :synopsis: Abstract Syntax Tree classes. + +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. versionadded:: 2.5 + +The ``_ast`` module helps Python applications to process trees of the Python +abstract syntax grammar. The Python compiler currently provides read-only access +to such trees, meaning that applications can only create a tree for a given +piece of Python source code; generating byte code from a (potentially modified) +tree is not supported. The abstract syntax itself might change with each Python +release; this module helps to find out programmatically what the current grammar +looks like. 
+ +An abstract syntax tree can be generated by passing ``_ast.PyCF_ONLY_AST`` as a +flag to the :func:`compile` builtin function. The result will be a tree of +objects whose classes all inherit from ``_ast.AST``. + +The actual classes are derived from the ``Parser/Python.asdl`` file, which is +reproduced below. There is one class defined for each left-hand side symbol in +the abstract grammar (for example, ``_ast.stmt`` or ``_ast.expr``). In addition, +there is one class defined for each constructor on the right-hand side; these +classes inherit from the classes for the left-hand side trees. For example, +``_ast.BinOp`` inherits from ``_ast.expr``. For production rules with +alternatives (aka "sums"), the left-hand side class is abstract: only instances +of specific constructor nodes are ever created. + +Each concrete class has an attribute ``_fields`` which gives the names of all +child nodes. + +Each instance of a concrete class has one attribute for each child node, of the +type as defined in the grammar. For example, ``_ast.BinOp`` instances have an +attribute ``left`` of type ``_ast.expr``. Instances of ``_ast.expr`` and +``_ast.stmt`` subclasses also have lineno and col_offset attributes. The lineno +is the line number of source text (1 indexed so the first line is line 1) and +the col_offset is the utf8 byte offset of the first token that generated the +node. The utf8 offset is recorded because the parser uses utf8 internally. + +If these attributes are marked as optional in the grammar (using a question +mark), the value might be ``None``. If the attributes can have zero-or-more +values (marked with an asterisk), the values are represented as Python lists. + + +Abstract Grammar +---------------- + +The module defines a string constant ``__version__`` which is the decimal +subversion revision number of the file shown below. + +The abstract grammar is currently defined as follows: + +.. 
literalinclude:: ../../Parser/Python.asdl diff --git a/Doc/library/_winreg.rst b/Doc/library/_winreg.rst new file mode 100644 index 0000000..fddbfd1 --- /dev/null +++ b/Doc/library/_winreg.rst @@ -0,0 +1,420 @@ + +:mod:`_winreg` -- Windows registry access +========================================= + +.. module:: _winreg + :platform: Windows + :synopsis: Routines and objects for manipulating the Windows registry. +.. sectionauthor:: Mark Hammond <MarkH@ActiveState.com> + + +.. versionadded:: 2.0 + +These functions expose the Windows registry API to Python. Instead of using an +integer as the registry handle, a handle object is used to ensure that the +handles are closed correctly, even if the programmer neglects to explicitly +close them. + +This module exposes a very low-level interface to the Windows registry; it is +expected that in the future a new ``winreg`` module will be created offering a +higher-level interface to the registry API. + +This module offers the following functions: + + +.. function:: CloseKey(hkey) + + Closes a previously opened registry key. The hkey argument specifies a + previously opened key. + + Note that if *hkey* is not closed using this method (or via + :meth:`handle.Close`), it is closed when the *hkey* object is destroyed by + Python. + + +.. function:: ConnectRegistry(computer_name, key) + + Establishes a connection to a predefined registry handle on another computer, + and returns a :dfn:`handle object` + + *computer_name* is the name of the remote computer, of the form + ``r"\\computername"``. If ``None``, the local computer is used. + + *key* is the predefined handle to connect to. + + The return value is the handle of the opened key. If the function fails, an + :exc:`EnvironmentError` exception is raised. + + +.. function:: CreateKey(key, sub_key) + + Creates or opens the specified key, returning a :dfn:`handle object` + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. 
+ + *sub_key* is a string that names the key this method opens or creates. + + If *key* is one of the predefined keys, *sub_key* may be ``None``. In that + case, the handle returned is the same key handle passed in to the function. + + If the key already exists, this function opens the existing key. + + The return value is the handle of the opened key. If the function fails, an + :exc:`EnvironmentError` exception is raised. + + +.. function:: DeleteKey(key, sub_key) + + Deletes the specified key. + + *key* is an already open key, or any one of the predefined :const:`HKEY_\*` + constants. + + *sub_key* is a string that must be a subkey of the key identified by the *key* + parameter. This value must not be ``None``, and the key may not have subkeys. + + *This method cannot delete keys with subkeys.* + + If the method succeeds, the entire key, including all of its values, is removed. + If the method fails, an :exc:`EnvironmentError` exception is raised. + + +.. function:: DeleteValue(key, value) + + Removes a named value from a registry key. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + *value* is a string that identifies the value to remove. + + +.. function:: EnumKey(key, index) + + Enumerates subkeys of an open registry key, returning a string. + + *key* is an already open key, or any one of the predefined :const:`HKEY_\*` + constants. + + *index* is an integer that identifies the index of the key to retrieve. + + The function retrieves the name of one subkey each time it is called. It is + typically called repeatedly until an :exc:`EnvironmentError` exception is + raised, indicating that no more subkeys are available. + + +.. function:: EnumValue(key, index) + + Enumerates values of an open registry key, returning a tuple. + + *key* is an already open key, or any one of the predefined :const:`HKEY_\*` + constants. + + *index* is an integer that identifies the index of the value to retrieve. 
+ + The function retrieves one value (its name, data, and type) each time it is called. It is + typically called repeatedly, until an :exc:`EnvironmentError` exception is + raised, indicating no more values. + + The result is a tuple of 3 items: + + +-------+--------------------------------------------+ + | Index | Meaning | + +=======+============================================+ + | ``0`` | A string that identifies the value name | + +-------+--------------------------------------------+ + | ``1`` | An object that holds the value data, and | + | | whose type depends on the underlying | + | | registry type | + +-------+--------------------------------------------+ + | ``2`` | An integer that identifies the type of the | + | | value data | + +-------+--------------------------------------------+ + + +.. function:: FlushKey(key) + + Writes all the attributes of a key to the registry. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + It is not necessary to call :func:`FlushKey` to change a key. Registry changes are + flushed to disk by the registry using its lazy flusher. Registry changes are + also flushed to disk at system shutdown. Unlike :func:`CloseKey`, the + :func:`FlushKey` method returns only when all the data has been written to the + registry. An application should only call :func:`FlushKey` if it requires + absolute certainty that registry changes are on disk. + + .. note:: + + If you don't know whether a :func:`FlushKey` call is required, it probably + isn't. + + +.. function:: LoadKey(key, sub_key, file_name) + + Creates a subkey under the specified key and stores registration information + from a specified file into that subkey. + + *key* is an already open key, or any of the predefined :const:`HKEY_\*` + constants. + + *sub_key* is a string that identifies the sub_key to load. + + *file_name* is the name of the file to load registry data from. This file must + have been created with the :func:`SaveKey` function. 
Under the file allocation + table (FAT) file system, the filename may not have an extension. + + A call to :func:`LoadKey` fails if the calling process does not have the + :const:`SE_RESTORE_PRIVILEGE` privilege. Note that privileges are different from + permissions - see the Win32 documentation for more details. + + If *key* is a handle returned by :func:`ConnectRegistry`, then the path + specified in *file_name* is relative to the remote computer. + + The Win32 documentation implies *key* must be in the :const:`HKEY_USER` or + :const:`HKEY_LOCAL_MACHINE` tree. This may or may not be true. + + +.. function:: OpenKey(key, sub_key[, res=0][, sam=KEY_READ]) + + Opens the specified key, returning a :dfn:`handle object`. + + *key* is an already open key, or any one of the predefined :const:`HKEY_\*` + constants. + + *sub_key* is a string that identifies the sub_key to open. + + *res* is a reserved integer, and must be zero. The default is zero. + + *sam* is an integer that specifies an access mask that describes the desired + security access for the key. The default is :const:`KEY_READ`. + + The result is a new handle to the specified key. + + If the function fails, :exc:`EnvironmentError` is raised. + + +.. function:: OpenKeyEx() + + The functionality of :func:`OpenKeyEx` is provided via :func:`OpenKey`, by the + use of default arguments. + + +.. function:: QueryInfoKey(key) + + Returns information about a key, as a tuple. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + The result is a tuple of 3 items: + + +-------+---------------------------------------------+ + | Index | Meaning | + +=======+=============================================+ + | ``0`` | An integer giving the number of sub keys | + | | this key has. | + +-------+---------------------------------------------+ + | ``1`` | An integer giving the number of values this | + | | key has. 
| + +-------+---------------------------------------------+ + | ``2`` | A long integer giving when the key was last | + | | modified (if available) as 100's of | + | | nanoseconds since Jan 1, 1601. | + +-------+---------------------------------------------+ + + +.. function:: QueryValue(key, sub_key) + + Retrieves the unnamed value for a key, as a string. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + *sub_key* is a string that holds the name of the subkey with which the value is + associated. If this parameter is ``None`` or empty, the function retrieves the + value set by the :func:`SetValue` method for the key identified by *key*. + + Values in the registry have name, type, and data components. This method + retrieves the data for a key's first value that has a NULL name. But the + underlying API call doesn't return the type, so use :func:`QueryValueEx` instead where possible. + + +.. function:: QueryValueEx(key, value_name) + + Retrieves the type and data for a specified value name associated with an open + registry key. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + *value_name* is a string indicating the value to query. + + The result is a tuple of 2 items: + + +-------+-----------------------------------------+ + | Index | Meaning | + +=======+=========================================+ + | ``0`` | The value of the registry item. | + +-------+-----------------------------------------+ + | ``1`` | An integer giving the registry type for | + | | this value. | + +-------+-----------------------------------------+ + + +.. function:: SaveKey(key, file_name) + + Saves the specified key, and all its subkeys to the specified file. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + *file_name* is the name of the file to save registry data to. This file cannot + already exist. 
If this filename includes an extension, it cannot be used on file + allocation table (FAT) file systems by the :meth:`LoadKey`, :meth:`ReplaceKey` + or :meth:`RestoreKey` methods. + + If *key* represents a key on a remote computer, the path described by + *file_name* is relative to the remote computer. The caller of this method must + possess the :const:`SeBackupPrivilege` security privilege. Note that + privileges are different than permissions - see the Win32 documentation for + more details. + + This function passes NULL for *security_attributes* to the API. + + +.. function:: SetValue(key, sub_key, type, value) + + Associates a value with a specified key. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + *sub_key* is a string that names the subkey with which the value is associated. + + *type* is an integer that specifies the type of the data. Currently this must be + :const:`REG_SZ`, meaning only strings are supported. Use the :func:`SetValueEx` + function for support for other data types. + + *value* is a string that specifies the new value. + + If the key specified by the *sub_key* parameter does not exist, the SetValue + function creates it. + + Value lengths are limited by available memory. Long values (more than 2048 + bytes) should be stored as files with the filenames stored in the configuration + registry. This helps the registry perform efficiently. + + The key identified by the *key* parameter must have been opened with + :const:`KEY_SET_VALUE` access. + + +.. function:: SetValueEx(key, value_name, reserved, type, value) + + Stores data in the value field of an open registry key. + + *key* is an already open key, or one of the predefined :const:`HKEY_\*` + constants. + + *value_name* is a string that names the subkey with which the value is + associated. + + *type* is an integer that specifies the type of the data. 
This should be one + of the following constants defined in this module: + + +----------------------------------+---------------------------------------------+ + | Constant | Meaning | + +==================================+=============================================+ + | :const:`REG_BINARY` | Binary data in any form. | + +----------------------------------+---------------------------------------------+ + | :const:`REG_DWORD` | A 32-bit number. | + +----------------------------------+---------------------------------------------+ + | :const:`REG_DWORD_LITTLE_ENDIAN` | A 32-bit number in little-endian format. | + +----------------------------------+---------------------------------------------+ + | :const:`REG_DWORD_BIG_ENDIAN` | A 32-bit number in big-endian format. | + +----------------------------------+---------------------------------------------+ + | :const:`REG_EXPAND_SZ` | Null-terminated string containing | + | | references to environment variables | + | | (``%PATH%``). | + +----------------------------------+---------------------------------------------+ + | :const:`REG_LINK` | A Unicode symbolic link. | + +----------------------------------+---------------------------------------------+ + | :const:`REG_MULTI_SZ` | A sequence of null-terminated strings, | + | | terminated by two null characters. (Python | + | | handles this termination automatically.) | + +----------------------------------+---------------------------------------------+ + | :const:`REG_NONE` | No defined value type. | + +----------------------------------+---------------------------------------------+ + | :const:`REG_RESOURCE_LIST` | A device-driver resource list. | + +----------------------------------+---------------------------------------------+ + | :const:`REG_SZ` | A null-terminated string. | + +----------------------------------+---------------------------------------------+ + + *reserved* can be anything - zero is always passed to the API. 
+ + *value* is a string that specifies the new value. + + This method can also set additional value and type information for the specified + key. The key identified by the key parameter must have been opened with + :const:`KEY_SET_VALUE` access. + + To open the key, use the :func:`CreateKey` or :func:`OpenKey` methods. + + Value lengths are limited by available memory. Long values (more than 2048 + bytes) should be stored as files with the filenames stored in the configuration + registry. This helps the registry perform efficiently. + + +.. _handle-object: + +Registry Handle Objects +----------------------- + +This object wraps a Windows HKEY object, automatically closing it when the +object is destroyed. To guarantee cleanup, you can call either the +:meth:`Close` method on the object, or the :func:`CloseKey` function. + +All registry functions in this module return one of these objects. + +All registry functions in this module which accept a handle object also accept +an integer; however, use of the handle object is encouraged. + +Handle objects provide semantics for :meth:`__bool__` - thus :: + + if handle: + print("Yes") + +will print ``Yes`` if the handle is currently valid (has not been closed or +detached). + +The object also supports comparison semantics, so handle objects will compare +true if they both reference the same underlying Windows handle value. + +Handle objects can be converted to an integer (e.g., using the builtin +:func:`int` function), in which case the underlying Windows handle value is +returned. You can also use the :meth:`Detach` method to return the integer +handle, and also disconnect the Windows handle from the handle object. + + +.. method:: PyHKEY.Close() + + Closes the underlying Windows handle. + + If the handle is already closed, no error is raised. + + +.. method:: PyHKEY.Detach() + + Detaches the Windows handle from the handle object. 
+ + The result is an integer (or long on 64 bit Windows) that holds the value of the + handle before it is detached. If the handle is already detached or closed, this + will return zero. + + After calling this function, the handle is effectively invalidated, but the + handle is not closed. You would call this function when you need the + underlying Win32 handle to exist beyond the lifetime of the handle object. + diff --git a/Doc/library/aepack.rst b/Doc/library/aepack.rst new file mode 100644 index 0000000..7eaffd8 --- /dev/null +++ b/Doc/library/aepack.rst @@ -0,0 +1,92 @@ + +:mod:`aepack` --- Conversion between Python variables and AppleEvent data containers +==================================================================================== + +.. module:: aepack + :platform: Mac + :synopsis: Conversion between Python variables and AppleEvent data containers. +.. sectionauthor:: Vincent Marchetti <vincem@en.com> + + +.. % \moduleauthor{Jack Jansen?}{email} + +The :mod:`aepack` module defines functions for converting (packing) Python +variables to AppleEvent descriptors and back (unpacking). Within Python the +AppleEvent descriptor is handled by Python objects of built-in type +:class:`AEDesc`, defined in module :mod:`Carbon.AE`. + +The :mod:`aepack` module defines the following functions: + + +.. function:: pack(x[, forcetype]) + + Returns an :class:`AEDesc` object containing a conversion of Python value x. If + *forcetype* is provided it specifies the descriptor type of the result. 
+ Otherwise, a default mapping of Python types to Apple Event descriptor types is + used, as follows: + + +-----------------+-----------------------------------+ + | Python type | descriptor type | + +=================+===================================+ + | :class:`FSSpec` | typeFSS | + +-----------------+-----------------------------------+ + | :class:`FSRef` | typeFSRef | + +-----------------+-----------------------------------+ + | :class:`Alias` | typeAlias | + +-----------------+-----------------------------------+ + | integer | typeLong (32 bit integer) | + +-----------------+-----------------------------------+ + | float | typeFloat (64 bit floating point) | + +-----------------+-----------------------------------+ + | string | typeText | + +-----------------+-----------------------------------+ + | unicode | typeUnicodeText | + +-----------------+-----------------------------------+ + | list | typeAEList | + +-----------------+-----------------------------------+ + | dictionary | typeAERecord | + +-----------------+-----------------------------------+ + | instance | *see below* | + +-----------------+-----------------------------------+ + + If *x* is a Python instance then this function attempts to call an + :meth:`__aepack__` method. This method should return an :class:`AEDesc` object. + + If the conversion *x* is not defined above, this function returns the Python + string representation of a value (the repr() function) encoded as a text + descriptor. + + +.. function:: unpack(x[, formodulename]) + + *x* must be an object of type :class:`AEDesc`. This function returns a Python + object representation of the data in the Apple Event descriptor *x*. Simple + AppleEvent data types (integer, text, float) are returned as their obvious + Python counterparts. Apple Event lists are returned as Python lists, and the + list elements are recursively unpacked. Object references (ex. 
``line 3 of + document 1``) are returned as instances of :class:`aetypes.ObjectSpecifier`, + unless ``formodulename`` is specified. AppleEvent descriptors with descriptor + type typeFSS are returned as :class:`FSSpec` objects. AppleEvent record + descriptors are returned as Python dictionaries, with 4-character string keys + and elements recursively unpacked. + + The optional ``formodulename`` argument is used by the stub packages generated + by :mod:`gensuitemodule`, and ensures that the OSA classes for object specifiers + are looked up in the correct module. This ensures that if, say, the Finder + returns an object specifier for a window you get an instance of + ``Finder.Window`` and not a generic ``aetypes.Window``. The former knows about + all the properties and elements a window has in the Finder, while the latter + knows no such things. + + +.. seealso:: + + Module :mod:`Carbon.AE` + Built-in access to Apple Event Manager routines. + + Module :mod:`aetypes` + Python definitions of codes for Apple Event descriptor types. + + ` Inside Macintosh: Interapplication Communication <http://developer.apple.com/techpubs/mac/IAC/IAC-2.html>`_ + Information about inter-process communications on the Macintosh. + diff --git a/Doc/library/aetools.rst b/Doc/library/aetools.rst new file mode 100644 index 0000000..b5fd4ad --- /dev/null +++ b/Doc/library/aetools.rst @@ -0,0 +1,86 @@ + +:mod:`aetools` --- OSA client support +===================================== + +.. module:: aetools + :platform: Mac + :synopsis: Basic support for sending Apple Events +.. sectionauthor:: Jack Jansen <Jack.Jansen@cwi.nl> + + +.. % \moduleauthor{Jack Jansen?}{email} + +The :mod:`aetools` module contains the basic functionality on which Python +AppleScript client support is built. It also imports and re-exports the core +functionality of the :mod:`aetypes` and :mod:`aepack` modules. 
The stub packages +generated by :mod:`gensuitemodule` import the relevant portions of +:mod:`aetools`, so usually you do not need to import it yourself. The exception +to this is when you cannot use a generated suite package and need lower-level +access to scripting. + +The :mod:`aetools` module itself uses the AppleEvent support provided by the +:mod:`Carbon.AE` module. This has one drawback: you need access to the window +manager, see section :ref:`osx-gui-scripts` for details. This restriction may be +lifted in future releases. + +The :mod:`aetools` module defines the following functions: + + +.. function:: packevent(ae, parameters, attributes) + + Stores parameters and attributes in a pre-created ``Carbon.AE.AEDesc`` object. + ``parameters`` and ``attributes`` are dictionaries mapping 4-character OSA + parameter keys to Python objects. The objects are packed using + ``aepack.pack()``. + + +.. function:: unpackevent(ae[, formodulename]) + + Recursively unpacks a ``Carbon.AE.AEDesc`` event to Python objects. The function + returns the parameter dictionary and the attribute dictionary. The + ``formodulename`` argument is used by generated stub packages to control where + AppleScript classes are looked up. + + +.. function:: keysubst(arguments, keydict) + + Converts a Python keyword argument dictionary ``arguments`` to the format + required by ``packevent`` by replacing the keys, which are Python identifiers, + by the four-character OSA keys according to the mapping specified in + ``keydict``. Used by the generated suite packages. + + +.. function:: enumsubst(arguments, key, edict) + + If the ``arguments`` dictionary contains an entry for ``key`` convert the value + for that entry according to dictionary ``edict``. This converts human-readable + Python enumeration names to the OSA 4-character codes. Used by the generated + suite packages. + +The :mod:`aetools` module defines the following class: + + +.. 
class:: TalkTo([signature=None, start=0, timeout=0]) + + Base class for the proxy used to talk to an application. ``signature`` overrides + the class attribute ``_signature`` (which is usually set by subclasses) and is + the 4-char creator code defining the application to talk to. ``start`` can be + set to true to enable running the application on class instantiation. + ``timeout`` can be specified to change the default timeout used while waiting + for an AppleEvent reply. + + +.. method:: TalkTo._start() + + Test whether the application is running, and attempt to start it if not. + + +.. method:: TalkTo.send(code, subcode[, parameters, attributes]) + + Create the AppleEvent ``Carbon.AE.AEDesc`` for the verb with the OSA designation + ``code, subcode`` (which are the usual 4-character strings), pack the + ``parameters`` and ``attributes`` into it, send it to the target application, + wait for the reply, unpack the reply with ``unpackevent`` and return the reply + appleevent, the unpacked return values as a dictionary and the return + attributes. + diff --git a/Doc/library/aetypes.rst b/Doc/library/aetypes.rst new file mode 100644 index 0000000..0dd0a88 --- /dev/null +++ b/Doc/library/aetypes.rst @@ -0,0 +1,150 @@ + +:mod:`aetypes` --- AppleEvent objects +===================================== + +.. module:: aetypes + :platform: Mac + :synopsis: Python representation of the Apple Event Object Model. +.. sectionauthor:: Vincent Marchetti <vincem@en.com> + + +.. % \moduleauthor{Jack Jansen?}{email} + +The :mod:`aetypes` defines classes used to represent Apple Event data +descriptors and Apple Event object specifiers. + +Apple Event data is contained in descriptors, and these descriptors are typed. +For many descriptors the Python representation is simply the corresponding +Python type: ``typeText`` in OSA is a Python string, ``typeFloat`` is a float, +etc. For OSA types that have no direct Python counterpart this module declares +classes. 
Packing and unpacking instances of these classes is handled +automatically by :mod:`aepack`. + +An object specifier is essentially an address of an object implemented in an +Apple Event server. An Apple Event specifier is used as the direct object for an +Apple Event or as the argument of an optional parameter. The :mod:`aetypes` +module contains the base classes for OSA classes and properties, which are used +by the packages generated by :mod:`gensuitemodule` to populate the classes and +properties in a given suite. + +For reasons of backward compatibility, and for cases where you need to script an +application for which you have not generated the stub package, this module also +contains object specifiers for a number of common OSA classes such as +``Document``, ``Window``, ``Character``, etc. + +The :mod:`AEObjects` module defines the following classes to represent Apple +Event descriptor data: + + +.. class:: Unknown(type, data) + + The representation of OSA descriptor data for which the :mod:`aepack` and + :mod:`aetypes` modules have no support, i.e. anything that is not represented by + the other classes here and that is not equivalent to a simple Python value. + + +.. class:: Enum(enum) + + An enumeration value with the given 4-character string value. + + +.. class:: InsertionLoc(of, pos) + + Position ``pos`` in object ``of``. + + +.. class:: Boolean(bool) + + A boolean. + + +.. class:: StyledText(style, text) + + Text with style information (font, face, etc) included. + + +.. class:: AEText(script, style, text) + + Text with script system and style information included. + + +.. class:: IntlText(script, language, text) + + Text with script system and language information included. + + +.. class:: IntlWritingCode(script, language) + + Script system and language information. + + +.. class:: QDPoint(v, h) + + A quickdraw point. + + +.. class:: QDRectangle(v0, h0, v1, h1) + + A quickdraw rectangle. + + +.. class:: RGBColor(r, g, b) + + A color. + + +.. 
class:: Type(type) + + An OSA type value with the given 4-character name. + + +.. class:: Keyword(name) + + An OSA keyword with the given 4-character name. + + +.. class:: Range(start, stop) + + A range. + + +.. class:: Ordinal(abso) + + Non-numeric absolute positions, such as ``"firs"``, first, or ``"midd"``, + middle. + + +.. class:: Logical(logc, term) + + The logical expression of applying operator ``logc`` to ``term``. + + +.. class:: Comparison(obj1, relo, obj2) + + The comparison ``relo`` of ``obj1`` to ``obj2``. + +The following classes are used as base classes by the generated stub packages to +represent AppleScript classes and properties in Python: + + +.. class:: ComponentItem(which[, fr]) + + Abstract baseclass for an OSA class. The subclass should set the class attribute + ``want`` to the 4-character OSA class code. Instances of subclasses of this + class are equivalent to AppleScript Object Specifiers. Upon instantiation you + should pass a selector in ``which``, and optionally a parent object in ``fr``. + + +.. class:: NProperty(fr) + + Abstract baseclass for an OSA property. The subclass should set the class + attributes ``want`` and ``which`` to designate which property we are talking + about. Instances of subclasses of this class are Object Specifiers. + + +.. class:: ObjectSpecifier(want, form, seld[, fr]) + + Base class of ``ComponentItem`` and ``NProperty``, a general OSA Object + Specifier. See the Apple Open Scripting Architecture documentation for the + parameters. Note that this class is not abstract. + diff --git a/Doc/library/aifc.rst b/Doc/library/aifc.rst new file mode 100644 index 0000000..0cfcb52 --- /dev/null +++ b/Doc/library/aifc.rst @@ -0,0 +1,225 @@ + +:mod:`aifc` --- Read and write AIFF and AIFC files +================================================== + +.. module:: aifc + :synopsis: Read and write audio files in AIFF or AIFC format. + + +.. 
index:: + single: Audio Interchange File Format + single: AIFF + single: AIFF-C + +This module provides support for reading and writing AIFF and AIFF-C files. +AIFF is Audio Interchange File Format, a format for storing digital audio +samples in a file. AIFF-C is a newer version of the format that includes the +ability to compress the audio data. + +**Caveat:** Some operations may only work under IRIX; these will raise +:exc:`ImportError` when attempting to import the :mod:`cl` module, which is only +available on IRIX. + +Audio files have a number of parameters that describe the audio data. The +sampling rate or frame rate is the number of times per second the sound is +sampled. The number of channels indicates whether the audio is mono, stereo, or +quadro. Each frame consists of one sample per channel. The sample size is the +size in bytes of each sample. Thus a frame consists of +*nchannels*\**samplesize* bytes, and a second's worth of audio consists of +*nchannels*\**samplesize*\**framerate* bytes. + +For example, CD quality audio has a sample size of two bytes (16 bits), uses two +channels (stereo) and has a frame rate of 44,100 frames/second. This gives a +frame size of 4 bytes (2\*2), and a second's worth occupies 2\*2\*44100 bytes +(176,400 bytes). + +Module :mod:`aifc` defines the following function: + + +.. function:: open(file[, mode]) + + Open an AIFF or AIFF-C file and return an object instance with methods that are + described below. The argument *file* is either a string naming a file or a file + object. *mode* must be ``'r'`` or ``'rb'`` when the file must be opened for + reading, or ``'w'`` or ``'wb'`` when the file must be opened for writing. If + omitted, ``file.mode`` is used if it exists, otherwise ``'rb'`` is used. When + used for writing, the file object should be seekable, unless you know ahead of + time how many samples you are going to write in total and use + :meth:`writeframesraw` and :meth:`setnframes`. 
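The frame-size and byte-rate arithmetic described above for the audio parameters can be sketched in plain Python. This is only an illustration: the helper names ``frame_size`` and ``bytes_per_second`` are hypothetical and are not part of :mod:`aifc`.

```python
# Illustrative helpers only; these names are assumptions, not aifc API.

def frame_size(nchannels, sampwidth):
    """Bytes in one frame: one sample of sampwidth bytes per channel."""
    return nchannels * sampwidth


def bytes_per_second(nchannels, sampwidth, framerate):
    """Bytes needed to hold one second of uncompressed audio."""
    return frame_size(nchannels, sampwidth) * framerate


# CD quality audio: 2 channels, 2 bytes per sample, 44100 frames/second.
cd_frame = frame_size(2, 2)
cd_rate = bytes_per_second(2, 2, 44100)
```

With the CD-quality figures quoted in the text, ``cd_frame`` is 4 and ``cd_rate`` is 176400.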
+ +Objects returned by :func:`open` when a file is opened for reading have the +following methods: + + +.. method:: aifc.getnchannels() + + Return the number of audio channels (1 for mono, 2 for stereo). + + +.. method:: aifc.getsampwidth() + + Return the size in bytes of individual samples. + + +.. method:: aifc.getframerate() + + Return the sampling rate (number of audio frames per second). + + +.. method:: aifc.getnframes() + + Return the number of audio frames in the file. + + +.. method:: aifc.getcomptype() + + Return a four-character string describing the type of compression used in the + audio file. For AIFF files, the returned value is ``'NONE'``. + + +.. method:: aifc.getcompname() + + Return a human-readable description of the type of compression used in the audio + file. For AIFF files, the returned value is ``'not compressed'``. + + +.. method:: aifc.getparams() + + Return a tuple consisting of all of the above values in the above order. + + +.. method:: aifc.getmarkers() + + Return a list of markers in the audio file. A marker consists of a tuple of + three elements. The first is the mark ID (an integer), the second is the mark + position in frames from the beginning of the data (an integer), the third is the + name of the mark (a string). + + +.. method:: aifc.getmark(id) + + Return the tuple as described in :meth:`getmarkers` for the mark with the given + *id*. + + +.. method:: aifc.readframes(nframes) + + Read and return the next *nframes* frames from the audio file. The returned + data is a string containing for each frame the uncompressed samples of all + channels. + + +.. method:: aifc.rewind() + + Rewind the read pointer. The next :meth:`readframes` will start from the + beginning. + + +.. method:: aifc.setpos(pos) + + Seek to the specified frame number. + + +.. method:: aifc.tell() + + Return the current frame number. + + +.. method:: aifc.close() + + Close the AIFF file. After calling this method, the object can no longer be + used. 
+ +Objects returned by :func:`open` when a file is opened for writing have all the +above methods, except for :meth:`readframes` and :meth:`setpos`. In addition +the following methods exist. The :meth:`get\*` methods can only be called after +the corresponding :meth:`set\*` methods have been called. Before the first +:meth:`writeframes` or :meth:`writeframesraw`, all parameters except for the +number of frames must be filled in. + + +.. method:: aifc.aiff() + + Create an AIFF file. The default is that an AIFF-C file is created, unless the + name of the file ends in ``'.aiff'`` in which case the default is an AIFF file. + + +.. method:: aifc.aifc() + + Create an AIFF-C file. The default is that an AIFF-C file is created, unless + the name of the file ends in ``'.aiff'`` in which case the default is an AIFF + file. + + +.. method:: aifc.setnchannels(nchannels) + + Specify the number of channels in the audio file. + + +.. method:: aifc.setsampwidth(width) + + Specify the size in bytes of audio samples. + + +.. method:: aifc.setframerate(rate) + + Specify the sampling frequency in frames per second. + + +.. method:: aifc.setnframes(nframes) + + Specify the number of frames that are to be written to the audio file. If this + parameter is not set, or not set correctly, the file needs to support seeking. + + +.. method:: aifc.setcomptype(type, name) + + .. index:: + single: u-LAW + single: A-LAW + single: G.722 + + Specify the compression type. If not specified, the audio data will not be + compressed. In AIFF files, compression is not possible. The name parameter + should be a human-readable description of the compression type, the type + parameter should be a four-character string. Currently the following + compression types are supported: NONE, ULAW, ALAW, G722. + + +.. method:: aifc.setparams(nchannels, sampwidth, framerate, comptype, compname) + + Set all the above parameters at once. The argument is a tuple consisting of the + various parameters. 
This means that it is possible to use the result of a + :meth:`getparams` call as argument to :meth:`setparams`. + + +.. method:: aifc.setmark(id, pos, name) + + Add a mark with the given id (larger than 0), and the given name at the given + position. This method can be called at any time before :meth:`close`. + + +.. method:: aifc.tell() + + Return the current write position in the output file. Useful in combination + with :meth:`setmark`. + + +.. method:: aifc.writeframes(data) + + Write data to the output file. This method can only be called after the audio + file parameters have been set. + + +.. method:: aifc.writeframesraw(data) + + Like :meth:`writeframes`, except that the header of the audio file is not + updated. + + +.. method:: aifc.close() + + Close the AIFF file. The header of the file is updated to reflect the actual + size of the audio data. After calling this method, the object can no longer be + used. + diff --git a/Doc/library/allos.rst b/Doc/library/allos.rst new file mode 100644 index 0000000..900d6d3 --- /dev/null +++ b/Doc/library/allos.rst @@ -0,0 +1,27 @@ + +.. _allos: + +********************************* +Generic Operating System Services +********************************* + +The modules described in this chapter provide interfaces to operating system +features that are available on (almost) all operating systems, such as files and +a clock. The interfaces are generally modeled after the Unix or C interfaces, +but they are available on most other systems as well. Here's an overview: + + +.. 
toctree:: + + os.rst + time.rst + optparse.rst + getopt.rst + logging.rst + getpass.rst + curses.rst + curses.ascii.rst + curses.panel.rst + platform.rst + errno.rst + ctypes.rst diff --git a/Doc/library/anydbm.rst b/Doc/library/anydbm.rst new file mode 100644 index 0000000..413b7de --- /dev/null +++ b/Doc/library/anydbm.rst @@ -0,0 +1,96 @@ + +:mod:`anydbm` --- Generic access to DBM-style databases +======================================================= + +.. module:: anydbm + :synopsis: Generic interface to DBM-style database modules. + + +.. index:: + module: dbhash + module: bsddb + module: gdbm + module: dbm + module: dumbdbm + +:mod:`anydbm` is a generic interface to variants of the DBM database --- +:mod:`dbhash` (requires :mod:`bsddb`), :mod:`gdbm`, or :mod:`dbm`. If none of +these modules is installed, the slow-but-simple implementation in module +:mod:`dumbdbm` will be used. + + +.. function:: open(filename[, flag[, mode]]) + + Open the database file *filename* and return a corresponding object. + + If the database file already exists, the :mod:`whichdb` module is used to + determine its type and the appropriate module is used; if it does not exist, the + first module listed above that can be imported is used. + + The optional *flag* argument can be ``'r'`` to open an existing database for + reading only, ``'w'`` to open an existing database for reading and writing, + ``'c'`` to create the database if it doesn't exist, or ``'n'``, which will + always create a new empty database. If not specified, the default value is + ``'r'``. + + The optional *mode* argument is the Unix mode of the file, used only when the + database has to be created. It defaults to octal ``0666`` (and will be modified + by the prevailing umask). + + +.. 
exception:: error + + A tuple containing the exceptions that can be raised by each of the supported + modules, with a unique exception also named :exc:`anydbm.error` as the first + item --- the latter is used when :exc:`anydbm.error` is raised. + +The object returned by :func:`open` supports most of the same functionality as +dictionaries; keys and their corresponding values can be stored, retrieved, and +deleted, and the :meth:`has_key` and :meth:`keys` methods are available. Keys +and values must always be strings. + +The following example records some hostnames and a corresponding title, and +then prints out the contents of the database:: + + import anydbm + + # Open database, creating it if necessary. + db = anydbm.open('cache', 'c') + + # Record some values + db['www.python.org'] = 'Python Website' + db['www.cnn.com'] = 'Cable News Network' + + # Loop through contents. Other dictionary methods + # such as .keys(), .values() also work. + for k, v in db.iteritems(): + print k, '\t', v + + # Storing a non-string key or value will raise an exception (most + # likely a TypeError). + db['www.yahoo.com'] = 4 + + # Close when done. + db.close() + + +.. seealso:: + + Module :mod:`dbhash` + BSD ``db`` database interface. + + Module :mod:`dbm` + Standard Unix database interface. + + Module :mod:`dumbdbm` + Portable implementation of the ``dbm`` interface. + + Module :mod:`gdbm` + GNU database interface, based on the ``dbm`` interface. + + Module :mod:`shelve` + General object persistence built on top of the Python ``dbm`` interface. + + Module :mod:`whichdb` + Utility module used to determine the type of an existing database. + diff --git a/Doc/library/archiving.rst b/Doc/library/archiving.rst new file mode 100644 index 0000000..7d0df5f --- /dev/null +++ b/Doc/library/archiving.rst @@ -0,0 +1,18 @@ + +.. 
_archiving: + +****************************** +Data Compression and Archiving +****************************** + +The modules described in this chapter support data compression with the zlib, +gzip, and bzip2 algorithms, and the creation of ZIP- and tar-format archives. + + +.. toctree:: + + zlib.rst + gzip.rst + bz2.rst + zipfile.rst + tarfile.rst diff --git a/Doc/library/array.rst b/Doc/library/array.rst new file mode 100644 index 0000000..5194edc --- /dev/null +++ b/Doc/library/array.rst @@ -0,0 +1,272 @@ + +:mod:`array` --- Efficient arrays of numeric values +=================================================== + +.. module:: array + :synopsis: Efficient arrays of uniformly typed numeric values. + + +.. index:: single: arrays + +This module defines an object type which can efficiently represent an array of +basic values: characters, integers, floating point numbers. Arrays are sequence +types and behave very much like lists, except that the type of objects stored in +them is constrained. The type is specified at object creation time by using a +:dfn:`type code`, which is a single character. 
The following type codes are +defined: + ++-----------+----------------+-------------------+-----------------------+ +| Type code | C Type | Python Type | Minimum size in bytes | ++===========+================+===================+=======================+ +| ``'c'`` | char | character | 1 | ++-----------+----------------+-------------------+-----------------------+ +| ``'b'`` | signed char | int | 1 | ++-----------+----------------+-------------------+-----------------------+ +| ``'B'`` | unsigned char | int | 1 | ++-----------+----------------+-------------------+-----------------------+ +| ``'u'`` | Py_UNICODE | Unicode character | 2 | ++-----------+----------------+-------------------+-----------------------+ +| ``'h'`` | signed short | int | 2 | ++-----------+----------------+-------------------+-----------------------+ +| ``'H'`` | unsigned short | int | 2 | ++-----------+----------------+-------------------+-----------------------+ +| ``'i'`` | signed int | int | 2 | ++-----------+----------------+-------------------+-----------------------+ +| ``'I'`` | unsigned int | long | 2 | ++-----------+----------------+-------------------+-----------------------+ +| ``'l'`` | signed long | int | 4 | ++-----------+----------------+-------------------+-----------------------+ +| ``'L'`` | unsigned long | long | 4 | ++-----------+----------------+-------------------+-----------------------+ +| ``'f'`` | float | float | 4 | ++-----------+----------------+-------------------+-----------------------+ +| ``'d'`` | double | float | 8 | ++-----------+----------------+-------------------+-----------------------+ + +The actual representation of values is determined by the machine architecture +(strictly speaking, by the C implementation). The actual size can be accessed +through the :attr:`itemsize` attribute. 
The values stored for ``'L'`` and +``'I'`` items will be represented as Python long integers when retrieved, +because Python's plain integer type cannot represent the full range of C's +unsigned (long) integers. + +The module defines the following type: + + +.. function:: array(typecode[, initializer]) + + Return a new array whose items are restricted by *typecode*, and initialized + from the optional *initializer* value, which must be a list, string, or iterable + over elements of the appropriate type. + + .. versionchanged:: 2.4 + Formerly, only lists or strings were accepted. + + If given a list or string, the initializer is passed to the new array's + :meth:`fromlist`, :meth:`fromstring`, or :meth:`fromunicode` method (see below) + to add initial items to the array. Otherwise, the iterable initializer is + passed to the :meth:`extend` method. + + +.. data:: ArrayType + + Obsolete alias for :func:`array`. + +Array objects support the ordinary sequence operations of indexing, slicing, +concatenation, and multiplication. When using slice assignment, the assigned +value must be an array object with the same type code; in all other cases, +:exc:`TypeError` is raised. Array objects also implement the buffer interface, +and may be used wherever buffer objects are supported. + +The following data items and methods are also supported: + + +.. attribute:: array.typecode + + The typecode character used to create the array. + + +.. attribute:: array.itemsize + + The length in bytes of one array item in the internal representation. + + +.. method:: array.append(x) + + Append a new item with value *x* to the end of the array. + + +.. method:: array.buffer_info() + + Return a tuple ``(address, length)`` giving the current memory address and the + length in elements of the buffer used to hold array's contents. The size of the + memory buffer in bytes can be computed as ``array.buffer_info()[1] * + array.itemsize``. 
This is occasionally useful when working with low-level (and + inherently unsafe) I/O interfaces that require memory addresses, such as certain + :cfunc:`ioctl` operations. The returned numbers are valid as long as the array + exists and no length-changing operations are applied to it. + + .. note:: + + When using array objects from code written in C or C++ (the only way to + effectively make use of this information), it makes more sense to use the buffer + interface supported by array objects. This method is maintained for backward + compatibility and should be avoided in new code. The buffer interface is + documented in :ref:`bufferobjects`. + + +.. method:: array.byteswap() + + "Byteswap" all items of the array. This is only supported for values which are + 1, 2, 4, or 8 bytes in size; for other types of values, :exc:`RuntimeError` is + raised. It is useful when reading data from a file written on a machine with a + different byte order. + + +.. method:: array.count(x) + + Return the number of occurrences of *x* in the array. + + +.. method:: array.extend(iterable) + + Append items from *iterable* to the end of the array. If *iterable* is another + array, it must have *exactly* the same type code; if not, :exc:`TypeError` will + be raised. If *iterable* is not an array, it must be iterable and its elements + must be the right type to be appended to the array. + + .. versionchanged:: 2.4 + Formerly, the argument could only be another array. + + +.. method:: array.fromfile(f, n) + + Read *n* items (as machine values) from the file object *f* and append them to + the end of the array. If less than *n* items are available, :exc:`EOFError` is + raised, but the items that were available are still inserted into the array. + *f* must be a real built-in file object; something else with a :meth:`read` + method won't do. + + +.. method:: array.fromlist(list) + + Append items from the list. 
This is equivalent to ``for x in list: + a.append(x)`` except that if there is a type error, the array is unchanged. + + +.. method:: array.fromstring(s) + + Appends items from the string, interpreting the string as an array of machine + values (as if it had been read from a file using the :meth:`fromfile` method). + + +.. method:: array.fromunicode(s) + + Extends this array with data from the given unicode string. The array must + be a type ``'u'`` array; otherwise a :exc:`ValueError` is raised. Use + ``array.fromstring(unicodestring.encode(enc))`` to append Unicode data to an + array of some other type. + + +.. method:: array.index(x) + + Return the smallest *i* such that *i* is the index of the first occurrence of + *x* in the array. + + +.. method:: array.insert(i, x) + + Insert a new item with value *x* in the array before position *i*. Negative + values are treated as being relative to the end of the array. + + +.. method:: array.pop([i]) + + Removes the item with the index *i* from the array and returns it. The optional + argument defaults to ``-1``, so that by default the last item is removed and + returned. + + +.. method:: array.read(f, n) + + .. deprecated:: 1.5.1 + Use the :meth:`fromfile` method. + + Read *n* items (as machine values) from the file object *f* and append them to + the end of the array. If less than *n* items are available, :exc:`EOFError` is + raised, but the items that were available are still inserted into the array. + *f* must be a real built-in file object; something else with a :meth:`read` + method won't do. + + +.. method:: array.remove(x) + + Remove the first occurrence of *x* from the array. + + +.. method:: array.reverse() + + Reverse the order of the items in the array. + + +.. method:: array.tofile(f) + + Write all items (as machine values) to the file object *f*. + + +.. method:: array.tolist() + + Convert the array to an ordinary list with the same items. + + +.. 
method:: array.tostring() + + Convert the array to an array of machine values and return the string + representation (the same sequence of bytes that would be written to a file by + the :meth:`tofile` method.) + + +.. method:: array.tounicode() + + Convert the array to a unicode string. The array must be a type ``'u'`` array; + otherwise a :exc:`ValueError` is raised. Use ``array.tostring().decode(enc)`` to + obtain a unicode string from an array of some other type. + + +.. method:: array.write(f) + + .. deprecated:: 1.5.1 + Use the :meth:`tofile` method. + + Write all items (as machine values) to the file object *f*. + +When an array object is printed or converted to a string, it is represented as +``array(typecode, initializer)``. The *initializer* is omitted if the array is +empty, otherwise it is a string if the *typecode* is ``'c'``, otherwise it is a +list of numbers. The string is guaranteed to be able to be converted back to an +array with the same type and value using :func:`eval`, so long as the +:func:`array` function has been imported using ``from array import array``. +Examples:: + + array('l') + array('c', 'hello world') + array('u', u'hello \u2641') + array('l', [1, 2, 3, 4, 5]) + array('d', [1.0, 2.0, 3.14]) + + +.. seealso:: + + Module :mod:`struct` + Packing and unpacking of heterogeneous binary data. + + Module :mod:`xdrlib` + Packing and unpacking of External Data Representation (XDR) data as used in some + remote procedure call systems. + + `The Numerical Python Manual <http://numpy.sourceforge.net/numdoc/HTML/numdoc.htm>`_ + The Numeric Python extension (NumPy) defines another array type; see + http://numpy.sourceforge.net/ for further information about Numerical Python. + (A PDF version of the NumPy manual is available at + http://numpy.sourceforge.net/numdoc/numdoc.pdf). 
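As a minimal sketch of the array operations documented above, the following uses only :meth:`append`, :meth:`extend`, :meth:`buffer_info`, :meth:`byteswap`, :meth:`tolist` and the :attr:`itemsize` attribute. It deliberately avoids the ``'c'`` type code and the string-based methods, since their availability depends on the Python version in use; the variable names are illustrative.

```python
from array import array

# Build an array of signed longs and grow it.
a = array('l', [1, 2, 3, 4, 5])
a.append(6)
a.extend(array('l', [7, 8]))        # another array must have the same type code

# Extending with an array of a different type code raises TypeError.
try:
    a.extend(array('d', [1.0]))
except TypeError:
    mismatch_rejected = True
else:
    mismatch_rejected = False

# Size of the underlying buffer in bytes, as described for buffer_info().
address, length = a.buffer_info()
nbytes = length * a.itemsize

# byteswap() reverses the byte order of every item; applying it twice
# restores the original values.
b = array('h', [1, 2])
b.byteswap()
swapped = b.tolist()
b.byteswap()
restored = b.tolist()
```

The exact value of ``nbytes`` depends on the platform, because :attr:`itemsize` for ``'l'`` is only guaranteed to be at least 4.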
+ diff --git a/Doc/library/asynchat.rst b/Doc/library/asynchat.rst new file mode 100644 index 0000000..b651c40 --- /dev/null +++ b/Doc/library/asynchat.rst @@ -0,0 +1,284 @@ + +:mod:`asynchat` --- Asynchronous socket command/response handler +================================================================ + +.. module:: asynchat + :synopsis: Support for asynchronous command/response protocols. +.. moduleauthor:: Sam Rushing <rushing@nightmare.com> +.. sectionauthor:: Steve Holden <sholden@holdenweb.com> + + +This module builds on the :mod:`asyncore` infrastructure, simplifying +asynchronous clients and servers and making it easier to handle protocols whose +elements are terminated by arbitrary strings, or are of variable length. +:mod:`asynchat` defines the abstract class :class:`async_chat` that you +subclass, providing implementations of the :meth:`collect_incoming_data` and +:meth:`found_terminator` methods. It uses the same asynchronous loop as +:mod:`asyncore`, and the two types of channel, :class:`asyncore.dispatcher` and +:class:`asynchat.async_chat`, can freely be mixed in the channel map. Typically +an :class:`asyncore.dispatcher` server channel generates new +:class:`asynchat.async_chat` channel objects as it receives incoming connection +requests. + + +.. class:: async_chat() + + This class is an abstract subclass of :class:`asyncore.dispatcher`. To make + practical use of the code you must subclass :class:`async_chat`, providing + meaningful :meth:`collect_incoming_data` and :meth:`found_terminator` methods. + The :class:`asyncore.dispatcher` methods can be used, although not all make + sense in a message/response context. + + Like :class:`asyncore.dispatcher`, :class:`async_chat` defines a set of events + that are generated by an analysis of socket conditions after a :cfunc:`select` + call. 
Once the polling loop has been started the :class:`async_chat` object's + methods are called by the event-processing framework with no action on the part + of the programmer. + + Unlike :class:`asyncore.dispatcher`, :class:`async_chat` allows you to define a + first-in-first-out queue (fifo) of *producers*. A producer need have only one + method, :meth:`more`, which should return data to be transmitted on the channel. + The producer indicates exhaustion (*i.e.* that it contains no more data) by + having its :meth:`more` method return the empty string. At this point the + :class:`async_chat` object removes the producer from the fifo and starts using + the next producer, if any. When the producer fifo is empty the + :meth:`handle_write` method does nothing. You use the channel object's + :meth:`set_terminator` method to describe how to recognize the end of, or an + important breakpoint in, an incoming transmission from the remote endpoint. + + To build a functioning :class:`async_chat` subclass your input methods + :meth:`collect_incoming_data` and :meth:`found_terminator` must handle the data + that the channel receives asynchronously. The methods are described below. + + +.. method:: async_chat.close_when_done() + + Pushes a ``None`` on to the producer fifo. When this producer is popped off the + fifo it causes the channel to be closed. + + +.. method:: async_chat.collect_incoming_data(data) + + Called with *data* holding an arbitrary amount of received data. The default + method, which must be overridden, raises a :exc:`NotImplementedError` exception. + + +.. method:: async_chat.discard_buffers() + + In emergencies this method will discard any data held in the input and/or output + buffers and the producer fifo. + + +.. method:: async_chat.found_terminator() + + Called when the incoming data stream matches the termination condition set by + :meth:`set_terminator`. The default method, which must be overridden, raises a + :exc:`NotImplementedError` exception. 
The buffered input data should be + available via an instance attribute. + + +.. method:: async_chat.get_terminator() + + Returns the current terminator for the channel. + + +.. method:: async_chat.handle_close() + + Called when the channel is closed. The default method silently closes the + channel's socket. + + +.. method:: async_chat.handle_read() + + Called when a read event fires on the channel's socket in the asynchronous loop. + The default method checks for the termination condition established by + :meth:`set_terminator`, which can be either the appearance of a particular + string in the input stream or the receipt of a particular number of characters. + When the terminator is found, :meth:`handle_read` calls the + :meth:`found_terminator` method after calling :meth:`collect_incoming_data` with + any data preceding the terminating condition. + + +.. method:: async_chat.handle_write() + + Called when the application may write data to the channel. The default method + calls the :meth:`initiate_send` method, which in turn will call + :meth:`refill_buffer` to collect data from the producer fifo associated with the + channel. + + +.. method:: async_chat.push(data) + + Creates a :class:`simple_producer` object (*see below*) containing the data and + pushes it on to the channel's ``producer_fifo`` to ensure its transmission. This + is all you need to do to have the channel write the data out to the network, + although it is possible to use your own producers in more complex schemes to + implement encryption and chunking, for example. + + +.. method:: async_chat.push_with_producer(producer) + + Takes a producer object and adds it to the producer fifo associated with the + channel. When all currently-pushed producers have been exhausted the channel + will consume this producer's data by calling its :meth:`more` method and send + the data to the remote endpoint. + + +.. 
method:: async_chat.readable() + + Should return ``True`` for the channel to be included in the set of channels + tested by the :cfunc:`select` loop for readability. + + +.. method:: async_chat.refill_buffer() + + Refills the output buffer by calling the :meth:`more` method of the producer at + the head of the fifo. If it is exhausted then the producer is popped off the + fifo and the next producer is activated. If the current producer is, or becomes, + ``None`` then the channel is closed. + + +.. method:: async_chat.set_terminator(term) + + Sets the terminating condition to be recognised on the channel. ``term`` may be + any of three types of value, corresponding to three different ways to handle + incoming protocol data. + + +-----------+---------------------------------------------+ + | term | Description | + +===========+=============================================+ + | *string* | Will call :meth:`found_terminator` when the | + | | string is found in the input stream | + +-----------+---------------------------------------------+ + | *integer* | Will call :meth:`found_terminator` when the | + | | indicated number of characters have been | + | | received | + +-----------+---------------------------------------------+ + | ``None`` | The channel continues to collect data | + | | forever | + +-----------+---------------------------------------------+ + + Note that any data following the terminator will be available for reading by the + channel after :meth:`found_terminator` is called. + + +.. method:: async_chat.writable() + + Should return ``True`` as long as items remain on the producer fifo, or the + channel is connected and the channel's output buffer is non-empty. + + +asynchat - Auxiliary Classes and Functions +------------------------------------------ + + +.. class:: simple_producer(data[, buffer_size=512]) + + A :class:`simple_producer` takes a chunk of data and an optional buffer size. 
+ Repeated calls to its :meth:`more` method yield successive chunks of the data no + larger than *buffer_size*. + + +.. method:: simple_producer.more() + + Produces the next chunk of information from the producer, or returns the empty + string. + + +.. class:: fifo([list=None]) + + Each channel maintains a :class:`fifo` holding data which has been pushed by the + application but not yet popped for writing to the channel. A :class:`fifo` is a + list used to hold data and/or producers until they are required. If the *list* + argument is provided then it should contain producers or data items to be + written to the channel. + + +.. method:: fifo.is_empty() + + Returns ``True`` iff the fifo is empty. + + +.. method:: fifo.first() + + Returns the least-recently :meth:`push`\ ed item from the fifo. + + +.. method:: fifo.push(data) + + Adds the given data (which may be a string or a producer object) to the producer + fifo. + + +.. method:: fifo.pop() + + If the fifo is not empty, returns ``True, first()``, deleting the popped item. + Returns ``False, None`` for an empty fifo. + +The :mod:`asynchat` module also defines one utility function, which may be of +use in network and textual analysis operations. + + +.. function:: find_prefix_at_end(haystack, needle) + + Returns ``True`` if string *haystack* ends with any non-empty prefix of string + *needle*. + + +.. _asynchat-example: + +asynchat Example +---------------- + +The following partial example shows how HTTP requests can be read with +:class:`async_chat`. A web server might create an :class:`http_request_handler` +object for each incoming client connection. Notice that initially the channel +terminator is set to match the blank line at the end of the HTTP headers, and a +flag indicates that the headers are being read. 
+ +Once the headers have been read, if the request is of type POST (indicating that +further data are present in the input stream) then the ``Content-Length:`` +header is used to set a numeric terminator to read the right amount of data from +the channel. + +The :meth:`handle_request` method is called once all relevant input has been +marshalled, after setting the channel terminator to ``None`` to ensure that any +extraneous data sent by the web client are ignored. :: + + class http_request_handler(asynchat.async_chat): + + def __init__(self, conn, addr, sessions, log): + asynchat.async_chat.__init__(self, conn=conn) + self.addr = addr + self.sessions = sessions + self.ibuffer = [] + self.obuffer = "" + self.set_terminator("\r\n\r\n") + self.reading_headers = True + self.handling = False + self.cgi_data = None + self.log = log + + def collect_incoming_data(self, data): + """Buffer the data""" + self.ibuffer.append(data) + + def found_terminator(self): + if self.reading_headers: + self.reading_headers = False + self.parse_headers("".join(self.ibuffer)) + self.ibuffer = [] + if self.op.upper() == "POST": + clen = self.headers.getheader("content-length") + self.set_terminator(int(clen)) + else: + self.handling = True + self.set_terminator(None) + self.handle_request() + elif not self.handling: + self.set_terminator(None) # browsers sometimes over-send + self.cgi_data = parse(self.headers, "".join(self.ibuffer)) + self.handling = True + self.ibuffer = [] + self.handle_request() + diff --git a/Doc/library/asyncore.rst b/Doc/library/asyncore.rst new file mode 100644 index 0000000..7f80dd3 --- /dev/null +++ b/Doc/library/asyncore.rst @@ -0,0 +1,269 @@ + +:mod:`asyncore` --- Asynchronous socket handler +=============================================== + +.. module:: asyncore + :synopsis: A base class for developing asynchronous socket handling services. +.. moduleauthor:: Sam Rushing <rushing@nightmare.com> +.. sectionauthor:: Christopher Petrilli <petrilli@amber.org> +.. 
sectionauthor:: Steve Holden <sholden@holdenweb.com> + + +This module provides the basic infrastructure for writing asynchronous socket +service clients and servers. + +.. % Heavily adapted from original documentation by Sam Rushing. + +There are only two ways to have a program on a single processor do "more than +one thing at a time." Multi-threaded programming is the simplest and most +popular way to do it, but there is another very different technique, that lets +you have nearly all the advantages of multi-threading, without actually using +multiple threads. It's really only practical if your program is largely I/O +bound. If your program is processor bound, then pre-emptive scheduled threads +are probably what you really need. Network servers are rarely processor bound, +however. + +If your operating system supports the :cfunc:`select` system call in its I/O +library (and nearly all do), then you can use it to juggle multiple +communication channels at once; doing other work while your I/O is taking place +in the "background." Although this strategy can seem strange and complex, +especially at first, it is in many ways easier to understand and control than +multi-threaded programming. The :mod:`asyncore` module solves many of the +difficult problems for you, making the task of building sophisticated +high-performance network servers and clients a snap. For "conversational" +applications and protocols the companion :mod:`asynchat` module is invaluable. + +The basic idea behind both modules is to create one or more network *channels*, +instances of class :class:`asyncore.dispatcher` and +:class:`asynchat.async_chat`. Creating the channels adds them to a global map, +used by the :func:`loop` function if you do not provide it with your own *map*. 
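The multiplexing idea described above can be sketched without :mod:`asyncore` itself. The following is a minimal illustration, assuming a Python 3 interpreter; ``poll_once`` and the channel names are hypothetical and are not part of either module. A plain dictionary plays the role of the map, and a single :func:`select.select` call reports which sockets have data waiting:

```python
import select
import socket


def poll_once(channels, timeout=1.0):
    """Run one select() pass over a {name: socket} map.

    Returns a {name: data} dictionary for the sockets that were readable.
    """
    by_socket = {sock: name for name, sock in channels.items()}
    readable, _, _ = select.select(list(by_socket), [], [], timeout)
    return {by_socket[sock]: sock.recv(4096) for sock in readable}


# Two connected socket pairs stand in for two network channels.
a_local, a_remote = socket.socketpair()
b_local, b_remote = socket.socketpair()

# Only channel "a" has incoming data; channel "b" stays idle.
a_remote.sendall(b"ping")
result = poll_once({"a": a_local, "b": b_local})

for sock in (a_local, a_remote, b_local, b_remote):
    sock.close()
```

Only the socket that actually received data is reported. The idle one costs nothing beyond its entry in the map, which is the property the polling loop relies on.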
+ +Once the initial channels are created, calling the :func:`loop` function +activates channel service, which continues until the last channel (including any +that have been added to the map during asynchronous service) is closed. + + +.. function:: loop([timeout[, use_poll[, map[,count]]]]) + + Enter a polling loop that terminates after *count* passes or when all open + channels have been closed. All arguments are optional. The *count* parameter + defaults to ``None``, resulting in the loop terminating only when all channels + have been closed. The *timeout* argument sets the timeout parameter for the appropriate + :func:`select` or :func:`poll` call, measured in seconds; the default is 30 + seconds. The *use_poll* parameter, if true, indicates that :func:`poll` should + be used in preference to :func:`select` (the default is ``False``). + + The *map* parameter is a dictionary whose items are the channels to watch. As + channels are closed they are deleted from their map. If *map* is omitted, a + global map is used. Channels (instances of :class:`asyncore.dispatcher`, + :class:`asynchat.async_chat` and subclasses thereof) can freely be mixed in the + map. + + +.. class:: dispatcher() + + The :class:`dispatcher` class is a thin wrapper around a low-level socket + object. To make it more useful, it has a few methods for event-handling which + are called from the asynchronous loop. Otherwise, it can be treated as a + normal non-blocking socket object. + + Two class attributes can be modified, to improve performance, or possibly even + to conserve memory. + + + .. data:: ac_in_buffer_size + + The asynchronous input buffer size (default ``4096``). + + + .. data:: ac_out_buffer_size + + The asynchronous output buffer size (default ``4096``). + + The firing of low-level events at certain times or in certain connection states + tells the asynchronous loop that certain higher-level events have taken place. 
+ For example, if we have asked for a socket to connect to another host, we know + that the connection has been made when the socket becomes writable for the first + time (at this point you know that you may write to it with the expectation of + success). The implied higher-level events are: + + +----------------------+----------------------------------------+ + | Event | Description | + +======================+========================================+ + | ``handle_connect()`` | Implied by the first write event | + +----------------------+----------------------------------------+ + | ``handle_close()`` | Implied by a read event with no data | + | | available | + +----------------------+----------------------------------------+ + | ``handle_accept()`` | Implied by a read event on a listening | + | | socket | + +----------------------+----------------------------------------+ + + During asynchronous processing, each mapped channel's :meth:`readable` and + :meth:`writable` methods are used to determine whether the channel's socket + should be added to the list of channels :cfunc:`select`\ ed or :cfunc:`poll`\ ed + for read and write events. + +Thus, the set of channel events is larger than the basic socket events. The full +set of methods that can be overridden in your subclass follows: + + +.. method:: dispatcher.handle_read() + + Called when the asynchronous loop detects that a :meth:`read` call on the + channel's socket will succeed. + + +.. method:: dispatcher.handle_write() + + Called when the asynchronous loop detects that a writable socket can be written. + Often this method will implement the necessary buffering for performance. For + example:: + + def handle_write(self): + sent = self.send(self.buffer) + self.buffer = self.buffer[sent:] + + +.. method:: dispatcher.handle_expt() + + Called when there is out of band (OOB) data for a socket connection. This will + almost never happen, as OOB is tenuously supported and rarely used. + + +.. 
method:: dispatcher.handle_connect() + + Called when the active opener's socket actually makes a connection. Might send a + "welcome" banner, or initiate a protocol negotiation with the remote endpoint, + for example. + + +.. method:: dispatcher.handle_close() + + Called when the socket is closed. + + +.. method:: dispatcher.handle_error() + + Called when an exception is raised and not otherwise handled. The default + version prints a condensed traceback. + + +.. method:: dispatcher.handle_accept() + + Called on listening channels (passive openers) when a connection can be + established with a new remote endpoint that has issued a :meth:`connect` call + for the local endpoint. + + +.. method:: dispatcher.readable() + + Called each time around the asynchronous loop to determine whether a channel's + socket should be added to the list on which read events can occur. The default + method simply returns ``True``, indicating that by default, all channels will + be interested in read events. + + +.. method:: dispatcher.writable() + + Called each time around the asynchronous loop to determine whether a channel's + socket should be added to the list on which write events can occur. The default + method simply returns ``True``, indicating that by default, all channels will + be interested in write events. + +In addition, each channel delegates or extends many of the socket methods. Most +of these are nearly identical to their socket partners. + + +.. method:: dispatcher.create_socket(family, type) + + This is identical to the creation of a normal socket, and will use the same + options for creation. Refer to the :mod:`socket` documentation for information + on creating sockets. + + +.. method:: dispatcher.connect(address) + + As with the normal socket object, *address* is a tuple with the first element + the host to connect to, and the second the port number. + + +.. method:: dispatcher.send(data) + + Send *data* to the remote end-point of the socket. + + +.. 
method:: dispatcher.recv(buffer_size) + + Read at most *buffer_size* bytes from the socket's remote end-point. An empty + string implies that the channel has been closed from the other end. + + +.. method:: dispatcher.listen(backlog) + + Listen for connections made to the socket. The *backlog* argument specifies the + maximum number of queued connections and should be at least 1; the maximum value + is system-dependent (usually 5). + + +.. method:: dispatcher.bind(address) + + Bind the socket to *address*. The socket must not already be bound. (The + format of *address* depends on the address family --- see above.) To mark the + socket as re-usable (setting the :const:`SO_REUSEADDR` option), call the + :class:`dispatcher` object's :meth:`set_reuse_addr` method. + + +.. method:: dispatcher.accept() + + Accept a connection. The socket must be bound to an address and listening for + connections. The return value is a pair ``(conn, address)`` where *conn* is a + *new* socket object usable to send and receive data on the connection, and + *address* is the address bound to the socket on the other end of the connection. + + +.. method:: dispatcher.close() + + Close the socket. All future operations on the socket object will fail. The + remote end-point will receive no more data (after queued data is flushed). + Sockets are automatically closed when they are garbage-collected. + + +.. 
_asyncore-example: + +asyncore Example basic HTTP client +---------------------------------- + +Here is a very basic HTTP client that uses the :class:`dispatcher` class to +implement its socket handling:: + + import asyncore, socket + + class http_client(asyncore.dispatcher): + + def __init__(self, host, path): + asyncore.dispatcher.__init__(self) + self.create_socket(socket.AF_INET, socket.SOCK_STREAM) + self.connect( (host, 80) ) + self.buffer = 'GET %s HTTP/1.0\r\n\r\n' % path + + def handle_connect(self): + pass + + def handle_close(self): + self.close() + + def handle_read(self): + print self.recv(8192) + + def writable(self): + return (len(self.buffer) > 0) + + def handle_write(self): + sent = self.send(self.buffer) + self.buffer = self.buffer[sent:] + + c = http_client('www.python.org', '/') + + asyncore.loop() + diff --git a/Doc/library/atexit.rst b/Doc/library/atexit.rst new file mode 100644 index 0000000..94d750b --- /dev/null +++ b/Doc/library/atexit.rst @@ -0,0 +1,105 @@ + +:mod:`atexit` --- Exit handlers +=============================== + +.. module:: atexit + :synopsis: Register and execute cleanup functions. +.. moduleauthor:: Skip Montanaro <skip@mojam.com> +.. sectionauthor:: Skip Montanaro <skip@mojam.com> + + +.. versionadded:: 2.0 + +The :mod:`atexit` module defines functions to register and unregister cleanup +functions. Functions thus registered are automatically executed upon normal +interpreter termination. + +Note: the functions registered via this module are not called when the program +is killed by a signal, when a Python fatal internal error is detected, or when +:func:`os._exit` is called. + + +.. function:: register(func[, *args[, **kargs]]) + + Register *func* as a function to be executed at termination. Any optional + arguments that are to be passed to *func* must be passed as arguments to + :func:`register`. 
+ + At normal program termination (for instance, if :func:`sys.exit` is called or + the main module's execution completes), all functions registered are called in + last in, first out order. The assumption is that lower level modules will + normally be imported before higher level modules and thus must be cleaned up + later. + + If an exception is raised during execution of the exit handlers, a traceback is + printed (unless :exc:`SystemExit` is raised) and the exception information is + saved. After all exit handlers have had a chance to run the last exception to + be raised is re-raised. + + .. versionchanged:: 2.6 + This function now returns *func* which makes it possible to use it as a + decorator without binding the original name to ``None``. + + +.. function:: unregister(func) + + Remove a function *func* from the list of functions to be run at interpreter- + shutdown. After calling :func:`unregister`, *func* is guaranteed not to be + called when the interpreter shuts down. + + .. versionadded:: 3.0 + + +.. seealso:: + + Module :mod:`readline` + Useful example of :mod:`atexit` to read and write :mod:`readline` history files. + + +.. _atexit-example: + +:mod:`atexit` Example +--------------------- + +The following simple example demonstrates how a module can initialize a counter +from a file when it is imported and save the counter's updated value +automatically when the program terminates without relying on the application +making an explicit call into this module at termination. 
:: + + try: + _count = int(open("/tmp/counter").read()) + except IOError: + _count = 0 + + def incrcounter(n): + global _count + _count = _count + n + + def savecounter(): + open("/tmp/counter", "w").write("%d" % _count) + + import atexit + atexit.register(savecounter) + +Positional and keyword arguments may also be passed to :func:`register` to be +passed along to the registered function when it is called:: + + def goodbye(name, adjective): + print 'Goodbye, %s, it was %s to meet you.' % (name, adjective) + + import atexit + atexit.register(goodbye, 'Donny', 'nice') + + # or: + atexit.register(goodbye, adjective='nice', name='Donny') + +Usage as a decorator:: + + import atexit + + @atexit.register + def goodbye(): + print "You are now leaving the Python sector." + +This obviously only works with functions that don't take arguments. + diff --git a/Doc/library/audioop.rst b/Doc/library/audioop.rst new file mode 100644 index 0000000..84a2690 --- /dev/null +++ b/Doc/library/audioop.rst @@ -0,0 +1,261 @@ + +:mod:`audioop` --- Manipulate raw audio data +============================================ + +.. module:: audioop + :synopsis: Manipulate raw audio data. + + +The :mod:`audioop` module contains some useful operations on sound fragments. +It operates on sound fragments consisting of signed integer samples 8, 16 or 32 +bits wide, stored in Python strings. All scalar items are integers, unless +specified otherwise. + +.. index:: + single: Intel/DVI ADPCM + single: ADPCM, Intel/DVI + single: a-LAW + single: u-LAW + +This module provides support for a-LAW, u-LAW and Intel/DVI ADPCM encodings. + +.. % This para is mostly here to provide an excuse for the index entries... + +A few of the more complicated operations only take 16-bit samples, otherwise the +sample size (in bytes) is always a parameter of the operation. + +The module defines the following variables and functions: + + +.. 
exception:: error + + This exception is raised on all errors, such as unknown number of bytes per + sample, etc. + + +.. function:: add(fragment1, fragment2, width) + + Return a fragment which is the addition of the two samples passed as parameters. + *width* is the sample width in bytes, either ``1``, ``2`` or ``4``. Both + fragments should have the same length. + + +.. function:: adpcm2lin(adpcmfragment, width, state) + + Decode an Intel/DVI ADPCM coded fragment to a linear fragment. See the + description of :func:`lin2adpcm` for details on ADPCM coding. Return a tuple + ``(sample, newstate)`` where the sample has the width specified in *width*. + + +.. function:: alaw2lin(fragment, width) + + Convert sound fragments in a-LAW encoding to linearly encoded sound fragments. + a-LAW encoding always uses 8 bits samples, so *width* refers only to the sample + width of the output fragment here. + + .. versionadded:: 2.5 + + +.. function:: avg(fragment, width) + + Return the average over all samples in the fragment. + + +.. function:: avgpp(fragment, width) + + Return the average peak-peak value over all samples in the fragment. No + filtering is done, so the usefulness of this routine is questionable. + + +.. function:: bias(fragment, width, bias) + + Return a fragment that is the original fragment with a bias added to each + sample. + + +.. function:: cross(fragment, width) + + Return the number of zero crossings in the fragment passed as an argument. + + +.. function:: findfactor(fragment, reference) + + Return a factor *F* such that ``rms(add(fragment, mul(reference, -F)))`` is + minimal, i.e., return the factor with which you should multiply *reference* to + make it match as well as possible to *fragment*. The fragments should both + contain 2-byte samples. + + The time taken by this routine is proportional to ``len(fragment)``. + + +.. 
function:: findfit(fragment, reference) + + Try to match *reference* as well as possible to a portion of *fragment* (which + should be the longer fragment). This is (conceptually) done by taking slices + out of *fragment*, using :func:`findfactor` to compute the best match, and + minimizing the result. The fragments should both contain 2-byte samples. + Return a tuple ``(offset, factor)`` where *offset* is the (integer) offset into + *fragment* where the optimal match started and *factor* is the (floating-point) + factor as per :func:`findfactor`. + + +.. function:: findmax(fragment, length) + + Search *fragment* for a slice of length *length* samples (not bytes!) with + maximum energy, i.e., return *i* for which ``rms(fragment[i*2:(i+length)*2])`` + is maximal. The fragments should both contain 2-byte samples. + + The routine takes time proportional to ``len(fragment)``. + + +.. function:: getsample(fragment, width, index) + + Return the value of sample *index* from the fragment. + + +.. function:: lin2adpcm(fragment, width, state) + + Convert samples to 4 bit Intel/DVI ADPCM encoding. ADPCM coding is an adaptive + coding scheme, whereby each 4 bit number is the difference between one sample + and the next, divided by a (varying) step. The Intel/DVI ADPCM algorithm has + been selected for use by the IMA, so it may well become a standard. + + *state* is a tuple containing the state of the coder. The coder returns a tuple + ``(adpcmfrag, newstate)``, and the *newstate* should be passed to the next call + of :func:`lin2adpcm`. In the initial call, ``None`` can be passed as the state. + *adpcmfrag* is the ADPCM coded fragment packed 2 4-bit values per byte. + + +.. function:: lin2alaw(fragment, width) + + Convert samples in the audio fragment to a-LAW encoding and return this as a + Python string. a-LAW is an audio encoding format whereby you get a dynamic + range of about 13 bits using only 8 bit samples. It is used by the Sun audio + hardware, among others. + + .. 
versionadded:: 2.5 + + +.. function:: lin2lin(fragment, width, newwidth) + + Convert samples between 1-, 2- and 4-byte formats. + + +.. function:: lin2ulaw(fragment, width) + + Convert samples in the audio fragment to u-LAW encoding and return this as a + Python string. u-LAW is an audio encoding format whereby you get a dynamic + range of about 14 bits using only 8-bit samples. It is used by the Sun audio + hardware, among others. + + +.. function:: minmax(fragment, width) + + Return a tuple consisting of the minimum and maximum values of all samples in + the sound fragment. + + +.. function:: max(fragment, width) + + Return the maximum of the *absolute value* of all samples in a fragment. + + +.. function:: maxpp(fragment, width) + + Return the maximum peak-peak value in the sound fragment. + + +.. function:: mul(fragment, width, factor) + + Return a fragment that has all samples in the original fragment multiplied by + the floating-point value *factor*. Overflow is silently ignored. + + +.. function:: ratecv(fragment, width, nchannels, inrate, outrate, state[, weightA[, weightB]]) + + Convert the frame rate of the input fragment. + + *state* is a tuple containing the state of the converter. The converter returns + a tuple ``(newfragment, newstate)``, and *newstate* should be passed to the next + call of :func:`ratecv`. The initial call should pass ``None`` as the state. + + The *weightA* and *weightB* arguments are parameters for a simple digital filter + and default to ``1`` and ``0`` respectively. + + +.. function:: reverse(fragment, width) + + Reverse the samples in a fragment and return the modified fragment. + + +.. function:: rms(fragment, width) + + Return the root-mean-square of the fragment, i.e. ``sqrt(sum(S_i^2)/n)``. + + This is a measure of the power in an audio signal. + + +.. function:: tomono(fragment, width, lfactor, rfactor) + + Convert a stereo fragment to a mono fragment. 
The left channel is multiplied by + *lfactor* and the right channel by *rfactor* before adding the two channels to + give a mono signal. + + +.. function:: tostereo(fragment, width, lfactor, rfactor) + + Generate a stereo fragment from a mono fragment. Each pair of samples in the + stereo fragment is computed from the mono sample, whereby left channel samples + are multiplied by *lfactor* and right channel samples by *rfactor*. + + +.. function:: ulaw2lin(fragment, width) + + Convert sound fragments in u-LAW encoding to linearly encoded sound fragments. + u-LAW encoding always uses 8-bit samples, so *width* refers only to the sample + width of the output fragment here. + +Note that operations such as :func:`mul` or :func:`max` make no distinction +between mono and stereo fragments, i.e. all samples are treated equally. If this +is a problem the stereo fragment should be split into two mono fragments first +and recombined later. Here is an example of how to do that:: + + def mul_stereo(sample, width, lfactor, rfactor): + lsample = audioop.tomono(sample, width, 1, 0) + rsample = audioop.tomono(sample, width, 0, 1) + lsample = audioop.mul(lsample, width, lfactor) + rsample = audioop.mul(rsample, width, rfactor) + lsample = audioop.tostereo(lsample, width, 1, 0) + rsample = audioop.tostereo(rsample, width, 0, 1) + return audioop.add(lsample, rsample, width) + +If you use the ADPCM coder to build network packets and you want your protocol +to be stateless (i.e. to be able to tolerate packet loss) you should not only +transmit the data but also the state. Note that you should send the *initial* +state (the one you passed to :func:`lin2adpcm`) along to the decoder, not the +final state (as returned by the coder). If you want to use the +:mod:`struct` module to store the state in binary you can code the first +element (the predicted value) in 16 bits and the second (the delta index) in 8. 
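The state encoding suggested above might be sketched as follows, assuming a Python 3 interpreter. The helper names ``pack_state`` and ``unpack_state``, the big-endian byte order, and the treatment of an initial ``None`` state as ``(0, 0)`` are all assumptions made for this illustration; the module itself does not prescribe a wire format:

```python
import struct

# Assumed wire layout: signed 16-bit predicted value, then unsigned 8-bit
# delta index, in network (big-endian) byte order.
_STATE = struct.Struct(">hB")


def pack_state(state):
    """Encode an ADPCM coder state tuple into 3 bytes."""
    if state is None:
        # Assumption: the initial state behaves like a zeroed state.
        state = (0, 0)
    value, index = state
    return _STATE.pack(value, index)


def unpack_state(blob):
    """Decode 3 bytes back into a (predicted value, delta index) tuple."""
    return _STATE.unpack(blob)


header = pack_state((-1234, 17))
restored = unpack_state(header)
```

Each packet would then carry these three bytes ahead of the ADPCM data, so that the receiver can decode a packet even when the preceding one was lost.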
+ +The ADPCM coders have never been tried against other ADPCM coders, only against +themselves. It could well be that I misinterpreted the standards in which case +they will not be interoperable with the respective standards. + +The :func:`find\*` routines might look a bit funny at first sight. They are +primarily meant to do echo cancellation. A reasonably fast way to do this is to +pick the most energetic piece of the output sample, locate that in the input +sample and subtract the whole output sample from the input sample:: + + def echocancel(outputdata, inputdata): + pos = audioop.findmax(outputdata, 800) # one tenth second + out_test = outputdata[pos*2:] + in_test = inputdata[pos*2:] + ipos, factor = audioop.findfit(in_test, out_test) + # Optional (for better cancellation): + # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)], + # out_test) + prefill = '\0'*(pos+ipos)*2 + postfill = '\0'*(len(inputdata)-len(prefill)-len(outputdata)) + outputdata = prefill + audioop.mul(outputdata,2,-factor) + postfill + return audioop.add(inputdata, outputdata, 2) + diff --git a/Doc/library/autogil.rst b/Doc/library/autogil.rst new file mode 100644 index 0000000..93f0d04 --- /dev/null +++ b/Doc/library/autogil.rst @@ -0,0 +1,30 @@ + +:mod:`autoGIL` --- Global Interpreter Lock handling in event loops +================================================================== + +.. module:: autoGIL + :platform: Mac + :synopsis: Global Interpreter Lock handling in event loops. +.. moduleauthor:: Just van Rossum <just@letterror.com> + + +The :mod:`autoGIL` module provides a function :func:`installAutoGIL` that +automatically locks and unlocks Python's Global Interpreter Lock when running an +event loop. + + +.. exception:: AutoGILError + + Raised if the observer callback cannot be installed, for example because the + current thread does not have a run loop. + + +.. 
function:: installAutoGIL() + + Install an observer callback in the event loop (CFRunLoop) for the current + thread, that will lock and unlock the Global Interpreter Lock (GIL) at + appropriate times, allowing other Python threads to run while the event loop is + idle. + + Availability: OSX 10.1 or later. + diff --git a/Doc/library/base64.rst b/Doc/library/base64.rst new file mode 100644 index 0000000..daa8fd5 --- /dev/null +++ b/Doc/library/base64.rst @@ -0,0 +1,172 @@ + +:mod:`base64` --- RFC 3548: Base16, Base32, Base64 Data Encodings +================================================================= + +.. module:: base64 + :synopsis: RFC 3548: Base16, Base32, Base64 Data Encodings + + +.. index:: + pair: base64; encoding + single: MIME; base64 encoding + +This module provides data encoding and decoding as specified in :rfc:`3548`. +This standard defines the Base16, Base32, and Base64 algorithms for encoding and +decoding arbitrary binary strings into text strings that can be safely sent by +email, used as parts of URLs, or included as part of an HTTP POST request. The +encoding algorithm is not the same as the :program:`uuencode` program. + +There are two interfaces provided by this module. The modern interface supports +encoding and decoding string objects using all three alphabets. The legacy +interface provides for encoding and decoding to and from file-like objects as +well as strings, but only using the Base64 standard alphabet. + +The modern interface, which was introduced in Python 2.4, provides: + + +.. function:: b64encode(s[, altchars]) + + Encode a string using Base64. + + *s* is the string to encode. Optional *altchars* must be a string of at least + length 2 (additional characters are ignored) which specifies an alternative + alphabet for the ``+`` and ``/`` characters. This allows an application to e.g. + generate URL or filesystem safe Base64 strings. The default is ``None``, for + which the standard Base64 alphabet is used. 
+ + The encoded string is returned. + + +.. function:: b64decode(s[, altchars]) + + Decode a Base64 encoded string. + + *s* is the string to decode. Optional *altchars* must be a string of at least + length 2 (additional characters are ignored) which specifies the alternative + alphabet used instead of the ``+`` and ``/`` characters. + + The decoded string is returned. A :exc:`TypeError` is raised if *s* were + incorrectly padded or if there are non-alphabet characters present in the + string. + + +.. function:: standard_b64encode(s) + + Encode string *s* using the standard Base64 alphabet. + + +.. function:: standard_b64decode(s) + + Decode string *s* using the standard Base64 alphabet. + + +.. function:: urlsafe_b64encode(s) + + Encode string *s* using a URL-safe alphabet, which substitutes ``-`` instead of + ``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet. + + +.. function:: urlsafe_b64decode(s) + + Decode string *s* using a URL-safe alphabet, which substitutes ``-`` instead of + ``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet. + + +.. function:: b32encode(s) + + Encode a string using Base32. *s* is the string to encode. The encoded string + is returned. + + +.. function:: b32decode(s[, casefold[, map01]]) + + Decode a Base32 encoded string. + + *s* is the string to decode. Optional *casefold* is a flag specifying whether a + lowercase alphabet is acceptable as input. For security purposes, the default + is ``False``. + + :rfc:`3548` allows for optional mapping of the digit 0 (zero) to the letter O + (oh), and for optional mapping of the digit 1 (one) to either the letter I (eye) + or letter L (el). The optional argument *map01* when not ``None``, specifies + which letter the digit 1 should be mapped to (when *map01* is not ``None``, the + digit 0 is always mapped to the letter O). For security purposes the default is + ``None``, so that 0 and 1 are not allowed in the input. + + The decoded string is returned. 
A :exc:`TypeError` is raised if *s* were + incorrectly padded or if there are non-alphabet characters present in the + string. + + +.. function:: b16encode(s) + + Encode a string using Base16. + + *s* is the string to encode. The encoded string is returned. + + +.. function:: b16decode(s[, casefold]) + + Decode a Base16 encoded string. + + *s* is the string to decode. Optional *casefold* is a flag specifying whether a + lowercase alphabet is acceptable as input. For security purposes, the default + is ``False``. + + The decoded string is returned. A :exc:`TypeError` is raised if *s* were + incorrectly padded or if there are non-alphabet characters present in the + string. + +The legacy interface: + + +.. function:: decode(input, output) + + Decode the contents of the *input* file and write the resulting binary data to + the *output* file. *input* and *output* must either be file objects or objects + that mimic the file object interface. *input* will be read until + ``input.read()`` returns an empty string. + + +.. function:: decodestring(s) + + Decode the string *s*, which must contain one or more lines of base64 encoded + data, and return a string containing the resulting binary data. + + +.. function:: encode(input, output) + + Encode the contents of the *input* file and write the resulting base64 encoded + data to the *output* file. *input* and *output* must either be file objects or + objects that mimic the file object interface. *input* will be read until + ``input.read()`` returns an empty string. :func:`encode` returns the encoded + data plus a trailing newline character (``'\n'``). + + +.. function:: encodestring(s) + + Encode the string *s*, which can contain arbitrary binary data, and return a + string containing one or more lines of base64-encoded data. + :func:`encodestring` returns a string containing one or more lines of + base64-encoded data always including an extra trailing newline (``'\n'``). 
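The alternative alphabets and the *casefold* flag of the modern interface can be illustrated with a short sketch. It assumes a current Python 3 interpreter, where these functions take and return bytes objects rather than the strings described in this text; the sample input is chosen so that the standard encoding contains both ``+`` and ``/``:

```python
import base64

raw = b"\xfb\xff\xfe"

# Standard alphabet: the result contains '+' and '/'.
standard = base64.b64encode(raw)

# URL-safe alphabet: '-' replaces '+' and '_' replaces '/'.
urlsafe = base64.urlsafe_b64encode(raw)

# The same substitution requested explicitly through altchars.
custom = base64.b64encode(raw, altchars=b"-_")

# Decoding must use the alphabet that was used for encoding.
roundtrip = base64.b64decode(custom, altchars=b"-_")

# Base16 output is upper case; lower-case input needs casefold=True.
hex_text = base64.b16encode(raw)
from_lower = base64.b16decode(hex_text.lower(), casefold=True)
```

Passing the same *altchars* value to both the encoder and the decoder is what keeps the round trip lossless.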
+ +An example usage of the module:: + + >>> import base64 + >>> encoded = base64.b64encode('data to be encoded') + >>> encoded + 'ZGF0YSB0byBiZSBlbmNvZGVk' + >>> data = base64.b64decode(encoded) + >>> data + 'data to be encoded' + + +.. seealso:: + + Module :mod:`binascii` + Support module containing ASCII-to-binary and binary-to-ASCII conversions. + + :rfc:`1521` - MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies + Section 5.2, "Base64 Content-Transfer-Encoding," provides the definition of the + base64 encoding. + diff --git a/Doc/library/basehttpserver.rst b/Doc/library/basehttpserver.rst new file mode 100644 index 0000000..2e8d6a3 --- /dev/null +++ b/Doc/library/basehttpserver.rst @@ -0,0 +1,254 @@ + +:mod:`BaseHTTPServer` --- Basic HTTP server +=========================================== + +.. module:: BaseHTTPServer + :synopsis: Basic HTTP server (base class for SimpleHTTPServer and CGIHTTPServer). + + +.. index:: + pair: WWW; server + pair: HTTP; protocol + single: URL + single: httpd + +.. index:: + module: SimpleHTTPServer + module: CGIHTTPServer + +This module defines two classes for implementing HTTP servers (Web servers). +Usually, this module isn't used directly, but is used as a basis for building +functioning Web servers. See the :mod:`SimpleHTTPServer` and +:mod:`CGIHTTPServer` modules. + +The first class, :class:`HTTPServer`, is a :class:`SocketServer.TCPServer` +subclass. It creates and listens at the HTTP socket, dispatching the requests +to a handler. Code to create and run the server looks like this:: + + def run(server_class=BaseHTTPServer.HTTPServer, + handler_class=BaseHTTPServer.BaseHTTPRequestHandler): + server_address = ('', 8000) + httpd = server_class(server_address, handler_class) + httpd.serve_forever() + + +.. 
class:: HTTPServer(server_address, RequestHandlerClass) + + This class builds on the :class:`TCPServer` class by storing the server address + as instance variables named :attr:`server_name` and :attr:`server_port`. The + server is accessible by the handler, typically through the handler's + :attr:`server` instance variable. + + +.. class:: BaseHTTPRequestHandler(request, client_address, server) + + This class is used to handle the HTTP requests that arrive at the server. By + itself, it cannot respond to any actual HTTP requests; it must be subclassed to + handle each request method (e.g. GET or POST). :class:`BaseHTTPRequestHandler` + provides a number of class and instance variables, and methods for use by + subclasses. + + The handler will parse the request and the headers, then call a method specific + to the request type. The method name is constructed from the request. For + example, for the request method ``SPAM``, the :meth:`do_SPAM` method will be + called with no arguments. All of the relevant information is stored in instance + variables of the handler. Subclasses should not need to override or extend the + :meth:`__init__` method. + +:class:`BaseHTTPRequestHandler` has the following instance variables: + + +.. attribute:: BaseHTTPRequestHandler.client_address + + Contains a tuple of the form ``(host, port)`` referring to the client's address. + + +.. attribute:: BaseHTTPRequestHandler.command + + Contains the command (request type). For example, ``'GET'``. + + +.. attribute:: BaseHTTPRequestHandler.path + + Contains the request path. + + +.. attribute:: BaseHTTPRequestHandler.request_version + + Contains the version string from the request. For example, ``'HTTP/1.0'``. + + +.. attribute:: BaseHTTPRequestHandler.headers + + Holds an instance of the class specified by the :attr:`MessageClass` class + variable. This instance parses and manages the headers in the HTTP request. + + +.. 
attribute:: BaseHTTPRequestHandler.rfile + + Contains an input stream, positioned at the start of the optional input data. + + +.. attribute:: BaseHTTPRequestHandler.wfile + + Contains the output stream for writing a response back to the client. Proper + adherence to the HTTP protocol must be used when writing to this stream. + +:class:`BaseHTTPRequestHandler` has the following class variables: + + +.. attribute:: BaseHTTPRequestHandler.server_version + + Specifies the server software version. You may want to override this. The + format is multiple whitespace-separated strings, where each string is of the + form name[/version]. For example, ``'BaseHTTP/0.2'``. + + +.. attribute:: BaseHTTPRequestHandler.sys_version + + Contains the Python system version, in a form usable by the + :attr:`version_string` method and the :attr:`server_version` class variable. For + example, ``'Python/1.4'``. + + +.. attribute:: BaseHTTPRequestHandler.error_message_format + + Specifies a format string for building an error response to the client. It uses + parenthesized, keyed format specifiers, so the format operand must be a + dictionary. The *code* key should be an integer, specifying the numeric HTTP + error code value. *message* should be a string containing a (detailed) error + message of what occurred, and *explain* should be an explanation of the error + code number. Default *message* and *explain* values can be found in the *responses* + class variable. + + +.. attribute:: BaseHTTPRequestHandler.protocol_version + + This specifies the HTTP protocol version used in responses. If set to + ``'HTTP/1.1'``, the server will permit HTTP persistent connections; however, + your server *must* then include an accurate ``Content-Length`` header (using + :meth:`send_header`) in all of its responses to clients. For backwards + compatibility, the setting defaults to ``'HTTP/1.0'``. + + +.. attribute:: BaseHTTPRequestHandler.MessageClass + + .. 
index:: single: Message (in module mimetools) + + Specifies a :class:`rfc822.Message`\ -like class to parse HTTP headers. + Typically, this is not overridden, and it defaults to + :class:`mimetools.Message`. + + +.. attribute:: BaseHTTPRequestHandler.responses + + This variable contains a mapping of error code integers to two-element tuples + containing a short and long message. For example, ``{code: (shortmessage, + longmessage)}``. The *shortmessage* is usually used as the *message* key in an + error response, and *longmessage* as the *explain* key (see the + :attr:`error_message_format` class variable). + +A :class:`BaseHTTPRequestHandler` instance has the following methods: + + +.. method:: BaseHTTPRequestHandler.handle() + + Calls :meth:`handle_one_request` once (or, if persistent connections are + enabled, multiple times) to handle incoming HTTP requests. You should never need + to override it; instead, implement appropriate :meth:`do_\*` methods. + + +.. method:: BaseHTTPRequestHandler.handle_one_request() + + This method will parse and dispatch the request to the appropriate :meth:`do_\*` + method. You should never need to override it. + + +.. method:: BaseHTTPRequestHandler.send_error(code[, message]) + + Sends and logs a complete error reply to the client. The numeric *code* + specifies the HTTP error code, with *message* as optional, more specific text. A + complete set of headers is sent, followed by text composed using the + :attr:`error_message_format` class variable. + + +.. method:: BaseHTTPRequestHandler.send_response(code[, message]) + + Sends a response header and logs the accepted request. The HTTP response line is + sent, followed by *Server* and *Date* headers. The values for these two headers + are picked up from the :meth:`version_string` and :meth:`date_time_string` + methods, respectively. + + +.. method:: BaseHTTPRequestHandler.send_header(keyword, value) + + Writes a specific HTTP header to the output stream. 
*keyword* should specify the + header keyword, with *value* specifying its value. + + +.. method:: BaseHTTPRequestHandler.end_headers() + + Sends a blank line, indicating the end of the HTTP headers in the response. + + +.. method:: BaseHTTPRequestHandler.log_request([code[, size]]) + + Logs an accepted (successful) request. *code* should specify the numeric HTTP + code associated with the response. If a size of the response is available, then + it should be passed as the *size* parameter. + + +.. method:: BaseHTTPRequestHandler.log_error(...) + + Logs an error when a request cannot be fulfilled. By default, it passes the + message to :meth:`log_message`, so it takes the same arguments (*format* and + additional values). + + +.. method:: BaseHTTPRequestHandler.log_message(format, ...) + + Logs an arbitrary message to ``sys.stderr``. This is typically overridden to + create custom error logging mechanisms. The *format* argument is a standard + printf-style format string, where the additional arguments to + :meth:`log_message` are applied as inputs to the formatting. The client address + and current date and time are prefixed to every message logged. + + +.. method:: BaseHTTPRequestHandler.version_string() + + Returns the server software's version string. This is a combination of the + :attr:`server_version` and :attr:`sys_version` class variables. + + +.. method:: BaseHTTPRequestHandler.date_time_string([timestamp]) + + Returns the date and time given by *timestamp* (which must be in the format + returned by :func:`time.time`), formatted for a message header. If *timestamp* + is omitted, it uses the current date and time. + + The result looks like ``'Sun, 06 Nov 1994 08:49:37 GMT'``. + + .. versionadded:: 2.5 + The *timestamp* parameter. + + +.. method:: BaseHTTPRequestHandler.log_date_time_string() + + Returns the current date and time, formatted for logging. + + +.. 
method:: BaseHTTPRequestHandler.address_string() + + Returns the client address, formatted for logging. A name lookup is performed on + the client's IP address. + + +.. seealso:: + + Module :mod:`CGIHTTPServer` + Extended request handler that supports CGI scripts. + + Module :mod:`SimpleHTTPServer` + Basic request handler that limits response to files actually under the document + root. + diff --git a/Doc/library/binascii.rst b/Doc/library/binascii.rst new file mode 100644 index 0000000..ffea232 --- /dev/null +++ b/Doc/library/binascii.rst @@ -0,0 +1,161 @@ + +:mod:`binascii` --- Convert between binary and ASCII +==================================================== + +.. module:: binascii + :synopsis: Tools for converting between binary and various ASCII-encoded binary + representations. + + +.. index:: + module: uu + module: base64 + module: binhex + +The :mod:`binascii` module contains a number of methods to convert between +binary and various ASCII-encoded binary representations. Normally, you will not +use these functions directly but use wrapper modules like :mod:`uu`, +:mod:`base64`, or :mod:`binhex` instead. The :mod:`binascii` module contains +low-level functions written in C for greater speed that are used by the +higher-level modules. + +The :mod:`binascii` module defines the following functions: + + +.. function:: a2b_uu(string) + + Convert a single line of uuencoded data back to binary and return the binary + data. Lines normally contain 45 (binary) bytes, except for the last line. Line + data may be followed by whitespace. + + +.. function:: b2a_uu(data) + + Convert binary data to a line of ASCII characters, the return value is the + converted line, including a newline char. The length of *data* should be at most + 45. + + +.. function:: a2b_base64(string) + + Convert a block of base64 data back to binary and return the binary data. More + than one line may be passed at a time. + + +.. 
function:: b2a_base64(data) + + Convert binary data to a line of ASCII characters in base64 coding. The return + value is the converted line, including a newline char. The length of *data* + should be at most 57 to adhere to the base64 standard. + + +.. function:: a2b_qp(string[, header]) + + Convert a block of quoted-printable data back to binary and return the binary + data. More than one line may be passed at a time. If the optional argument + *header* is present and true, underscores will be decoded as spaces. + + +.. function:: b2a_qp(data[, quotetabs, istext, header]) + + Convert binary data to one or more lines of ASCII characters in quoted-printable + encoding. The return value is the converted line(s). If the optional argument + *quotetabs* is present and true, all tabs and spaces will be encoded. If the + optional argument *istext* is present and true, newlines are not encoded but + trailing whitespace will be encoded. If the optional argument *header* is + present and true, spaces will be encoded as underscores per RFC 1522. If the + optional argument *header* is present and false, newline characters will be + encoded as well; otherwise linefeed conversion might corrupt the binary data + stream. + + +.. function:: a2b_hqx(string) + + Convert binhex4 formatted ASCII data to binary, without doing RLE-decompression. + The string should contain a complete number of binary bytes, or (in case of the + last portion of the binhex4 data) have the remaining bits zero. + + +.. function:: rledecode_hqx(data) + + Perform RLE-decompression on the data, as per the binhex4 standard. The + algorithm uses ``0x90`` after a byte as a repeat indicator, followed by a count. + A count of ``0`` specifies a byte value of ``0x90``. The routine returns the + decompressed data, unless the input data ends in an orphaned repeat indicator, + in which case the :exc:`Incomplete` exception is raised. + + +.. 
function:: rlecode_hqx(data) + + Perform binhex4 style RLE-compression on *data* and return the result. + + +.. function:: b2a_hqx(data) + + Perform binhex4 binary-to-ASCII translation and return the resulting string. The + argument should already be RLE-coded, and have a length divisible by 3 (except + possibly the last fragment). + + +.. function:: crc_hqx(data, crc) + + Compute the binhex4 crc value of *data*, starting with an initial *crc* and + returning the result. + + +.. function:: crc32(data[, crc]) + + Compute CRC-32, the 32-bit checksum of *data*, starting with an initial *crc*. This + is consistent with the ZIP file checksum. Since the algorithm is designed for + use as a checksum algorithm, it is not suitable for use as a general hash + algorithm. Use as follows:: + + print binascii.crc32("hello world") + # Or, in two pieces: + crc = binascii.crc32("hello") + crc = binascii.crc32(" world", crc) + print crc + + +.. function:: b2a_hex(data) + hexlify(data) + + Return the hexadecimal representation of the binary *data*. Every byte of + *data* is converted into the corresponding 2-digit hex representation. The + resulting string is therefore twice as long as the length of *data*. + + +.. function:: a2b_hex(hexstr) + unhexlify(hexstr) + + Return the binary data represented by the hexadecimal string *hexstr*. This + function is the inverse of :func:`b2a_hex`. *hexstr* must contain an even number + of hexadecimal digits (which can be upper or lower case), otherwise a + :exc:`TypeError` is raised. + + +.. exception:: Error + + Exception raised on errors. These are usually programming errors. + + +.. exception:: Incomplete + + Exception raised on incomplete data. These are usually not programming errors, + but may be handled by reading a little more data and trying again. + + +.. seealso:: + + Module :mod:`base64` + Support for base64 encoding used in MIME email messages. + + Module :mod:`binhex` + Support for the binhex format used on the Macintosh. 
+ + Module :mod:`uu` + Support for UU encoding used on Unix. + + Module :mod:`quopri` + Support for quoted-printable encoding used in MIME email messages. + diff --git a/Doc/library/binhex.rst b/Doc/library/binhex.rst new file mode 100644 index 0000000..3b0485c --- /dev/null +++ b/Doc/library/binhex.rst @@ -0,0 +1,59 @@ + +:mod:`binhex` --- Encode and decode binhex4 files +================================================= + +.. module:: binhex + :synopsis: Encode and decode files in binhex4 format. + + +This module encodes and decodes files in binhex4 format, a format allowing +representation of Macintosh files in ASCII. On the Macintosh, both forks of a +file and the finder information are encoded (or decoded); on other platforms +only the data fork is handled. + +The :mod:`binhex` module defines the following functions: + + +.. function:: binhex(input, output) + + Convert a binary file with filename *input* to binhex file *output*. The + *output* parameter can either be a filename or a file-like object (any object + supporting a :meth:`write` and :meth:`close` method). + + +.. function:: hexbin(input[, output]) + + Decode a binhex file *input*. *input* may be a filename or a file-like object + supporting :meth:`read` and :meth:`close` methods. The resulting file is written + to a file named *output*, unless the argument is omitted in which case the + output filename is read from the binhex file. + +The following exception is also defined: + + +.. exception:: Error + + Exception raised when something can't be encoded using the binhex format (for + example, a filename is too long to fit in the filename field), or when input is + not properly encoded binhex data. + + +.. seealso:: + + Module :mod:`binascii` + Support module containing ASCII-to-binary and binary-to-ASCII conversions. + + +.. _binhex-notes: + +Notes +----- + +There is an alternative, more powerful interface to the coder and decoder; see +the source for details. 
+ +If you code or decode textfiles on non-Macintosh platforms they will still use +the Macintosh newline convention (carriage-return as end of line). + +As of this writing, :func:`hexbin` appears to not work in all cases. + diff --git a/Doc/library/bisect.rst b/Doc/library/bisect.rst new file mode 100644 index 0000000..b8eb348 --- /dev/null +++ b/Doc/library/bisect.rst @@ -0,0 +1,92 @@ + +:mod:`bisect` --- Array bisection algorithm +=========================================== + +.. module:: bisect + :synopsis: Array bisection algorithms for binary searching. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. % LaTeX produced by Fred L. Drake, Jr. <fdrake@acm.org>, with an +.. % example based on the PyModules FAQ entry by Aaron Watters +.. % <arw@pythonpros.com>. + +This module provides support for maintaining a list in sorted order without +having to sort the list after each insertion. For long lists of items with +expensive comparison operations, this can be an improvement over the more common +approach. The module is called :mod:`bisect` because it uses a basic bisection +algorithm to do its work. The source code may be most useful as a working +example of the algorithm (the boundary conditions are already right!). + +The following functions are provided: + + +.. function:: bisect_left(list, item[, lo[, hi]]) + + Locate the proper insertion point for *item* in *list* to maintain sorted order. + The parameters *lo* and *hi* may be used to specify a subset of the list which + should be considered; by default the entire list is used. If *item* is already + present in *list*, the insertion point will be before (to the left of) any + existing entries. The return value is suitable for use as the first parameter + to ``list.insert()``. This assumes that *list* is already sorted. + + .. versionadded:: 2.1 + + +.. 
function:: bisect_right(list, item[, lo[, hi]]) + + Similar to :func:`bisect_left`, but returns an insertion point which comes after + (to the right of) any existing entries of *item* in *list*. + + .. versionadded:: 2.1 + + +.. function:: bisect(...) + + Alias for :func:`bisect_right`. + + +.. function:: insort_left(list, item[, lo[, hi]]) + + Insert *item* in *list* in sorted order. This is equivalent to + ``list.insert(bisect.bisect_left(list, item, lo, hi), item)``. This assumes + that *list* is already sorted. + + .. versionadded:: 2.1 + + +.. function:: insort_right(list, item[, lo[, hi]]) + + Similar to :func:`insort_left`, but inserting *item* in *list* after any + existing entries of *item*. + + .. versionadded:: 2.1 + + +.. function:: insort(...) + + Alias for :func:`insort_right`. + + +Examples +-------- + +.. _bisect-example: + +The :func:`bisect` function is generally useful for categorizing numeric data. +This example uses :func:`bisect` to look up a letter grade for an exam total +(say) based on a set of ordered numeric breakpoints: 85 and up is an 'A', 75..84 +is a 'B', etc. :: + + >>> grades = "FEDCBA" + >>> breakpoints = [30, 44, 66, 75, 85] + >>> from bisect import bisect + >>> def grade(total): + ... return grades[bisect(breakpoints, total)] + ... + >>> grade(66) + 'C' + >>> map(grade, [33, 99, 77, 44, 12, 88]) + ['E', 'A', 'B', 'D', 'F', 'A'] + + diff --git a/Doc/library/bsddb.rst b/Doc/library/bsddb.rst new file mode 100644 index 0000000..55b7c7d --- /dev/null +++ b/Doc/library/bsddb.rst @@ -0,0 +1,211 @@ + +:mod:`bsddb` --- Interface to Berkeley DB library +================================================= + +.. module:: bsddb + :synopsis: Interface to Berkeley DB database library +.. sectionauthor:: Skip Montanaro <skip@mojam.com> + + +The :mod:`bsddb` module provides an interface to the Berkeley DB library. Users +can create hash, btree or record based library files using the appropriate open +call. 
Bsddb objects behave generally like dictionaries. Keys and values must be +strings, however, so to use other objects as keys or to store other kinds of +objects the user must serialize them somehow, typically using +:func:`marshal.dumps` or :func:`pickle.dumps`. + +The :mod:`bsddb` module requires a Berkeley DB library version from 3.3 through +4.5. + + +.. seealso:: + + http://pybsddb.sourceforge.net/ + The website with documentation for the :mod:`bsddb.db` Python Berkeley DB + interface that closely mirrors the object oriented interface provided in + Berkeley DB 3 and 4. + + http://www.oracle.com/database/berkeley-db/ + The Berkeley DB library. + +A more modern DB, DBEnv and DBSequence object interface is available in the +:mod:`bsddb.db` module which closely matches the Berkeley DB C API documented at +the above URLs. Additional features provided by the :mod:`bsddb.db` API include +fine tuning, transactions, logging, and multiprocess concurrent database access. + +The following is a description of the legacy :mod:`bsddb` interface compatible +with the old Python bsddb module. Starting in Python 2.5 this interface should +be safe for multithreaded access. The :mod:`bsddb.db` API is recommended for +threading users as it provides better control. + +The :mod:`bsddb` module defines the following functions that create objects that +access the appropriate type of Berkeley DB file. The first two arguments of +each function are the same. For ease of portability, only the first two +arguments should be used in most instances. + + +.. function:: hashopen(filename[, flag[, mode[, pgsize[, ffactor[, nelem[, cachesize[, lorder[, hflags]]]]]]]]) + + Open the hash format file named *filename*. Files never intended to be + preserved on disk may be created by passing ``None`` as the *filename*. The + optional *flag* identifies the mode used to open the file. 
It may be ``'r'`` + (read only), ``'w'`` (read-write) , ``'c'`` (read-write - create if necessary; + the default) or ``'n'`` (read-write - truncate to zero length). The other + arguments are rarely used and are just passed to the low-level :cfunc:`dbopen` + function. Consult the Berkeley DB documentation for their use and + interpretation. + + +.. function:: btopen(filename[, flag[, mode[, btflags[, cachesize[, maxkeypage[, minkeypage[, pgsize[, lorder]]]]]]]]) + + Open the btree format file named *filename*. Files never intended to be + preserved on disk may be created by passing ``None`` as the *filename*. The + optional *flag* identifies the mode used to open the file. It may be ``'r'`` + (read only), ``'w'`` (read-write), ``'c'`` (read-write - create if necessary; + the default) or ``'n'`` (read-write - truncate to zero length). The other + arguments are rarely used and are just passed to the low-level dbopen function. + Consult the Berkeley DB documentation for their use and interpretation. + + +.. function:: rnopen(filename[, flag[, mode[, rnflags[, cachesize[, pgsize[, lorder[, rlen[, delim[, source[, pad]]]]]]]]]]) + + Open a DB record format file named *filename*. Files never intended to be + preserved on disk may be created by passing ``None`` as the *filename*. The + optional *flag* identifies the mode used to open the file. It may be ``'r'`` + (read only), ``'w'`` (read-write), ``'c'`` (read-write - create if necessary; + the default) or ``'n'`` (read-write - truncate to zero length). The other + arguments are rarely used and are just passed to the low-level dbopen function. + Consult the Berkeley DB documentation for their use and interpretation. + + +.. class:: StringKeys(db) + + Wrapper class around a DB object that supports string keys (rather than bytes). + All keys are encoded as UTF-8, then passed to the underlying object. + + .. versionadded:: 3.0 + + +.. 
class:: StringValues(db) + + Wrapper class around a DB object that supports string values (rather than bytes). + All values are encoded as UTF-8, then passed to the underlying object. + + .. versionadded:: 3.0 + + +.. seealso:: + + Module :mod:`dbhash` + DBM-style interface to the :mod:`bsddb` module. + + +.. _bsddb-objects: + +Hash, BTree and Record Objects +------------------------------ + +Once instantiated, hash, btree and record objects support the same methods as +dictionaries. In addition, they support the methods listed below. + +.. versionchanged:: 2.3.1 + Added dictionary methods. + + +.. method:: bsddbobject.close() + + Close the underlying file. The object can no longer be accessed. Since there + is no :meth:`open` method for these objects, to open the file again a new + :mod:`bsddb` module open function must be called. + + +.. method:: bsddbobject.keys() + + Return the list of keys contained in the DB file. The order of the list is + unspecified and should not be relied on. In particular, the order of the list + returned is different for different file formats. + + +.. method:: bsddbobject.has_key(key) + + Return ``1`` if the DB file contains the argument as a key. + + +.. method:: bsddbobject.set_location(key) + + Set the cursor to the item indicated by *key* and return a tuple containing the + key and its value. For binary tree databases (opened using :func:`btopen`), if + *key* does not actually exist in the database, the cursor will point to the next + item in sorted order and return that key and value. For other databases, + :exc:`KeyError` will be raised if *key* is not found in the database. + + +.. method:: bsddbobject.first() + + Set the cursor to the first item in the DB file and return it. The order of + keys in the file is unspecified, except in the case of B-Tree databases. This + method raises :exc:`bsddb.error` if the database is empty. + + +.. method:: bsddbobject.next() + + Set the cursor to the next item in the DB file and return it. 
The order of + keys in the file is unspecified, except in the case of B-Tree databases. + + +.. method:: bsddbobject.previous() + + Set the cursor to the previous item in the DB file and return it. The order of + keys in the file is unspecified, except in the case of B-Tree databases. This + is not supported on hashtable databases (those opened with :func:`hashopen`). + + +.. method:: bsddbobject.last() + + Set the cursor to the last item in the DB file and return it. The order of keys + in the file is unspecified. This is not supported on hashtable databases (those + opened with :func:`hashopen`). This method raises :exc:`bsddb.error` if the + database is empty. + + +.. method:: bsddbobject.sync() + + Synchronize the database on disk. + +Example:: + + >>> import bsddb + >>> db = bsddb.btopen('/tmp/spam.db', 'c') + >>> for i in range(10): db['%d'%i] = '%d'% (i*i) + ... + >>> db['3'] + '9' + >>> db.keys() + ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'] + >>> db.first() + ('0', '0') + >>> db.next() + ('1', '1') + >>> db.last() + ('9', '81') + >>> db.set_location('2') + ('2', '4') + >>> db.previous() + ('1', '1') + >>> for k, v in db.iteritems(): + ... print k, v + 0 0 + 1 1 + 2 4 + 3 9 + 4 16 + 5 25 + 6 36 + 7 49 + 8 64 + 9 81 + >>> '8' in db + True + >>> db.sync() + 0 + diff --git a/Doc/library/bz2.rst b/Doc/library/bz2.rst new file mode 100644 index 0000000..a8c0911 --- /dev/null +++ b/Doc/library/bz2.rst @@ -0,0 +1,181 @@ + +:mod:`bz2` --- Compression compatible with :program:`bzip2` +=========================================================== + +.. module:: bz2 + :synopsis: Interface to compression and decompression routines compatible with bzip2. +.. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com> +.. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com> + + +.. versionadded:: 2.3 + +This module provides a comprehensive interface for the bz2 compression library. 
+It implements a complete file interface, one-shot (de)compression functions, and +types for sequential (de)compression. + +Here is a summary of the features offered by the bz2 module: + +* :class:`BZ2File` class implements a complete file interface, including + :meth:`readline`, :meth:`readlines`, :meth:`writelines`, :meth:`seek`, etc; + +* :class:`BZ2File` class implements emulated :meth:`seek` support; + +* :class:`BZ2File` class implements universal newline support; + +* :class:`BZ2File` class offers an optimized line iteration using the readahead + algorithm borrowed from file objects; + +* Sequential (de)compression supported by :class:`BZ2Compressor` and + :class:`BZ2Decompressor` classes; + +* One-shot (de)compression supported by :func:`compress` and :func:`decompress` + functions; + +* Thread safety is provided by an individual locking mechanism; + +* Complete inline documentation. + + +(De)compression of files +------------------------ + +Handling of compressed files is offered by the :class:`BZ2File` class. + + +.. class:: BZ2File(filename[, mode[, buffering[, compresslevel]]]) + + Open a bz2 file. *mode* can be either ``'r'`` or ``'w'``, for reading (default) + or writing. When opened for writing, the file will be created if it doesn't + exist, and truncated otherwise. If *buffering* is given, ``0`` means unbuffered, + and larger numbers specify the buffer size; the default is ``0``. If + *compresslevel* is given, it must be a number between ``1`` and ``9``; the + default is ``9``. Add a ``'U'`` to mode to open the file for input with + universal newline support. Any line ending in the input file will be seen as a + ``'\n'`` in Python. Also, a file so opened gains the attribute + :attr:`newlines`; the value for this attribute is one of ``None`` (no newline + read yet), ``'\r'``, ``'\n'``, ``'\r\n'`` or a tuple containing all the newline + types seen. Universal newlines are available only when reading. 
Instances + support iteration in the same way as normal :class:`file` instances. + + +.. method:: BZ2File.close() + + Close the file. Sets data attribute :attr:`closed` to true. A closed file cannot + be used for further I/O operations. :meth:`close` may be called more than once + without error. + + +.. method:: BZ2File.read([size]) + + Read at most *size* uncompressed bytes, returned as a string. If the *size* + argument is negative or omitted, read until EOF is reached. + + +.. method:: BZ2File.readline([size]) + + Return the next line from the file, as a string, retaining newline. A + non-negative *size* argument limits the maximum number of bytes to return (an + incomplete line may be returned then). Return an empty string at EOF. + + +.. method:: BZ2File.readlines([size]) + + Return a list of lines read. The optional *size* argument, if given, is an + approximate bound on the total number of bytes in the lines returned. + + +.. method:: BZ2File.seek(offset[, whence]) + + Move to new file position. Argument *offset* is a byte count. Optional argument + *whence* defaults to ``os.SEEK_SET`` or ``0`` (offset from start of file; offset + should be ``>= 0``); other values are ``os.SEEK_CUR`` or ``1`` (move relative to + current position; offset can be positive or negative), and ``os.SEEK_END`` or + ``2`` (move relative to end of file; offset is usually negative, although many + platforms allow seeking beyond the end of a file). + + Note that seeking of bz2 files is emulated, and depending on the parameters the + operation may be extremely slow. + + +.. method:: BZ2File.tell() + + Return the current file position, an integer (may be a long integer). + + +.. method:: BZ2File.write(data) + + Write string *data* to file. Note that due to buffering, :meth:`close` may be + needed before the file on disk reflects the data written. + + +.. method:: BZ2File.writelines(sequence_of_strings) + + Write the sequence of strings to the file. Note that newlines are not added. 
The + sequence can be any iterable object producing strings. This is equivalent to + calling write() for each string. + + +Sequential (de)compression +-------------------------- + +Sequential compression and decompression is done using the classes +:class:`BZ2Compressor` and :class:`BZ2Decompressor`. + + +.. class:: BZ2Compressor([compresslevel]) + + Create a new compressor object. This object may be used to compress data + sequentially. If you want to compress data in one shot, use the :func:`compress` + function instead. The *compresslevel* parameter, if given, must be a number + between ``1`` and ``9``; the default is ``9``. + + +.. method:: BZ2Compressor.compress(data) + + Provide more data to the compressor object. It will return chunks of compressed + data whenever possible. When you've finished providing data to compress, call + the :meth:`flush` method to finish the compression process, and return what is + left in internal buffers. + + +.. method:: BZ2Compressor.flush() + + Finish the compression process and return what is left in internal buffers. You + must not use the compressor object after calling this method. + + +.. class:: BZ2Decompressor() + + Create a new decompressor object. This object may be used to decompress data + sequentially. If you want to decompress data in one shot, use the + :func:`decompress` function instead. + + +.. method:: BZ2Decompressor.decompress(data) + + Provide more data to the decompressor object. It will return chunks of + decompressed data whenever possible. If you try to decompress data after the end + of stream is found, :exc:`EOFError` will be raised. If any data was found after + the end of stream, it'll be ignored and saved in :attr:`unused_data` attribute. + + +One-shot (de)compression +------------------------ + +One-shot compression and decompression is provided through the :func:`compress` +and :func:`decompress` functions. + + +.. function:: compress(data[, compresslevel]) + + Compress *data* in one shot. 
If you want to compress data sequentially, use an + instance of :class:`BZ2Compressor` instead. The *compresslevel* parameter, if + given, must be a number between ``1`` and ``9``; the default is ``9``. + + +.. function:: decompress(data) + + Decompress *data* in one shot. If you want to decompress data sequentially, use + an instance of :class:`BZ2Decompressor` instead. + diff --git a/Doc/library/calendar.rst b/Doc/library/calendar.rst new file mode 100644 index 0000000..68cbeb6 --- /dev/null +++ b/Doc/library/calendar.rst @@ -0,0 +1,326 @@ + +:mod:`calendar` --- General calendar-related functions +====================================================== + +.. module:: calendar + :synopsis: Functions for working with calendars, including some emulation of the Unix cal + program. +.. sectionauthor:: Drew Csillag <drew_csillag@geocities.com> + + +This module allows you to output calendars like the Unix :program:`cal` program, +and provides additional useful functions related to the calendar. By default, +these calendars have Monday as the first day of the week, and Sunday as the last +(the European convention). Use :func:`setfirstweekday` to set the first day of +the week to Sunday (6) or to any other weekday. Parameters that specify dates +are given as integers. For related +functionality, see also the :mod:`datetime` and :mod:`time` modules. + +Most of these functions and classes rely on the :mod:`datetime` module which +uses an idealized calendar, the current Gregorian calendar indefinitely extended +in both directions. This matches the definition of the "proleptic Gregorian" +calendar in Dershowitz and Reingold's book "Calendrical Calculations", where +it's the base calendar for all computations. + + +.. class:: Calendar([firstweekday]) + + Creates a :class:`Calendar` object. *firstweekday* is an integer specifying the + first day of the week. ``0`` is Monday (the default), ``6`` is Sunday.
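As a brief illustration of the *firstweekday* argument, here is a minimal sketch. It assumes a current Python interpreter; the variable names are chosen only for this example. It builds a :class:`Calendar` whose weeks start on Sunday and asks it for the weekday order and for one month's weeks.

```python
import calendar

# Build a calendar whose weeks start on Sunday (6) rather than the default Monday (0).
cal = calendar.Calendar(firstweekday=calendar.SUNDAY)

# The order in which weekday numbers appear within one week.
weekday_order = list(cal.iterweekdays())

# Weeks of February 2007 as lists of seven day numbers;
# days that fall outside the month are reported as 0.
feb_2007 = cal.monthdayscalendar(2007, 2)

print(weekday_order)
for week in feb_2007:
    print(week)
```

Because the object only prepares data, the same lists can be handed to any formatter, including the text and HTML subclasses described in this chapter.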
+ + A :class:`Calendar` object provides several methods that can be used for + preparing the calendar data for formatting. This class doesn't do any formatting + itself. This is the job of subclasses. + + .. versionadded:: 2.5 + +:class:`Calendar` instances have the following methods: + + +.. method:: Calendar.iterweekdays() + + Return an iterator for the week day numbers that will be used for one week. The + first number from the iterator will be the same as the value of the + :attr:`firstweekday` attribute. + + +.. method:: Calendar.itermonthdates(year, month) + + Return an iterator for the month *month* (1-12) in the year *year*. This + iterator will return all days (as :class:`datetime.date` objects) for the month + and all days before the start of the month or after the end of the month that + are required to get a complete week. + + +.. method:: Calendar.itermonthdays2(year, month) + + Return an iterator for the month *month* in the year *year* similar to + :meth:`itermonthdates`. Days returned will be tuples consisting of a day number + and a week day number. + + +.. method:: Calendar.itermonthdays(year, month) + + Return an iterator for the month *month* in the year *year* similar to + :meth:`itermonthdates`. Days returned will simply be day numbers. + + +.. method:: Calendar.monthdatescalendar(year, month) + + Return a list of the weeks in the month *month* of the *year* as full weeks. + Weeks are lists of seven :class:`datetime.date` objects. + + +.. method:: Calendar.monthdays2calendar(year, month) + + Return a list of the weeks in the month *month* of the *year* as full weeks. + Weeks are lists of seven tuples of day numbers and weekday numbers. + + +.. method:: Calendar.monthdayscalendar(year, month) + + Return a list of the weeks in the month *month* of the *year* as full weeks. + Weeks are lists of seven day numbers. + + +.. method:: Calendar.yeardatescalendar(year[, width]) + + Return the data for the specified year ready for formatting.
The return value is + a list of month rows. Each month row contains up to *width* months (defaulting + to 3). Each month contains between 4 and 6 weeks and each week contains 1--7 + days. Days are :class:`datetime.date` objects. + + +.. method:: Calendar.yeardays2calendar(year[, width]) + + Return the data for the specified year ready for formatting (similar to + :meth:`yeardatescalendar`). Entries in the week lists are tuples of day numbers + and weekday numbers. Day numbers outside this month are zero. + + +.. method:: Calendar.yeardayscalendar(year[, width]) + + Return the data for the specified year ready for formatting (similar to + :meth:`yeardatescalendar`). Entries in the week lists are day numbers. Day + numbers outside this month are zero. + + +.. class:: TextCalendar([firstweekday]) + + This class can be used to generate plain text calendars. + + .. versionadded:: 2.5 + +:class:`TextCalendar` instances have the following methods: + + +.. method:: TextCalendar.formatmonth(theyear, themonth[, w[, l]]) + + Return a month's calendar in a multi-line string. If *w* is provided, it + specifies the width of the date columns, which are centered. If *l* is given, it + specifies the number of lines that each week will use. Depends on the first + weekday as set by :func:`setfirstweekday`. + + +.. method:: TextCalendar.prmonth(theyear, themonth[, w[, l]]) + + Print a month's calendar as returned by :meth:`formatmonth`. + + +.. method:: TextCalendar.formatyear(theyear[, w[, l[, c[, m]]]]) + + Return an *m*-column calendar for an entire year as a multi-line string. Optional + parameters *w*, *l*, and *c* are for date column width, lines per week, and + number of spaces between month columns, respectively. Depends on the first + weekday as set by :meth:`setfirstweekday`. The earliest year for which a + calendar can be generated is platform-dependent. + + +.. 
method:: TextCalendar.pryear(theyear[, w[, l[, c[, m]]]]) + + Print the calendar for an entire year as returned by :meth:`formatyear`. + + +.. class:: HTMLCalendar([firstweekday]) + + This class can be used to generate HTML calendars. + + .. versionadded:: 2.5 + +:class:`HTMLCalendar` instances have the following methods: + + +.. method:: HTMLCalendar.formatmonth(theyear, themonth[, withyear]) + + Return a month's calendar as an HTML table. If *withyear* is true the year will + be included in the header, otherwise just the month name will be used. + + +.. method:: HTMLCalendar.formatyear(theyear[, width]) + + Return a year's calendar as an HTML table. *width* (defaulting to 3) specifies + the number of months per row. + + +.. method:: HTMLCalendar.formatyearpage(theyear[, width[, css[, encoding]]]) + + Return a year's calendar as a complete HTML page. *width* (defaulting to 3) + specifies the number of months per row. *css* is the name for the cascading + style sheet to be used. :const:`None` can be passed if no style sheet should be + used. *encoding* specifies the encoding to be used for the output (defaulting to + the system default encoding). + + +.. class:: LocaleTextCalendar([firstweekday[, locale]]) + + This subclass of :class:`TextCalendar` can be passed a locale name in the + constructor and will return month and weekday names in the specified locale. If + this locale includes an encoding, all strings containing month and weekday names + will be returned as unicode. + + .. versionadded:: 2.5 + + +.. class:: LocaleHTMLCalendar([firstweekday[, locale]]) + + This subclass of :class:`HTMLCalendar` can be passed a locale name in the + constructor and will return month and weekday names in the specified locale. If + this locale includes an encoding, all strings containing month and weekday names + will be returned as unicode. + + .. versionadded:: 2.5 + +For simple text calendars this module provides the following functions. + + +.. 
function:: setfirstweekday(weekday) + + Sets the weekday (``0`` is Monday, ``6`` is Sunday) to start each week. The + values :const:`MONDAY`, :const:`TUESDAY`, :const:`WEDNESDAY`, :const:`THURSDAY`, + :const:`FRIDAY`, :const:`SATURDAY`, and :const:`SUNDAY` are provided for + convenience. For example, to set the first weekday to Sunday:: + + import calendar + calendar.setfirstweekday(calendar.SUNDAY) + + .. versionadded:: 2.0 + + +.. function:: firstweekday() + + Returns the current setting for the weekday to start each week. + + .. versionadded:: 2.0 + + +.. function:: isleap(year) + + Returns :const:`True` if *year* is a leap year, otherwise :const:`False`. + + +.. function:: leapdays(y1, y2) + + Returns the number of leap years in the range from *y1* to *y2* (exclusive), + where *y1* and *y2* are years. + + .. versionchanged:: 2.0 + This function didn't work for ranges spanning a century change in Python + 1.5.2. + + +.. function:: weekday(year, month, day) + + Returns the day of the week (``0`` is Monday) for *year* (``1970``--...), + *month* (``1``--``12``), *day* (``1``--``31``). + + +.. function:: weekheader(n) + + Return a header containing abbreviated weekday names. *n* specifies the width in + characters for one weekday. + + +.. function:: monthrange(year, month) + + Returns the weekday of the first day of the month and the number of days in the + month, for the specified *year* and *month*. + + +.. function:: monthcalendar(year, month) + + Returns a matrix representing a month's calendar. Each row represents a week; + days outside of the month are represented by zeros. Each week begins with Monday + unless set by :func:`setfirstweekday`. + + +.. function:: prmonth(theyear, themonth[, w[, l]]) + + Prints a month's calendar as returned by :func:`month`. + + +.. function:: month(theyear, themonth[, w[, l]]) + + Returns a month's calendar in a multi-line string using the :meth:`formatmonth` + of the :class:`TextCalendar` class. + + .. versionadded:: 2.0 + + +.. 
function:: prcal(year[, w[, l[, c]]]) + + Prints the calendar for an entire year as returned by :func:`calendar`. + + +.. function:: calendar(year[, w[, l[, c]]]) + + Returns a 3-column calendar for an entire year as a multi-line string using the + :meth:`formatyear` of the :class:`TextCalendar` class. + + .. versionadded:: 2.0 + + +.. function:: timegm(tuple) + + An unrelated but handy function that takes a time tuple such as returned by the + :func:`gmtime` function in the :mod:`time` module, and returns the corresponding + Unix timestamp value, assuming an epoch of 1970, and the POSIX encoding. In + fact, :func:`time.gmtime` and :func:`timegm` are each other's inverse. + + .. versionadded:: 2.0 + +The :mod:`calendar` module exports the following data attributes: + + +.. data:: day_name + + An array that represents the days of the week in the current locale. + + +.. data:: day_abbr + + An array that represents the abbreviated days of the week in the current locale. + + +.. data:: month_name + + An array that represents the months of the year in the current locale. This + follows normal convention of January being month number 1, so it has a length of + 13 and ``month_name[0]`` is the empty string. + + +.. data:: month_abbr + + An array that represents the abbreviated months of the year in the current + locale. This follows normal convention of January being month number 1, so it + has a length of 13 and ``month_abbr[0]`` is the empty string. + + +.. seealso:: + + Module :mod:`datetime` + Object-oriented interface to dates and times with similar functionality to the + :mod:`time` module. + + Module :mod:`time` + Low-level time related functions. + diff --git a/Doc/library/carbon.rst b/Doc/library/carbon.rst new file mode 100644 index 0000000..ecaf3bb --- /dev/null +++ b/Doc/library/carbon.rst @@ -0,0 +1,288 @@ + +.. 
_toolbox: + +********************* +MacOS Toolbox Modules +********************* + +There are a set of modules that provide interfaces to various MacOS toolboxes. +If applicable the module will define a number of Python objects for the various +structures declared by the toolbox, and operations will be implemented as +methods of the object. Other operations will be implemented as functions in the +module. Not all operations possible in C will also be possible in Python +(callbacks are often a problem), and parameters will occasionally be different +in Python (input and output buffers, especially). All methods and functions +have a :attr:`__doc__` string describing their arguments and return values, and +for additional description you are referred to `Inside Macintosh +<http://developer.apple.com/documentation/macos8/mac8.html>`_ or similar works. + +These modules all live in a package called :mod:`Carbon`. Despite that name they +are not all part of the Carbon framework: CF is really in the CoreFoundation +framework and Qt is in the QuickTime framework. The normal use pattern is :: + + from Carbon import AE + +**Warning!** These modules are not yet documented. If you wish to contribute +documentation of any of these modules, please get in touch with docs@python.org. + + +:mod:`Carbon.AE` --- Apple Events +================================= + +.. module:: Carbon.AE + :platform: Mac + :synopsis: Interface to the Apple Events toolbox. + + + +:mod:`Carbon.AH` --- Apple Help +=============================== + +.. module:: Carbon.AH + :platform: Mac + :synopsis: Interface to the Apple Help manager. + + + +:mod:`Carbon.App` --- Appearance Manager +======================================== + +.. module:: Carbon.App + :platform: Mac + :synopsis: Interface to the Appearance Manager. + + + +:mod:`Carbon.CF` --- Core Foundation +==================================== + +.. module:: Carbon.CF + :platform: Mac + :synopsis: Interface to the Core Foundation. 
+ + +The ``CFBase``, ``CFArray``, ``CFData``, ``CFDictionary``, ``CFString`` and +``CFURL`` objects are supported, some only partially. + + +:mod:`Carbon.CG` --- Core Graphics +================================== + +.. module:: Carbon.CG + :platform: Mac + :synopsis: Interface to Core Graphics. + + + +:mod:`Carbon.CarbonEvt` --- Carbon Event Manager +================================================ + +.. module:: Carbon.CarbonEvt + :platform: Mac + :synopsis: Interface to the Carbon Event Manager. + + + +:mod:`Carbon.Cm` --- Component Manager +====================================== + +.. module:: Carbon.Cm + :platform: Mac + :synopsis: Interface to the Component Manager. + + + +:mod:`Carbon.Ctl` --- Control Manager +===================================== + +.. module:: Carbon.Ctl + :platform: Mac + :synopsis: Interface to the Control Manager. + + + +:mod:`Carbon.Dlg` --- Dialog Manager +==================================== + +.. module:: Carbon.Dlg + :platform: Mac + :synopsis: Interface to the Dialog Manager. + + + +:mod:`Carbon.Evt` --- Event Manager +=================================== + +.. module:: Carbon.Evt + :platform: Mac + :synopsis: Interface to the classic Event Manager. + + + +:mod:`Carbon.Fm` --- Font Manager +================================= + +.. module:: Carbon.Fm + :platform: Mac + :synopsis: Interface to the Font Manager. + + + +:mod:`Carbon.Folder` --- Folder Manager +======================================= + +.. module:: Carbon.Folder + :platform: Mac + :synopsis: Interface to the Folder Manager. + + + +:mod:`Carbon.Help` --- Help Manager +=================================== + +.. module:: Carbon.Help + :platform: Mac + :synopsis: Interface to the Carbon Help Manager. + + + +:mod:`Carbon.List` --- List Manager +=================================== + +.. module:: Carbon.List + :platform: Mac + :synopsis: Interface to the List Manager. + + + +:mod:`Carbon.Menu` --- Menu Manager +=================================== + +.. 
module:: Carbon.Menu + :platform: Mac + :synopsis: Interface to the Menu Manager. + + + +:mod:`Carbon.Mlte` --- MultiLingual Text Editor +=============================================== + +.. module:: Carbon.Mlte + :platform: Mac + :synopsis: Interface to the MultiLingual Text Editor. + + + +:mod:`Carbon.Qd` --- QuickDraw +============================== + +.. module:: Carbon.Qd + :platform: Mac + :synopsis: Interface to the QuickDraw toolbox. + + + +:mod:`Carbon.Qdoffs` --- QuickDraw Offscreen +============================================ + +.. module:: Carbon.Qdoffs + :platform: Mac + :synopsis: Interface to the QuickDraw Offscreen APIs. + + + +:mod:`Carbon.Qt` --- QuickTime +============================== + +.. module:: Carbon.Qt + :platform: Mac + :synopsis: Interface to the QuickTime toolbox. + + + +:mod:`Carbon.Res` --- Resource Manager and Handles +================================================== + +.. module:: Carbon.Res + :platform: Mac + :synopsis: Interface to the Resource Manager and Handles. + + + +:mod:`Carbon.Scrap` --- Scrap Manager +===================================== + +.. module:: Carbon.Scrap + :platform: Mac + :synopsis: The Scrap Manager provides basic services for implementing cut & paste and + clipboard operations. + + +This module is only fully available on MacOS9 and earlier under classic PPC +MacPython. Very limited functionality is available under Carbon MacPython. + +.. index:: single: Scrap Manager + +The Scrap Manager supports the simplest form of cut & paste operations on the +Macintosh. It can be used for both inter- and intra-application clipboard +operations. + +The :mod:`Scrap` module provides low-level access to the functions of the Scrap +Manager. It contains the following functions: + + +.. function:: InfoScrap() + + Return current information about the scrap. The information is encoded as a + tuple containing the fields ``(size, handle, count, state, path)``.
+ + +----------+---------------------------------------------+ + | Field | Meaning | + +==========+=============================================+ + | *size* | Size of the scrap in bytes. | + +----------+---------------------------------------------+ + | *handle* | Resource object representing the scrap. | + +----------+---------------------------------------------+ + | *count* | Serial number of the scrap contents. | + +----------+---------------------------------------------+ + | *state* | Integer; positive if in memory, ``0`` if on | + | | disk, negative if uninitialized. | + +----------+---------------------------------------------+ + | *path* | Filename of the scrap when stored on disk. | + +----------+---------------------------------------------+ + + +.. seealso:: + + `Scrap Manager <http://developer.apple.com/documentation/mac/MoreToolbox/MoreToolbox-109.html>`_ + Apple's documentation for the Scrap Manager gives a lot of useful information + about using the Scrap Manager in applications. + + + +:mod:`Carbon.Snd` --- Sound Manager +=================================== + +.. module:: Carbon.Snd + :platform: Mac + :synopsis: Interface to the Sound Manager. + + + +:mod:`Carbon.TE` --- TextEdit +============================= + +.. module:: Carbon.TE + :platform: Mac + :synopsis: Interface to TextEdit. + + + +:mod:`Carbon.Win` --- Window Manager +==================================== + +.. module:: Carbon.Win + :platform: Mac + :synopsis: Interface to the Window Manager. + + diff --git a/Doc/library/cgi.rst b/Doc/library/cgi.rst new file mode 100644 index 0000000..29ed545 --- /dev/null +++ b/Doc/library/cgi.rst @@ -0,0 +1,558 @@ + +:mod:`cgi` --- Common Gateway Interface support. +================================================ + +.. module:: cgi + :synopsis: Helpers for running Python scripts via the Common Gateway Interface. + + +.. 
index:: + pair: WWW; server + pair: CGI; protocol + pair: HTTP; protocol + pair: MIME; headers + single: URL + single: Common Gateway Interface + +Support module for Common Gateway Interface (CGI) scripts. + +This module defines a number of utilities for use by CGI scripts written in +Python. + + +Introduction +------------ + +.. _cgi-intro: + +A CGI script is invoked by an HTTP server, usually to process user input +submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element. + +Most often, CGI scripts live in the server's special :file:`cgi-bin` directory. +The HTTP server places all sorts of information about the request (such as the +client's hostname, the requested URL, the query string, and lots of other +goodies) in the script's shell environment, executes the script, and sends the +script's output back to the client. + +The script's input is connected to the client too, and sometimes the form data +is read this way; at other times the form data is passed via the "query string" +part of the URL. This module is intended to take care of the different cases +and provide a simpler interface to the Python script. It also provides a number +of utilities that help in debugging scripts, and the latest addition is support +for file uploads from a form (if your browser supports it). + +The output of a CGI script should consist of two sections, separated by a blank +line. The first section contains a number of headers, telling the client what +kind of data is following. Python code to generate a minimal header section +looks like this:: + + print "Content-Type: text/html" # HTML is following + print # blank line, end of headers + +The second section is usually HTML, which allows the client software to display +nicely formatted text with header, in-line images, etc. Here's Python code that +prints a simple piece of HTML:: + + print "<TITLE>CGI script output</TITLE>" + print "<H1>This is my first CGI script</H1>" + print "Hello, world!" + + +.. 
_using-the-cgi-module: + +Using the cgi module +-------------------- + +Begin by writing ``import cgi``. Do not use ``from cgi import *`` --- the +module defines all sorts of names for its own use or for backward compatibility +that you don't want in your namespace. + +When you write a new script, consider adding the line:: + + import cgitb; cgitb.enable() + +This activates a special exception handler that will display detailed reports in +the Web browser if any errors occur. If you'd rather not show the guts of your +program to users of your script, you can have the reports saved to files +instead, with a line like this:: + + import cgitb; cgitb.enable(display=0, logdir="/tmp") + +It's very helpful to use this feature during script development. The reports +produced by :mod:`cgitb` provide information that can save you a lot of time in +tracking down bugs. You can always remove the ``cgitb`` line later when you +have tested your script and are confident that it works correctly. + +To get at submitted form data, it's best to use the :class:`FieldStorage` class. +The other classes defined in this module are provided mostly for backward +compatibility. Instantiate it exactly once, without arguments. This reads the +form contents from standard input or the environment (depending on the value of +various environment variables set according to the CGI standard). Since it may +consume standard input, it should be instantiated only once. + +The :class:`FieldStorage` instance can be indexed like a Python dictionary, and +also supports the standard dictionary methods :meth:`has_key` and :meth:`keys`. +The built-in :func:`len` is also supported. Form fields containing empty +strings are ignored and do not appear in the dictionary; to keep such values, +provide a true value for the optional *keep_blank_values* keyword parameter when +creating the :class:`FieldStorage` instance. 
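The effect of *keep_blank_values* can be sketched without a running web server. The snippet below is an illustration only and makes one assumption loudly: it uses ``urllib.parse.parse_qs``, the Python 3 home of the query-string parser that this chapter documents as :func:`parse_qs`, rather than :class:`FieldStorage` itself. The query string and variable names are made up for the example.

```python
from urllib.parse import parse_qs  # assumption: Python 3 location of parse_qs

# A query string in which the "addr" field was submitted empty.
query = "name=Joe&addr="

# By default, fields with empty values are dropped from the result.
default_fields = parse_qs(query)

# With a true keep_blank_values, the empty field is kept as an empty string.
kept_fields = parse_qs(query, keep_blank_values=True)

print(default_fields)
print(kept_fields)
```

The same flag name is accepted by :class:`FieldStorage`, so a script that must distinguish "field left empty" from "field not sent" would pass it when creating the instance.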
+ +For instance, the following code (which assumes that the +:mailheader:`Content-Type` header and blank line have already been printed) +checks that the fields ``name`` and ``addr`` are both set to a non-empty +string:: + + form = cgi.FieldStorage() + if not (form.has_key("name") and form.has_key("addr")): + print "<H1>Error</H1>" + print "Please fill in the name and addr fields." + return + print "<p>name:", form["name"].value + print "<p>addr:", form["addr"].value + ...further form processing here... + +Here the fields, accessed through ``form[key]``, are themselves instances of +:class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form +encoding). The :attr:`value` attribute of the instance yields the string value +of the field. The :meth:`getvalue` method returns this string value directly; +it also accepts an optional second argument as a default to return if the +requested key is not present. + +If the submitted form data contains more than one field with the same name, the +object retrieved by ``form[key]`` is not a :class:`FieldStorage` or +:class:`MiniFieldStorage` instance but a list of such instances. Similarly, in +this situation, ``form.getvalue(key)`` would return a list of strings. If you +expect this possibility (when your HTML form contains multiple fields with the +same name), use the :func:`getlist` function, which always returns a list of +values (so that you do not need to special-case the single item case). For +example, this code concatenates any number of username fields, separated by +commas:: + + value = form.getlist("username") + usernames = ",".join(value) + +If a field represents an uploaded file, accessing the value via the +:attr:`value` attribute or the :func:`getvalue` method reads the entire file in +memory as a string. This may not be what you want. You can test for an uploaded +file by testing either the :attr:`filename` attribute or the :attr:`file` +attribute. 
You can then read the data at leisure from the :attr:`file` +attribute:: + + fileitem = form["userfile"] + if fileitem.file: + # It's an uploaded file; count lines + linecount = 0 + while 1: + line = fileitem.file.readline() + if not line: break + linecount = linecount + 1 + +The file upload draft standard entertains the possibility of uploading multiple +files from one field (using a recursive :mimetype:`multipart/\*` encoding). +When this occurs, the item will be a dictionary-like :class:`FieldStorage` item. +This can be determined by testing its :attr:`type` attribute, which should be +:mimetype:`multipart/form-data` (or perhaps another MIME type matching +:mimetype:`multipart/\*`). In this case, it can be iterated over recursively +just like the top-level form object. + +When a form is submitted in the "old" format (as the query string or as a single +data part of type :mimetype:`application/x-www-form-urlencoded`), the items will +actually be instances of the class :class:`MiniFieldStorage`. In this case, the +:attr:`list`, :attr:`file`, and :attr:`filename` attributes are always ``None``. + + +Higher Level Interface +---------------------- + +.. versionadded:: 2.2 + +The previous section explains how to read CGI form data using the +:class:`FieldStorage` class. This section describes a higher level interface +which was added to this class to allow one to do it in a more readable and +intuitive way. The interface doesn't make the techniques described in previous +sections obsolete --- they are still useful to process file uploads efficiently, +for example. + +.. % XXX: Is this true ? + +The interface consists of two simple methods. Using the methods you can process +form data in a generic way, without the need to worry whether only one or more +values were posted under one name. 
+ +In the previous section, you learned to write the following code anytime you +expected a user to post more than one value under one name:: + + item = form.getvalue("item") + if isinstance(item, list): + # The user is requesting more than one item. + else: + # The user is requesting only one item. + +This situation is common for example when a form contains a group of multiple +checkboxes with the same name:: + + <input type="checkbox" name="item" value="1" /> + <input type="checkbox" name="item" value="2" /> + +In most situations, however, there's only one form control with a particular +name in a form and then you expect and need only one value associated with this +name. So you write a script containing for example this code:: + + user = form.getvalue("user").upper() + +The problem with the code is that you should never expect that a client will +provide valid input to your scripts. For example, if a curious user appends +another ``user=foo`` pair to the query string, then the script would crash, +because in this situation the ``getvalue("user")`` method call returns a list +instead of a string. Calling the :meth:`upper` method on a list is not valid +(since lists do not have a method of this name) and results in an +:exc:`AttributeError` exception. + +Therefore, the appropriate way to read form data values was to always use the +code which checks whether the obtained value is a single value or a list of +values. That's annoying and leads to less readable scripts. + +A more convenient approach is to use the methods :meth:`getfirst` and +:meth:`getlist` provided by this higher level interface. + + +.. method:: FieldStorage.getfirst(name[, default]) + + This method always returns only one value associated with form field *name*. + The method returns only the first value if more than one value was posted + under that name. Please note that the order in which the values are received + may vary from browser to browser and should not be counted on.
[#]_ If no such + form field or value exists then the method returns the value specified by the + optional parameter *default*. This parameter defaults to ``None`` if not + specified. + + +.. method:: FieldStorage.getlist(name) + + This method always returns a list of values associated with form field *name*. + The method returns an empty list if no such form field or value exists for + *name*. It returns a list consisting of one item if only one such value exists. + +Using these methods you can write nice compact code:: + + import cgi + form = cgi.FieldStorage() + user = form.getfirst("user", "").upper() # This way it's safe. + for item in form.getlist("item"): + do_something(item) + + +Old classes +----------- + +These classes, present in earlier versions of the :mod:`cgi` module, are still +supported for backward compatibility. New applications should use the +:class:`FieldStorage` class. + +:class:`SvFormContentDict` stores single value form content as dictionary; it +assumes each field name occurs in the form only once. + +:class:`FormContentDict` stores multiple value form content as a dictionary (the +form items are lists of values). Useful if your form contains multiple fields +with the same name. + +Other classes (:class:`FormContent`, :class:`InterpFormContentDict`) are present +for backwards compatibility with really old applications only. If you still use +these and would be inconvenienced when they disappeared from a next version of +this module, drop me a note. + + +.. _functions-in-cgi-module: + +Functions +--------- + +These are useful if you want more control, or if you want to employ some of the +algorithms implemented in this module in other circumstances. + + +.. function:: parse(fp[, keep_blank_values[, strict_parsing]]) + + Parse a query in the environment or from a file (the file defaults to + ``sys.stdin``). The *keep_blank_values* and *strict_parsing* parameters are + passed to :func:`parse_qs` unchanged. + + +.. 
function:: parse_qs(qs[, keep_blank_values[, strict_parsing]]) + + Parse a query string given as a string argument (data of type + :mimetype:`application/x-www-form-urlencoded`). Data are returned as a + dictionary. The dictionary keys are the unique query variable names and the + values are lists of values for each name. + + The optional argument *keep_blank_values* is a flag indicating whether blank + values in URL encoded queries should be treated as blank strings. A true value + indicates that blanks should be retained as blank strings. The default false + value indicates that blank values are to be ignored and treated as if they were + not included. + + The optional argument *strict_parsing* is a flag indicating what to do with + parsing errors. If false (the default), errors are silently ignored. If true, + errors raise a :exc:`ValueError` exception. + + Use the :func:`urllib.urlencode` function to convert such dictionaries into + query strings. + + +.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]]) + + Parse a query string given as a string argument (data of type + :mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of + name, value pairs. + + The optional argument *keep_blank_values* is a flag indicating whether blank + values in URL encoded queries should be treated as blank strings. A true value + indicates that blanks should be retained as blank strings. The default false + value indicates that blank values are to be ignored and treated as if they were + not included. + + The optional argument *strict_parsing* is a flag indicating what to do with + parsing errors. If false (the default), errors are silently ignored. If true, + errors raise a :exc:`ValueError` exception. + + Use the :func:`urllib.urlencode` function to convert such lists of pairs into + query strings. + + +.. function:: parse_multipart(fp, pdict) + + Parse input of type :mimetype:`multipart/form-data` (for file uploads). 
+ Arguments are *fp* for the input file and *pdict* for a dictionary containing + other parameters in the :mailheader:`Content-Type` header. + + Returns a dictionary just like :func:`parse_qs`: keys are the field names, each + value is a list of values for that field. This is easy to use but not much good + if you are expecting megabytes to be uploaded --- in that case, use the + :class:`FieldStorage` class instead, which is much more flexible. + + Note that this does not parse nested multipart parts --- use + :class:`FieldStorage` for that. + + +.. function:: parse_header(string) + + Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a + dictionary of parameters. + + +.. function:: test() + + Robust test CGI script, usable as main program. Writes minimal HTTP headers and + formats all information provided to the script in HTML form. + + +.. function:: print_environ() + + Format the shell environment in HTML. + + +.. function:: print_form(form) + + Format a form in HTML. + + +.. function:: print_directory() + + Format the current directory in HTML. + + +.. function:: print_environ_usage() + + Print a list of useful (used by CGI) environment variables in HTML. + + +.. function:: escape(s[, quote]) + + Convert the characters ``'&'``, ``'<'`` and ``'>'`` in string *s* to HTML-safe + sequences. Use this if you need to display text that might contain such + characters in HTML. If the optional flag *quote* is true, the quotation mark + character (``'"'``) is also translated; this helps for inclusion in an HTML + attribute value, as in ``<A HREF="...">``. If the value to be quoted might + include single- or double-quote characters, or both, consider using the + :func:`quoteattr` function in the :mod:`xml.sax.saxutils` module instead. + + +.. _cgi-security: + +Caring about security +--------------------- + +.. 
index:: pair: CGI; security + +There's one important rule: if you invoke an external program (via the +:func:`os.system` or :func:`os.popen` functions, or others with similar +functionality), make very sure you don't pass arbitrary strings received from +the client to the shell. This is a well-known security hole whereby clever +hackers anywhere on the Web can exploit a gullible CGI script to invoke +arbitrary shell commands. Even parts of the URL or field names cannot be +trusted, since the request doesn't have to come from your form! + +To be on the safe side, if you must pass a string obtained from a form to a shell +command, you should make sure the string contains only alphanumeric characters, +dashes, underscores, and periods. + + +Installing your CGI script on a Unix system +------------------------------------------- + +Read the documentation for your HTTP server and check with your local system +administrator to find the directory where CGI scripts should be installed; +usually this is in a directory :file:`cgi-bin` in the server tree. + +Make sure that your script is readable and executable by "others"; the Unix file +mode should be ``0755`` octal (use ``chmod 0755 filename``). Make sure that the +first line of the script contains ``#!`` starting in column 1 followed by the +pathname of the Python interpreter, for instance:: + + #!/usr/local/bin/python + +Make sure the Python interpreter exists and is executable by "others". + +Make sure that any files your script needs to read or write are readable or +writable, respectively, by "others" --- their mode should be ``0644`` for +readable and ``0666`` for writable. This is because, for security reasons, the +HTTP server executes your script as user "nobody", without any special +privileges. It can only read (write, execute) files that everybody can read +(write, execute).
The current directory at execution time is also different (it +is usually the server's cgi-bin directory) and the set of environment variables +is also different from what you get when you log in. In particular, don't count +on the shell's search path for executables (:envvar:`PATH`) or the Python module +search path (:envvar:`PYTHONPATH`) to be set to anything interesting. + +If you need to load modules from a directory which is not on Python's default +module search path, you can change the path in your script, before importing +other modules. For example:: + + import sys + sys.path.insert(0, "/usr/home/joe/lib/python") + sys.path.insert(0, "/usr/local/lib/python") + +(This way, the directory inserted last will be searched first!) + +Instructions for non-Unix systems will vary; check your HTTP server's +documentation (it will usually have a section on CGI scripts). + + +Testing your CGI script +----------------------- + +Unfortunately, a CGI script will generally not run when you try it from the +command line, and a script that works perfectly from the command line may fail +mysteriously when run from the server. There's one reason why you should still +test your script from the command line: if it contains a syntax error, the +Python interpreter won't execute it at all, and the HTTP server will most likely +send a cryptic error to the client. + +Assuming your script has no syntax errors, yet it does not work, you have no +choice but to read the next section. + + +Debugging CGI scripts +--------------------- + +.. index:: pair: CGI; debugging + +First of all, check for trivial installation errors --- reading the section +above on installing your CGI script carefully can save you a lot of time. If +you wonder whether you have understood the installation procedure correctly, try +installing a copy of this module file (:file:`cgi.py`) as a CGI script. When +invoked as a script, the file will dump its environment and the contents of the +form in HTML form. 
Give it the right mode, etc., and send it a request. If it's +installed in the standard :file:`cgi-bin` directory, it should be possible to +send it a request by entering a URL into your browser of the form:: + + http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home + +If this gives an error of type 404, the server cannot find the script -- perhaps +you need to install it in a different directory. If it gives another error, +there's an installation problem that you should fix before trying to go any +further. If you get a nicely formatted listing of the environment and form +content (in this example, the fields should be listed as "addr" with value "At +Home" and "name" with value "Joe Blow"), the :file:`cgi.py` script has been +installed correctly. If you follow the same procedure for your own script, you +should now be able to debug it. + +The next step could be to call the :mod:`cgi` module's :func:`test` function +from your script: replace its main code with the single statement :: + + cgi.test() + +This should produce the same results as those obtained from installing the +:file:`cgi.py` file itself. + +When an ordinary Python script raises an unhandled exception (for whatever +reason: a typo in a module name, a file that can't be opened, etc.), the +Python interpreter prints a nice traceback and exits. While the Python +interpreter will still do this when your CGI script raises an exception, most +likely the traceback will end up in one of the HTTP server's log files, or be +discarded altogether. + +Fortunately, once you have managed to get your script to execute *some* code, +you can easily send tracebacks to the Web browser using the :mod:`cgitb` module. +If you haven't done so already, just add the line:: + + import cgitb; cgitb.enable() + +to the top of your script. Then try running it again; when a problem occurs, +you should see a detailed report that will likely make apparent the cause of the +crash.
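To illustrate the idea of routing a traceback to the browser, here is a minimal sketch that uses only the standard ``traceback`` and ``html`` modules. It is not how :mod:`cgitb` is implemented; the names ``run_with_browser_traceback`` and ``broken_main`` are hypothetical, and the sketch assumes the script has not yet written any headers when the exception occurs.

```python
import html
import io
import sys
import traceback


def run_with_browser_traceback(main, out=sys.stdout):
    """Run main(); on an uncaught exception, write the traceback to out as HTML."""
    try:
        main()
    except Exception:
        # Escape the traceback so '<' and '&' in it are not treated as markup.
        report = html.escape(traceback.format_exc())
        out.write("Content-Type: text/html\n\n")
        out.write("<pre>" + report + "</pre>\n")


def broken_main():
    # Hypothetical script body that fails.
    raise ValueError("bad <input>")


# Capture the output in memory instead of sending it to a real client.
buffer = io.StringIO()
run_with_browser_traceback(broken_main, out=buffer)
page = buffer.getvalue()
```

In a real script, ``out`` would stay at its default so the report goes to the client. The report produced by :mod:`cgitb` is considerably more detailed than this plain traceback.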
+ +If you suspect that there may be a problem in importing the :mod:`cgitb` module, +you can use an even more robust approach (which only uses built-in modules):: + + import sys + sys.stderr = sys.stdout + print "Content-Type: text/plain" + print + ...your code here... + +This relies on the Python interpreter to print the traceback. The content type +of the output is set to plain text, which disables all HTML processing. If your +script works, the raw HTML will be displayed by your client. If it raises an +exception, most likely after the first two lines have been printed, a traceback +will be displayed. Because no HTML interpretation is going on, the traceback +will be readable. + + +Common problems and solutions +----------------------------- + +* Most HTTP servers buffer the output from CGI scripts until the script is + completed. This means that it is not possible to display a progress report on + the client's display while the script is running. + +* Check the installation instructions above. + +* Check the HTTP server's log files. (``tail -f logfile`` in a separate window + may be useful!) + +* Always check a script for syntax errors first, by doing something like + ``python script.py``. + +* If your script does not have any syntax errors, try adding ``import cgitb; + cgitb.enable()`` to the top of the script. + +* When invoking external programs, make sure they can be found. Usually, this + means using absolute path names --- :envvar:`PATH` is usually not set to a very + useful value in a CGI script. + +* When reading or writing external files, make sure they can be read or written + by the userid under which your CGI script will be running: this is typically the + userid under which the web server is running, or some explicitly specified + userid for a web server's ``suexec`` feature. + +* Don't try to give a CGI script a set-uid mode. This doesn't work on most + systems, and is a security liability as well. + +.. rubric:: Footnotes + +.. 
[#] Note that some recent versions of the HTML specification do state what order the + field values should be supplied in, but knowing whether a request was + received from a conforming browser, or even from a browser at all, is tedious + and error-prone. + diff --git a/Doc/library/cgihttpserver.rst b/Doc/library/cgihttpserver.rst new file mode 100644 index 0000000..4f27627 --- /dev/null +++ b/Doc/library/cgihttpserver.rst @@ -0,0 +1,73 @@ + +:mod:`CGIHTTPServer` --- CGI-capable HTTP request handler +========================================================= + +.. module:: CGIHTTPServer + :synopsis: This module provides a request handler for HTTP servers which can run CGI + scripts. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`CGIHTTPServer` module defines a request-handler class that is +interface-compatible with :class:`BaseHTTPServer.BaseHTTPRequestHandler` and inherits +behavior from :class:`SimpleHTTPServer.SimpleHTTPRequestHandler`, but can also +run CGI scripts. + +.. note:: + + This module can run CGI scripts on Unix and Windows systems; on Mac OS it will + only be able to run Python scripts within the same process as itself. + +.. note:: + + CGI scripts run by the :class:`CGIHTTPRequestHandler` class cannot execute + redirects (HTTP code 302), because code 200 (script output follows) is sent + prior to execution of the CGI script. This pre-empts the status code. + +The :mod:`CGIHTTPServer` module defines the following class: + + +.. class:: CGIHTTPRequestHandler(request, client_address, server) + + This class is used to serve either files or output of CGI scripts from the + current directory and below. Note that mapping HTTP hierarchic structure to + local directory structure is exactly as in + :class:`SimpleHTTPServer.SimpleHTTPRequestHandler`. + + The class will, however, run the CGI script instead of serving it as a file if + it guesses it to be a CGI script.
Only directory-based CGI are used --- the + other common server configuration is to treat special extensions as denoting CGI + scripts. + + The :func:`do_GET` and :func:`do_HEAD` functions are modified to run CGI scripts + and serve the output, instead of serving files, if the request leads to + somewhere below the ``cgi_directories`` path. + +The :class:`CGIHTTPRequestHandler` defines the following data member: + + +.. attribute:: CGIHTTPRequestHandler.cgi_directories + + This defaults to ``['/cgi-bin', '/htbin']`` and describes directories to treat + as containing CGI scripts. + +The :class:`CGIHTTPRequestHandler` defines the following methods: + + +.. method:: CGIHTTPRequestHandler.do_POST() + + This method serves the ``'POST'`` request type, only allowed for CGI scripts. + Error 501, "Can only POST to CGI scripts", is output when trying to POST to a + non-CGI url. + +Note that CGI scripts will be run with UID of user nobody, for security reasons. +Problems with the CGI script will be translated to error 403. + +For example usage, see the implementation of the :func:`test` function. + + +.. seealso:: + + Module :mod:`BaseHTTPServer` + Base class implementation for Web server and request handler. + diff --git a/Doc/library/cgitb.rst b/Doc/library/cgitb.rst new file mode 100644 index 0000000..327cd17 --- /dev/null +++ b/Doc/library/cgitb.rst @@ -0,0 +1,64 @@ + +:mod:`cgitb` --- Traceback manager for CGI scripts +================================================== + +.. module:: cgitb + :synopsis: Configurable traceback handler for CGI scripts. +.. moduleauthor:: Ka-Ping Yee <ping@lfw.org> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. versionadded:: 2.2 + +.. index:: + single: CGI; exceptions + single: CGI; tracebacks + single: exceptions; in CGI scripts + single: tracebacks; in CGI scripts + +The :mod:`cgitb` module provides a special exception handler for Python scripts. +(Its name is a bit misleading. 
It was originally designed to display extensive +traceback information in HTML for CGI scripts. It was later generalized to also +display this information in plain text.) After this module is activated, if an +uncaught exception occurs, a detailed, formatted report will be displayed. The +report includes a traceback showing excerpts of the source code for each level, +as well as the values of the arguments and local variables to currently running +functions, to help you debug the problem. Optionally, you can save this +information to a file instead of sending it to the browser. + +To enable this feature, simply add one line to the top of your CGI script:: + + import cgitb; cgitb.enable() + +The options to the :func:`enable` function control whether the report is +displayed in the browser and whether the report is logged to a file for later +analysis. + + +.. function:: enable([display[, logdir[, context[, format]]]]) + + .. index:: single: excepthook() (in module sys) + + This function causes the :mod:`cgitb` module to take over the interpreter's + default handling for exceptions by setting the value of :attr:`sys.excepthook`. + + The optional argument *display* defaults to ``1`` and can be set to ``0`` to + suppress sending the traceback to the browser. If the argument *logdir* is + present, the traceback reports are written to files. The value of *logdir* + should be a directory where these files will be placed. The optional argument + *context* is the number of lines of context to display around the current line + of source code in the traceback; this defaults to ``5``. If the optional + argument *format* is ``"html"``, the output is formatted as HTML. Any other + value forces plain text output. The default value is ``"html"``. + + +.. function:: handler([info]) + + This function handles an exception using the default settings (that is, show a + report in the browser, but don't log to a file). 
This can be used when you've + caught an exception and want to report it using :mod:`cgitb`. The optional + *info* argument should be a 3-tuple containing an exception type, exception + value, and traceback object, exactly like the tuple returned by + :func:`sys.exc_info`. If the *info* argument is not supplied, the current + exception is obtained from :func:`sys.exc_info`. + diff --git a/Doc/library/chunk.rst b/Doc/library/chunk.rst new file mode 100644 index 0000000..2e1798d --- /dev/null +++ b/Doc/library/chunk.rst @@ -0,0 +1,130 @@ + +:mod:`chunk` --- Read IFF chunked data +====================================== + +.. module:: chunk + :synopsis: Module to read IFF chunks. +.. moduleauthor:: Sjoerd Mullender <sjoerd@acm.org> +.. sectionauthor:: Sjoerd Mullender <sjoerd@acm.org> + + +.. index:: + single: Audio Interchange File Format + single: AIFF + single: AIFF-C + single: Real Media File Format + single: RMFF + +This module provides an interface for reading files that use EA IFF 85 chunks. +[#]_ This format is used in at least the Audio Interchange File Format +(AIFF/AIFF-C) and the Real Media File Format (RMFF). The WAVE audio file format +is closely related and can also be read using this module. + +A chunk has the following structure: + ++---------+--------+-------------------------------+ +| Offset | Length | Contents | ++=========+========+===============================+ +| 0 | 4 | Chunk ID | ++---------+--------+-------------------------------+ +| 4 | 4 | Size of chunk in big-endian | +| | | byte order, not including the | +| | | header | ++---------+--------+-------------------------------+ +| 8 | *n* | Data bytes, where *n* is the | +| | | size given in the preceding | +| | | field | ++---------+--------+-------------------------------+ +| 8 + *n* | 0 or 1 | Pad byte needed if *n* is odd | +| | | and chunk alignment is used | ++---------+--------+-------------------------------+ + +The ID is a 4-byte string which identifies the type of chunk. 
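The chunk layout in the table above can be read by hand with the ``struct`` module. The following is only a sketch of that layout, not the :class:`Chunk` class itself; the helper name ``read_chunk`` and the sample bytes are made up for illustration, and the sketch assumes a big-endian size field that excludes the 8-byte header.

```python
import io
import struct


def read_chunk(stream, align=True):
    """Read one chunk from stream; return (chunk_id, data), or None at end of file."""
    header = stream.read(8)
    if len(header) < 8:
        return None
    # 4-byte ID followed by a 32-bit big-endian size that excludes the header.
    chunk_id, size = struct.unpack(">4sI", header)
    data = stream.read(size)
    if align and size % 2:
        stream.read(1)  # skip the pad byte that follows odd-sized data
    return chunk_id, data


# Two chunks: 3 data bytes (plus one pad byte), then 4 data bytes.
raw = (b"NAME" + struct.pack(">I", 3) + b"abc" + b"\x00"
       + b"DATA" + struct.pack(">I", 4) + b"wxyz")
stream = io.BytesIO(raw)
first = read_chunk(stream)
second = read_chunk(stream)
third = read_chunk(stream)  # nothing left to read
```

Reading the pad byte after the first chunk is what keeps the second header aligned on a 2-byte boundary; with ``align=False`` the same input would be misread.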
+ +The size field (a 32-bit value, encoded using big-endian byte order) gives the +size of the chunk data, not including the 8-byte header. + +Usually an IFF-type file consists of one or more chunks. The proposed usage of +the :class:`Chunk` class defined here is to instantiate an instance at the start +of each chunk and read from the instance until it reaches the end, after which a +new instance can be instantiated. At the end of the file, creating a new +instance will fail with a :exc:`EOFError` exception. + + +.. class:: Chunk(file[, align, bigendian, inclheader]) + + Class which represents a chunk. The *file* argument is expected to be a + file-like object. An instance of this class is specifically allowed. The + only method that is needed is :meth:`read`. If the methods :meth:`seek` and + :meth:`tell` are present and don't raise an exception, they are also used. + If these methods are present and raise an exception, they are expected to not + have altered the object. If the optional argument *align* is true, chunks + are assumed to be aligned on 2-byte boundaries. If *align* is false, no + alignment is assumed. The default value is true. If the optional argument + *bigendian* is false, the chunk size is assumed to be in little-endian order. + This is needed for WAVE audio files. The default value is true. If the + optional argument *inclheader* is true, the size given in the chunk header + includes the size of the header. The default value is false. + +A :class:`Chunk` object supports the following methods: + + +.. method:: Chunk.getname() + + Returns the name (ID) of the chunk. This is the first 4 bytes of the chunk. + + +.. method:: Chunk.getsize() + + Returns the size of the chunk. + + +.. method:: Chunk.close() + + Close and skip to the end of the chunk. This does not close the underlying + file. + +The remaining methods will raise :exc:`IOError` if called after the +:meth:`close` method has been called. + + +.. 
method:: Chunk.isatty() + + Returns ``False``. + + +.. method:: Chunk.seek(pos[, whence]) + + Set the chunk's current position. The *whence* argument is optional and + defaults to ``0`` (absolute file positioning); other values are ``1`` (seek + relative to the current position) and ``2`` (seek relative to the file's end). + There is no return value. If the underlying file does not allow seek, only + forward seeks are allowed. + + +.. method:: Chunk.tell() + + Return the current position into the chunk. + + +.. method:: Chunk.read([size]) + + Read at most *size* bytes from the chunk (less if the read hits the end of the + chunk before obtaining *size* bytes). If the *size* argument is negative or + omitted, read all data until the end of the chunk. The bytes are returned as a + string object. An empty string is returned when the end of the chunk is + encountered immediately. + + +.. method:: Chunk.skip() + + Skip to the end of the chunk. All further calls to :meth:`read` for the chunk + will return ``''``. If you are not interested in the contents of the chunk, + this method should be called so that the file points to the start of the next + chunk. + +.. rubric:: Footnotes + +.. [#] "EA IFF 85" Standard for Interchange Format Files, Jerry Morrison, Electronic + Arts, January 1985. + diff --git a/Doc/library/cmath.rst b/Doc/library/cmath.rst new file mode 100644 index 0000000..2bc162c --- /dev/null +++ b/Doc/library/cmath.rst @@ -0,0 +1,156 @@ + +:mod:`cmath` --- Mathematical functions for complex numbers +=========================================================== + +.. module:: cmath + :synopsis: Mathematical functions for complex numbers. + + +This module is always available. It provides access to mathematical functions +for complex numbers. The functions in this module accept integers, +floating-point numbers or complex numbers as arguments. 
They will also accept +any Python object that has either a :meth:`__complex__` or a :meth:`__float__` +method: these methods are used to convert the object to a complex or +floating-point number, respectively, and the function is then applied to the +result of the conversion. + +The functions are: + + +.. function:: acos(x) + + Return the arc cosine of *x*. There are two branch cuts: One extends right from + 1 along the real axis to ∞, continuous from below. The other extends left from + -1 along the real axis to -∞, continuous from above. + + +.. function:: acosh(x) + + Return the hyperbolic arc cosine of *x*. There is one branch cut, extending left + from 1 along the real axis to -∞, continuous from above. + + +.. function:: asin(x) + + Return the arc sine of *x*. This has the same branch cuts as :func:`acos`. + + +.. function:: asinh(x) + + Return the hyperbolic arc sine of *x*. There are two branch cuts, extending + left from ``±1j`` to ``±∞j``, both continuous from above. These branch cuts + should be considered a bug to be corrected in a future release. The correct + branch cuts should extend along the imaginary axis, one from ``1j`` up to + ``∞j`` and continuous from the right, and one from ``-1j`` down to ``-∞j`` + and continuous from the left. + + +.. function:: atan(x) + + Return the arc tangent of *x*. There are two branch cuts: One extends from + ``1j`` along the imaginary axis to ``∞j``, continuous from the left. The + other extends from ``-1j`` along the imaginary axis to ``-∞j``, continuous + from the left. (This should probably be changed so the upper cut becomes + continuous from the other side.) + + +.. function:: atanh(x) + + Return the hyperbolic arc tangent of *x*. There are two branch cuts: One + extends from ``1`` along the real axis to ``∞``, continuous from above. The + other extends from ``-1`` along the real axis to ``-∞``, continuous from + above. 
(This should probably be changed so the right cut becomes continuous + from the other side.) + + +.. function:: cos(x) + + Return the cosine of *x*. + + +.. function:: cosh(x) + + Return the hyperbolic cosine of *x*. + + +.. function:: exp(x) + + Return the exponential value ``e**x``. + + +.. function:: log(x[, base]) + + Returns the logarithm of *x* to the given *base*. If the *base* is not + specified, returns the natural logarithm of *x*. There is one branch cut, from 0 + along the negative real axis to -∞, continuous from above. + + .. versionchanged:: 2.4 + *base* argument added. + + +.. function:: log10(x) + + Return the base-10 logarithm of *x*. This has the same branch cut as + :func:`log`. + + +.. function:: sin(x) + + Return the sine of *x*. + + +.. function:: sinh(x) + + Return the hyperbolic sine of *x*. + + +.. function:: sqrt(x) + + Return the square root of *x*. This has the same branch cut as :func:`log`. + + +.. function:: tan(x) + + Return the tangent of *x*. + + +.. function:: tanh(x) + + Return the hyperbolic tangent of *x*. + +The module also defines two mathematical constants: + + +.. data:: pi + + The mathematical constant *pi*, as a float. + + +.. data:: e + + The mathematical constant *e*, as a float. + +.. index:: module: math + +Note that the selection of functions is similar, but not identical, to that in +module :mod:`math`. The reason for having two modules is that some users aren't +interested in complex numbers, and perhaps don't even know what they are. They +would rather have ``math.sqrt(-1)`` raise an exception than return a complex +number. Also note that the functions defined in :mod:`cmath` always return a +complex number, even if the answer can be expressed as a real number (in which +case the complex number has an imaginary part of zero). + +A note on branch cuts: They are curves along which the given function fails to +be continuous. They are a necessary feature of many complex functions. 
It is +assumed that if you need to compute with complex functions, you will understand +about branch cuts. Consult almost any (not too elementary) book on complex +variables for enlightenment. For information on the proper choice of branch +cuts for numerical purposes, a good reference should be the following: + + +.. seealso:: + + Kahan, W: Branch cuts for complex elementary functions; or, Much ado about + nothing's sign bit. In Iserles, A., and Powell, M. (eds.), The state of the art + in numerical analysis. Clarendon Press (1987) pp165-211. + diff --git a/Doc/library/cmd.rst b/Doc/library/cmd.rst new file mode 100644 index 0000000..9af08e2 --- /dev/null +++ b/Doc/library/cmd.rst @@ -0,0 +1,202 @@ + +:mod:`cmd` --- Support for line-oriented command interpreters +============================================================= + +.. module:: cmd + :synopsis: Build line-oriented command interpreters. +.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com> + + +The :class:`Cmd` class provides a simple framework for writing line-oriented +command interpreters. These are often useful for test harnesses, administrative +tools, and prototypes that will later be wrapped in a more sophisticated +interface. + + +.. class:: Cmd([completekey[, stdin[, stdout]]]) + + A :class:`Cmd` instance or subclass instance is a line-oriented interpreter + framework. There is no good reason to instantiate :class:`Cmd` itself; rather, + it's useful as a superclass of an interpreter class you define yourself in order + to inherit :class:`Cmd`'s methods and encapsulate action methods. + + The optional argument *completekey* is the :mod:`readline` name of a completion + key; it defaults to :kbd:`Tab`. If *completekey* is not :const:`None` and + :mod:`readline` is available, command completion is done automatically. + + The optional arguments *stdin* and *stdout* specify the input and output file + objects that the Cmd instance or subclass instance will use for input and + output.
If not specified, they will default to *sys.stdin* and *sys.stdout*. + + .. versionchanged:: 2.3 + The *stdin* and *stdout* parameters were added. + + +.. _cmd-objects: + +Cmd Objects +----------- + +A :class:`Cmd` instance has the following methods: + + +.. method:: Cmd.cmdloop([intro]) + + Repeatedly issue a prompt, accept input, parse an initial prefix off the + received input, and dispatch to action methods, passing them the remainder of + the line as argument. + + The optional argument is a banner or intro string to be issued before the first + prompt (this overrides the :attr:`intro` class member). + + If the :mod:`readline` module is loaded, input will automatically inherit + :program:`bash`\ -like history-list editing (e.g. :kbd:`Control-P` scrolls back + to the last command, :kbd:`Control-N` forward to the next one, :kbd:`Control-F` + moves the cursor to the right non-destructively, :kbd:`Control-B` moves the + cursor to the left non-destructively, etc.). + + An end-of-file on input is passed back as the string ``'EOF'``. + + An interpreter instance will recognize a command name ``foo`` if and only if it + has a method :meth:`do_foo`. As a special case, a line beginning with the + character ``'?'`` is dispatched to the method :meth:`do_help`. As another + special case, a line beginning with the character ``'!'`` is dispatched to the + method :meth:`do_shell` (if such a method is defined). + + This method will return when the :meth:`postcmd` method returns a true value. + The *stop* argument to :meth:`postcmd` is the return value from the command's + corresponding :meth:`do_\*` method. + + If completion is enabled, completing commands will be done automatically, and + completing of commands args is done by calling :meth:`complete_foo` with + arguments *text*, *line*, *begidx*, and *endidx*. *text* is the string prefix + we are attempting to match: all returned matches must begin with it. 
*line* is + the current input line with leading whitespace removed, *begidx* and *endidx* + are the beginning and ending indexes of the prefix text, which could be used to + provide different completion depending upon which position the argument is in. + + All subclasses of :class:`Cmd` inherit a predefined :meth:`do_help`. This + method, called with an argument ``'bar'``, invokes the corresponding method + :meth:`help_bar`. With no argument, :meth:`do_help` lists all available help + topics (that is, all commands with corresponding :meth:`help_\*` methods), and + also lists any undocumented commands. + + +.. method:: Cmd.onecmd(str) + + Interpret the argument as though it had been typed in response to the prompt. + This may be overridden, but should not normally need to be; see the + :meth:`precmd` and :meth:`postcmd` methods for useful execution hooks. The + return value is a flag indicating whether interpretation of commands by the + interpreter should stop. If there is a :meth:`do_\*` method for the command + *str*, the return value of that method is returned, otherwise the return value + from the :meth:`default` method is returned. + + +.. method:: Cmd.emptyline() + + Method called when an empty line is entered in response to the prompt. If this + method is not overridden, it repeats the last nonempty command entered. + + +.. method:: Cmd.default(line) + + Method called on an input line when the command prefix is not recognized. If + this method is not overridden, it prints an error message and returns. + + +.. method:: Cmd.completedefault(text, line, begidx, endidx) + + Method called to complete an input line when no command-specific + :meth:`complete_\*` method is available. By default, it returns an empty list. + + +.. method:: Cmd.precmd(line) + + Hook method executed just before the command line *line* is interpreted, but + after the input prompt is generated and issued. This method is a stub in + :class:`Cmd`; it exists to be overridden by subclasses. 
The return value is + used as the command which will be executed by the :meth:`onecmd` method; the + :meth:`precmd` implementation may re-write the command or simply return *line* + unchanged. + + +.. method:: Cmd.postcmd(stop, line) + + Hook method executed just after a command dispatch is finished. This method is + a stub in :class:`Cmd`; it exists to be overridden by subclasses. *line* is the + command line which was executed, and *stop* is a flag which indicates whether + execution will be terminated after the call to :meth:`postcmd`; this will be the + return value of the :meth:`onecmd` method. The return value of this method will + be used as the new value for the internal flag which corresponds to *stop*; + returning false will cause interpretation to continue. + + +.. method:: Cmd.preloop() + + Hook method executed once when :meth:`cmdloop` is called. This method is a stub + in :class:`Cmd`; it exists to be overridden by subclasses. + + +.. method:: Cmd.postloop() + + Hook method executed once when :meth:`cmdloop` is about to return. This method + is a stub in :class:`Cmd`; it exists to be overridden by subclasses. + +Instances of :class:`Cmd` subclasses have some public instance variables: + + +.. attribute:: Cmd.prompt + + The prompt issued to solicit input. + + +.. attribute:: Cmd.identchars + + The string of characters accepted for the command prefix. + + +.. attribute:: Cmd.lastcmd + + The last nonempty command prefix seen. + + +.. attribute:: Cmd.intro + + A string to issue as an intro or banner. May be overridden by giving the + :meth:`cmdloop` method an argument. + + +.. attribute:: Cmd.doc_header + + The header to issue if the help output has a section for documented commands. + + +.. attribute:: Cmd.misc_header + + The header to issue if the help output has a section for miscellaneous help + topics (that is, there are :meth:`help_\*` methods without corresponding + :meth:`do_\*` methods). + + +.. 
attribute:: Cmd.undoc_header + + The header to issue if the help output has a section for undocumented commands + (that is, there are :meth:`do_\*` methods without corresponding :meth:`help_\*` + methods). + + +.. attribute:: Cmd.ruler + + The character used to draw separator lines under the help-message headers. If + empty, no ruler line is drawn. It defaults to ``'='``. + + +.. attribute:: Cmd.use_rawinput + + A flag, defaulting to true. If true, :meth:`cmdloop` uses :func:`input` to + display a prompt and read the next command; if false, :meth:`sys.stdout.write` + and :meth:`sys.stdin.readline` are used. (This means that by importing + :mod:`readline`, on systems that support it, the interpreter will automatically + support :program:`Emacs`\ -like line editing and command-history keystrokes.) + diff --git a/Doc/library/code.rst b/Doc/library/code.rst new file mode 100644 index 0000000..4e00639 --- /dev/null +++ b/Doc/library/code.rst @@ -0,0 +1,167 @@ + +:mod:`code` --- Interpreter base classes +======================================== + +.. module:: code + :synopsis: Facilities to implement read-eval-print loops. + + + +The ``code`` module provides facilities to implement read-eval-print loops in +Python. Two classes and convenience functions are included which can be used to +build applications which provide an interactive interpreter prompt. + + +.. class:: InteractiveInterpreter([locals]) + + This class deals with parsing and interpreter state (the user's namespace); it + does not deal with input buffering or prompting or input file naming (the + filename is always passed in explicitly). The optional *locals* argument + specifies the dictionary in which code will be executed; it defaults to a newly + created dictionary with key ``'__name__'`` set to ``'__console__'`` and key + ``'__doc__'`` set to ``None``. + + +.. class:: InteractiveConsole([locals[, filename]]) + + Closely emulate the behavior of the interactive Python interpreter. 
This class + builds on :class:`InteractiveInterpreter` and adds prompting using the familiar + ``sys.ps1`` and ``sys.ps2``, and input buffering. + + +.. function:: interact([banner[, readfunc[, local]]]) + + Convenience function to run a read-eval-print loop. This creates a new instance + of :class:`InteractiveConsole` and sets *readfunc* to be used as the + :meth:`raw_input` method, if provided. If *local* is provided, it is passed to + the :class:`InteractiveConsole` constructor for use as the default namespace for + the interpreter loop. The :meth:`interact` method of the instance is then run + with *banner* passed as the banner to use, if provided. The console object is + discarded after use. + + +.. function:: compile_command(source[, filename[, symbol]]) + + This function is useful for programs that want to emulate Python's interpreter + main loop (a.k.a. the read-eval-print loop). The tricky part is to determine + when the user has entered an incomplete command that can be completed by + entering more text (as opposed to a complete command or a syntax error). This + function *almost* always makes the same decision as the real interpreter main + loop. + + *source* is the source string; *filename* is the optional filename from which + source was read, defaulting to ``'<input>'``; and *symbol* is the optional + grammar start symbol, which should be either ``'single'`` (the default) or + ``'eval'``. + + Returns a code object (the same as ``compile(source, filename, symbol)``) if the + command is complete and valid; ``None`` if the command is incomplete; raises + :exc:`SyntaxError` if the command is complete and contains a syntax error, or + raises :exc:`OverflowError` or :exc:`ValueError` if the command contains an + invalid literal. + + +.. _interpreter-objects: + +Interactive Interpreter Objects +------------------------------- + + +.. method:: InteractiveInterpreter.runsource(source[, filename[, symbol]]) + + Compile and run some source in the interpreter. 
Arguments are the same as for + :func:`compile_command`; the default for *filename* is ``'<input>'``, and for + *symbol* is ``'single'``. One of several things can happen: + + * The input is incorrect; :func:`compile_command` raised an exception + (:exc:`SyntaxError` or :exc:`OverflowError`). A syntax traceback will be + printed by calling the :meth:`showsyntaxerror` method. :meth:`runsource` + returns ``False``. + + * The input is incomplete, and more input is required; :func:`compile_command` + returned ``None``. :meth:`runsource` returns ``True``. + + * The input is complete; :func:`compile_command` returned a code object. The + code is executed by calling the :meth:`runcode` method (which also handles run-time + exceptions, except for :exc:`SystemExit`). :meth:`runsource` returns ``False``. + + The return value can be used to decide whether to use ``sys.ps1`` or ``sys.ps2`` + to prompt for the next line. + + +.. method:: InteractiveInterpreter.runcode(code) + + Execute a code object. When an exception occurs, :meth:`showtraceback` is called + to display a traceback. All exceptions are caught except :exc:`SystemExit`, + which is allowed to propagate. + + A note about :exc:`KeyboardInterrupt`: this exception may occur elsewhere in + this code, and may not always be caught. The caller should be prepared to deal + with it. + + +.. method:: InteractiveInterpreter.showsyntaxerror([filename]) + + Display the syntax error that just occurred. This does not display a stack + trace because there isn't one for syntax errors. If *filename* is given, it is + stuffed into the exception instead of the default filename provided by Python's + parser, because it always uses ``'<string>'`` when reading from a string. The + output is written by the :meth:`write` method. + + +.. method:: InteractiveInterpreter.showtraceback() + + Display the exception that just occurred. We remove the first stack item + because it is within the interpreter object implementation.
The output is + written by the :meth:`write` method. + + +.. method:: InteractiveInterpreter.write(data) + + Write a string to the standard error stream (``sys.stderr``). Derived classes + should override this to provide the appropriate output handling as needed. + + +.. _console-objects: + +Interactive Console Objects +--------------------------- + +The :class:`InteractiveConsole` class is a subclass of +:class:`InteractiveInterpreter`, and so offers all the methods of the +interpreter objects as well as the following additions. + + +.. method:: InteractiveConsole.interact([banner]) + + Closely emulate the interactive Python console. The optional *banner* argument + specifies the banner to print before the first interaction; by default it prints a + banner similar to the one printed by the standard Python interpreter, followed + by the class name of the console object in parentheses (so as not to confuse + this with the real interpreter -- since it's so close!). + + +.. method:: InteractiveConsole.push(line) + + Push a line of source text to the interpreter. The line should not have a + trailing newline; it may have internal newlines. The line is appended to a + buffer and the interpreter's :meth:`runsource` method is called with the + concatenated contents of the buffer as source. If this indicates that the + command was executed or invalid, the buffer is reset; otherwise, the command is + incomplete, and the buffer is left as it was after the line was appended. The + return value is ``True`` if more input is required, ``False`` if the line was + dealt with in some way (this is the same as :meth:`runsource`). + + +.. method:: InteractiveConsole.resetbuffer() + + Remove any unhandled source text from the input buffer. + + +.. method:: InteractiveConsole.raw_input([prompt]) + + Write a prompt and read a line. The returned line does not include the trailing + newline. When the user enters the EOF key sequence, :exc:`EOFError` is raised.
+ The base implementation reads from ``sys.stdin``; a subclass may replace this + with a different implementation. + diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst new file mode 100644 index 0000000..38264df --- /dev/null +++ b/Doc/library/codecs.rst @@ -0,0 +1,1230 @@ + +:mod:`codecs` --- Codec registry and base classes +================================================= + +.. module:: codecs + :synopsis: Encode and decode data and streams. +.. moduleauthor:: Marc-Andre Lemburg <mal@lemburg.com> +.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. index:: + single: Unicode + single: Codecs + pair: Codecs; encode + pair: Codecs; decode + single: streams + pair: stackable; streams + +This module defines base classes for standard Python codecs (encoders and +decoders) and provides access to the internal Python codec registry which +manages the codec and error handling lookup process. + +It defines the following functions: + + +.. function:: register(search_function) + + Register a codec search function. Search functions are expected to take one + argument, the encoding name in all lower case letters, and return a + :class:`CodecInfo` object having the following attributes: + + * ``name`` The name of the encoding; + + * ``encoder`` The stateless encoding function; + + * ``decoder`` The stateless decoding function; + + * ``incrementalencoder`` An incremental encoder class or factory function; + + * ``incrementaldecoder`` An incremental decoder class or factory function; + + * ``streamwriter`` A stream writer class or factory function; + + * ``streamreader`` A stream reader class or factory function. + + The various functions or classes take the following arguments: + + *encoder* and *decoder*: These must be functions or methods which have the same + interface as the :meth:`encode`/:meth:`decode` methods of Codec instances (see + Codec Interface). 
The functions/methods are expected to work in a stateless + mode. + + *incrementalencoder* and *incrementaldecoder*: These have to be factory + functions providing the following interface: + + ``factory(errors='strict')`` + + The factory functions must return objects providing the interfaces defined by + the base classes :class:`IncrementalEncoder` and :class:`IncrementalDecoder`, + respectively. Incremental codecs can maintain state. + + *streamreader* and *streamwriter*: These have to be factory functions providing + the following interface: + + ``factory(stream, errors='strict')`` + + The factory functions must return objects providing the interfaces defined by + the base classes :class:`StreamReader` and :class:`StreamWriter`, respectively. + Stream codecs can maintain state. + + Possible values for *errors* are ``'strict'`` (raise an exception in case of an + encoding error), ``'replace'`` (replace malformed data with a suitable + replacement marker, such as ``'?'``), ``'ignore'`` (ignore malformed data and + continue without further notice), ``'xmlcharrefreplace'`` (replace with the + appropriate XML character reference (for encoding only)) and + ``'backslashreplace'`` (replace with backslashed escape sequences (for encoding + only)) as well as any other error handling name defined via + :func:`register_error`. + + In case a search function cannot find a given encoding, it should return + ``None``. + + +.. function:: lookup(encoding) + + Looks up the codec info in the Python codec registry and returns a + :class:`CodecInfo` object as defined above. + + Encodings are first looked up in the registry's cache. If not found, the list of + registered search functions is scanned. If no :class:`CodecInfo` object is + found, a :exc:`LookupError` is raised. Otherwise, the :class:`CodecInfo` object + is stored in the cache and returned to the caller.
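The lookup behaviour described above can be tried with a short sketch. It assumes a Python 3 interpreter (text is ``str``, encoded data is ``bytes``) and uses only the ``name``, ``incrementalencoder`` and ``incrementaldecoder`` attributes of the returned :class:`CodecInfo` object; the sample text and the encoding name ``no-such-encoding`` are arbitrary illustrations.

```python
import codecs

# Look up a codec; the returned CodecInfo reports the codec's own name.
info = codecs.lookup("UTF-8")
canonical = info.name

# incrementalencoder is a class or factory taking an errors argument.
encoder = info.incrementalencoder(errors="strict")
data = encoder.encode("caf") + encoder.encode("\u00e9", final=True)

# Feed the decoder one byte at a time; an incomplete multi-byte
# sequence is buffered until the remaining byte arrives.
decoder = info.incrementaldecoder(errors="strict")
text = "".join(decoder.decode(bytes([byte])) for byte in data)
text += decoder.decode(b"", final=True)

# An encoding that no search function recognizes raises LookupError.
try:
    codecs.lookup("no-such-encoding")
    found = True
except LookupError:
    found = False
```

Because incremental codecs keep state between calls, the two-byte UTF-8 sequence for the last character can be split across calls without raising an error.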
+ +To simplify access to the various codecs, the module provides these additional +functions which use :func:`lookup` for the codec lookup: + + +.. function:: getencoder(encoding) + + Look up the codec for the given encoding and return its encoder function. + + Raises a :exc:`LookupError` in case the encoding cannot be found. + + +.. function:: getdecoder(encoding) + + Look up the codec for the given encoding and return its decoder function. + + Raises a :exc:`LookupError` in case the encoding cannot be found. + + +.. function:: getincrementalencoder(encoding) + + Look up the codec for the given encoding and return its incremental encoder + class or factory function. + + Raises a :exc:`LookupError` in case the encoding cannot be found or the codec + doesn't support an incremental encoder. + + .. versionadded:: 2.5 + + +.. function:: getincrementaldecoder(encoding) + + Look up the codec for the given encoding and return its incremental decoder + class or factory function. + + Raises a :exc:`LookupError` in case the encoding cannot be found or the codec + doesn't support an incremental decoder. + + .. versionadded:: 2.5 + + +.. function:: getreader(encoding) + + Look up the codec for the given encoding and return its StreamReader class or + factory function. + + Raises a :exc:`LookupError` in case the encoding cannot be found. + + +.. function:: getwriter(encoding) + + Look up the codec for the given encoding and return its StreamWriter class or + factory function. + + Raises a :exc:`LookupError` in case the encoding cannot be found. + + +.. function:: register_error(name, error_handler) + + Register the error handling function *error_handler* under the name *name*. + *error_handler* will be called during encoding and decoding in case of an error, + when *name* is specified as the errors parameter. + + For encoding *error_handler* will be called with a :exc:`UnicodeEncodeError` + instance, which contains information about the location of the error. 
The error + handler must either raise this or a different exception or return a tuple with a + replacement for the unencodable part of the input and a position where encoding + should continue. The encoder will encode the replacement and continue encoding + the original input at the specified position. Negative position values will be + treated as being relative to the end of the input string. If the resulting + position is out of bounds, an :exc:`IndexError` will be raised. + + Decoding and translating work similarly, except that :exc:`UnicodeDecodeError` or + :exc:`UnicodeTranslateError` will be passed to the handler and that the + replacement from the error handler will be put into the output directly. + + +.. function:: lookup_error(name) + + Return the error handler previously registered under the name *name*. + + Raises a :exc:`LookupError` in case the handler cannot be found. + + +.. function:: strict_errors(exception) + + Implements the ``strict`` error handling. + + +.. function:: replace_errors(exception) + + Implements the ``replace`` error handling. + + +.. function:: ignore_errors(exception) + + Implements the ``ignore`` error handling. + + +.. function:: xmlcharrefreplace_errors(exception) + + Implements the ``xmlcharrefreplace`` error handling. + + +.. function:: backslashreplace_errors(exception) + + Implements the ``backslashreplace`` error handling. + +To simplify working with encoded files or streams, the module also defines these +utility functions: + + +.. function:: open(filename, mode[, encoding[, errors[, buffering]]]) + + Open an encoded file using the given *mode* and return a wrapped version + providing transparent encoding/decoding. + + .. note:: + + The wrapped version will only accept the object format defined by the codecs, + i.e. Unicode objects for most built-in codecs. Output is also codec-dependent + and will usually be Unicode as well. + + *encoding* specifies the encoding which is to be used for the file.
+ + *errors* may be given to define the error handling. It defaults to ``'strict'`` + which causes a :exc:`ValueError` to be raised in case an encoding error occurs. + + *buffering* has the same meaning as for the built-in :func:`open` function. It + defaults to line buffered. + + +.. function:: EncodedFile(file, input[, output[, errors]]) + + Return a wrapped version of file which provides transparent encoding + translation. + + Strings written to the wrapped file are interpreted according to the given + *input* encoding and then written to the original file as strings using the + *output* encoding. The intermediate encoding will usually be Unicode but depends + on the specified codecs. + + If *output* is not given, it defaults to *input*. + + *errors* may be given to define the error handling. It defaults to ``'strict'``, + which causes :exc:`ValueError` to be raised in case an encoding error occurs. + + +.. function:: iterencode(iterable, encoding[, errors]) + + Uses an incremental encoder to iteratively encode the input provided by + *iterable*. This function is a generator. *errors* (as well as any other keyword + argument) is passed through to the incremental encoder. + + .. versionadded:: 2.5 + + +.. function:: iterdecode(iterable, encoding[, errors]) + + Uses an incremental decoder to iteratively decode the input provided by + *iterable*. This function is a generator. *errors* (as well as any other keyword + argument) is passed through to the incremental decoder. + + .. versionadded:: 2.5 + +The module also provides the following constants which are useful for reading +and writing to platform dependent files: + + +.. data:: BOM + BOM_BE + BOM_LE + BOM_UTF8 + BOM_UTF16 + BOM_UTF16_BE + BOM_UTF16_LE + BOM_UTF32 + BOM_UTF32_BE + BOM_UTF32_LE + + These constants define various encodings of the Unicode byte order mark (BOM) + used in UTF-16 and UTF-32 data streams to indicate the byte order used in the + stream or file and in UTF-8 as a Unicode signature. 
:const:`BOM_UTF16` is either + :const:`BOM_UTF16_BE` or :const:`BOM_UTF16_LE` depending on the platform's + native byte order, :const:`BOM` is an alias for :const:`BOM_UTF16`, + :const:`BOM_LE` for :const:`BOM_UTF16_LE` and :const:`BOM_BE` for + :const:`BOM_UTF16_BE`. The others represent the BOM in UTF-8 and UTF-32 + encodings. + + +.. _codec-base-classes: + +Codec Base Classes +------------------ + +The :mod:`codecs` module defines a set of base classes which define the +interface and can also be used to easily write your own codecs for use in Python. + +Each codec has to define four interfaces to make it usable as a codec in Python: +stateless encoder, stateless decoder, stream reader and stream writer. The +stream readers and writers typically reuse the stateless encoder/decoder to +implement the file protocols. + +The :class:`Codec` class defines the interface for stateless encoders/decoders. + +To simplify and standardize error handling, the :meth:`encode` and +:meth:`decode` methods may implement different error handling schemes by +providing the *errors* string argument. The following string values are defined +and implemented by all standard Python codecs: + ++-------------------------+-----------------------------------------------+ +| Value | Meaning | ++=========================+===============================================+ +| ``'strict'`` | Raise :exc:`UnicodeError` (or a subclass); | +| | this is the default. | ++-------------------------+-----------------------------------------------+ +| ``'ignore'`` | Ignore the character and continue with the | +| | next. | ++-------------------------+-----------------------------------------------+ +| ``'replace'`` | Replace with a suitable replacement | +| | character; Python will use the official | +| | U+FFFD REPLACEMENT CHARACTER for the built-in | +| | Unicode codecs on decoding and '?' on | +| | encoding.
| ++-------------------------+-----------------------------------------------+ +| ``'xmlcharrefreplace'`` | Replace with the appropriate XML character | +| | reference (only for encoding). | ++-------------------------+-----------------------------------------------+ +| ``'backslashreplace'`` | Replace with backslashed escape sequences | +| | (only for encoding). | ++-------------------------+-----------------------------------------------+ + +The set of allowed values can be extended via :meth:`register_error`. + + +.. _codec-objects: + +Codec Objects +^^^^^^^^^^^^^ + +The :class:`Codec` class defines these methods which also define the function +interfaces of the stateless encoder and decoder: + + +.. method:: Codec.encode(input[, errors]) + + Encodes the object *input* and returns a tuple (output object, length consumed). + While codecs are not restricted to use with Unicode, in a Unicode context, + encoding converts a Unicode object to a plain string using a particular + character set encoding (e.g., ``cp1252`` or ``iso-8859-1``). + + *errors* defines the error handling to apply. It defaults to ``'strict'`` + handling. + + The method may not store state in the :class:`Codec` instance. Use + :class:`StreamCodec` for codecs which have to keep state in order to make + encoding/decoding efficient. + + The encoder must be able to handle zero length input and return an empty object + of the output object type in this situation. + + +.. method:: Codec.decode(input[, errors]) + + Decodes the object *input* and returns a tuple (output object, length consumed). + In a Unicode context, decoding converts a plain string encoded using a + particular character set encoding to a Unicode object. + + *input* must be an object which provides the ``bf_getreadbuf`` buffer slot. + Python strings, buffer objects and memory mapped files are examples of objects + providing this slot. + + *errors* defines the error handling to apply. It defaults to ``'strict'`` + handling. 
+ + The method may not store state in the :class:`Codec` instance. Use + :class:`StreamCodec` for codecs which have to keep state in order to make + encoding/decoding efficient. + + The decoder must be able to handle zero length input and return an empty object + of the output object type in this situation. + +The :class:`IncrementalEncoder` and :class:`IncrementalDecoder` classes provide +the basic interface for incremental encoding and decoding. Encoding/decoding the +input isn't done with one call to the stateless encoder/decoder function, but +with multiple calls to the :meth:`encode`/:meth:`decode` method of the +incremental encoder/decoder. The incremental encoder/decoder keeps track of the +encoding/decoding process during method calls. + +The joined output of calls to the :meth:`encode`/:meth:`decode` method is the +same as if all the single inputs were joined into one, and this input was +encoded/decoded with the stateless encoder/decoder. + + +.. _incremental-encoder-objects: + +IncrementalEncoder Objects +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. versionadded:: 2.5 + +The :class:`IncrementalEncoder` class is used for encoding an input in multiple +steps. It defines the following methods which every incremental encoder must +define in order to be compatible with the Python codec registry. + + +.. class:: IncrementalEncoder([errors]) + + Constructor for an :class:`IncrementalEncoder` instance. + + All incremental encoders must provide this constructor interface. They are free + to add additional keyword arguments, but only the ones defined here are used by + the Python codec registry. + + The :class:`IncrementalEncoder` may implement different error handling schemes + by providing the *errors* keyword argument. These parameters are predefined: + + * ``'strict'`` Raise :exc:`ValueError` (or a subclass); this is the default. + + * ``'ignore'`` Ignore the character and continue with the next. 
+ + * ``'replace'`` Replace with a suitable replacement character + + * ``'xmlcharrefreplace'`` Replace with the appropriate XML character reference + + * ``'backslashreplace'`` Replace with backslashed escape sequences. + + The *errors* argument will be assigned to an attribute of the same name. + Assigning to this attribute makes it possible to switch between different error + handling strategies during the lifetime of the :class:`IncrementalEncoder` + object. + + The set of allowed values for the *errors* argument can be extended with + :func:`register_error`. + + +.. method:: IncrementalEncoder.encode(object[, final]) + + Encodes *object* (taking the current state of the encoder into account) and + returns the resulting encoded object. If this is the last call to :meth:`encode` + *final* must be true (the default is false). + + +.. method:: IncrementalEncoder.reset() + + Reset the encoder to the initial state. + + +.. method:: IncrementalEncoder.getstate() + + Return the current state of the encoder which must be an integer. The + implementation should make sure that ``0`` is the most common state. (States + that are more complicated than integers can be converted into an integer by + marshaling/pickling the state and encoding the bytes of the resulting string + into an integer). + + .. versionadded:: 3.0 + + +.. method:: IncrementalEncoder.setstate(state) + + Set the state of the encoder to *state*. *state* must be an encoder state + returned by :meth:`getstate`. + + .. versionadded:: 3.0 + + +.. _incremental-decoder-objects: + +IncrementalDecoder Objects +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The :class:`IncrementalDecoder` class is used for decoding an input in multiple +steps. It defines the following methods which every incremental decoder must +define in order to be compatible with the Python codec registry. + + +.. class:: IncrementalDecoder([errors]) + + Constructor for an :class:`IncrementalDecoder` instance. 
+ + All incremental decoders must provide this constructor interface. They are free + to add additional keyword arguments, but only the ones defined here are used by + the Python codec registry. + + The :class:`IncrementalDecoder` may implement different error handling schemes + by providing the *errors* keyword argument. These parameters are predefined: + + * ``'strict'`` Raise :exc:`ValueError` (or a subclass); this is the default. + + * ``'ignore'`` Ignore the character and continue with the next. + + * ``'replace'`` Replace with a suitable replacement character. + + The *errors* argument will be assigned to an attribute of the same name. + Assigning to this attribute makes it possible to switch between different error + handling strategies during the lifetime of the :class:`IncrementalDecoder` + object. + + The set of allowed values for the *errors* argument can be extended with + :func:`register_error`. + + +.. method:: IncrementalDecoder.decode(object[, final]) + + Decodes *object* (taking the current state of the decoder into account) and + returns the resulting decoded object. If this is the last call to :meth:`decode` + *final* must be true (the default is false). If *final* is true the decoder must + decode the input completely and must flush all buffers. If this isn't possible + (e.g. because of incomplete byte sequences at the end of the input) it must + initiate error handling just like in the stateless case (which might raise an + exception). + + +.. method:: IncrementalDecoder.reset() + + Reset the decoder to the initial state. + + +.. method:: IncrementalDecoder.getstate() + + Return the current state of the decoder. This must be a tuple with two items, + the first must be the buffer containing the still undecoded input. The second + must be an integer and can be additional state info. (The implementation should + make sure that ``0`` is the most common additional state info.)
If this + additional state info is ``0`` it must be possible to set the decoder to the + state which has no input buffered and ``0`` as the additional state info, so + that feeding the previously buffered input to the decoder returns it to the + previous state without producing any output. (Additional state info that is more + complicated than integers can be converted into an integer by + marshaling/pickling the info and encoding the bytes of the resulting string into + an integer.) + + .. versionadded:: 3.0 + + +.. method:: IncrementalDecoder.setstate(state) + + Set the state of the decoder to *state*. *state* must be a decoder state + returned by :meth:`getstate`. + + .. versionadded:: 3.0 + +The :class:`StreamWriter` and :class:`StreamReader` classes provide generic +working interfaces which can be used to implement new encoding submodules very +easily. See :mod:`encodings.utf_8` for an example of how this is done. + + +.. _stream-writer-objects: + +StreamWriter Objects +^^^^^^^^^^^^^^^^^^^^ + +The :class:`StreamWriter` class is a subclass of :class:`Codec` and defines the +following methods which every stream writer must define in order to be +compatible with the Python codec registry. + + +.. class:: StreamWriter(stream[, errors]) + + Constructor for a :class:`StreamWriter` instance. + + All stream writers must provide this constructor interface. They are free to add + additional keyword arguments, but only the ones defined here are used by the + Python codec registry. + + *stream* must be a file-like object open for writing binary data. + + The :class:`StreamWriter` may implement different error handling schemes by + providing the *errors* keyword argument. These parameters are predefined: + + * ``'strict'`` Raise :exc:`ValueError` (or a subclass); this is the default. + + * ``'ignore'`` Ignore the character and continue with the next.
+ + * ``'replace'`` Replace with a suitable replacement character + + * ``'xmlcharrefreplace'`` Replace with the appropriate XML character reference + + * ``'backslashreplace'`` Replace with backslashed escape sequences. + + The *errors* argument will be assigned to an attribute of the same name. + Assigning to this attribute makes it possible to switch between different error + handling strategies during the lifetime of the :class:`StreamWriter` object. + + The set of allowed values for the *errors* argument can be extended with + :func:`register_error`. + + +.. method:: StreamWriter.write(object) + + Writes the object's contents encoded to the stream. + + +.. method:: StreamWriter.writelines(list) + + Writes the concatenated list of strings to the stream (possibly by reusing the + :meth:`write` method). + + +.. method:: StreamWriter.reset() + + Flushes and resets the codec buffers used for keeping state. + + Calling this method should ensure that the data on the output is put into a + clean state that allows appending of new fresh data without having to rescan the + whole stream to recover state. + +In addition to the above methods, the :class:`StreamWriter` must also inherit +all other methods and attributes from the underlying stream. + + +.. _stream-reader-objects: + +StreamReader Objects +^^^^^^^^^^^^^^^^^^^^ + +The :class:`StreamReader` class is a subclass of :class:`Codec` and defines the +following methods which every stream reader must define in order to be +compatible with the Python codec registry. + + +.. class:: StreamReader(stream[, errors]) + + Constructor for a :class:`StreamReader` instance. + + All stream readers must provide this constructor interface. They are free to add + additional keyword arguments, but only the ones defined here are used by the + Python codec registry. + + *stream* must be a file-like object open for reading (binary) data. 
+ + The :class:`StreamReader` may implement different error handling schemes by + providing the *errors* keyword argument. These parameters are predefined: + + * ``'strict'`` Raise :exc:`ValueError` (or a subclass); this is the default. + + * ``'ignore'`` Ignore the character and continue with the next. + + * ``'replace'`` Replace with a suitable replacement character. + + The *errors* argument will be assigned to an attribute of the same name. + Assigning to this attribute makes it possible to switch between different error + handling strategies during the lifetime of the :class:`StreamReader` object. + + The set of allowed values for the *errors* argument can be extended with + :func:`register_error`. + + +.. method:: StreamReader.read([size[, chars[, firstline]]]) + + Decodes data from the stream and returns the resulting object. + + *chars* indicates the number of characters to read from the stream. :meth:`read` + will never return more than *chars* characters, but it might return fewer if + there are not enough characters available. + + *size* indicates the approximate maximum number of bytes to read from the stream + for decoding purposes. The decoder can modify this setting as appropriate. The + default value -1 indicates to read and decode as much as possible. *size* is + intended to prevent having to decode huge files in one step. + + *firstline* indicates that it would be sufficient to only return the first line, + if there are decoding errors on later lines. + + The method should use a greedy read strategy, meaning that it should read as much + data as is allowed within the definition of the encoding and the given size, + e.g. if optional encoding endings or state markers are available on the stream, + these should be read too. + + .. versionchanged:: 2.4 + *chars* argument added. + + .. versionchanged:: 2.4.2 + *firstline* argument added. + + +.. 
method:: StreamReader.readline([size[, keepends]]) + + Read one line from the input stream and return the decoded data. + + *size*, if given, is passed as the *size* argument to the stream's :meth:`readline` + method. + + If *keepends* is false, line-endings will be stripped from the lines returned. + + .. versionchanged:: 2.4 + *keepends* argument added. + + +.. method:: StreamReader.readlines([sizehint[, keepends]]) + + Read all lines available on the input stream and return them as a list of lines. + + Line-endings are implemented using the codec's decoder method and are included + in the list entries if *keepends* is true. + + *sizehint*, if given, is passed as the *size* argument to the stream's + :meth:`read` method. + + +.. method:: StreamReader.reset() + + Resets the codec buffers used for keeping state. + + Note that no stream repositioning should take place. This method is primarily + intended for recovering from decoding errors. + +In addition to the above methods, the :class:`StreamReader` must also inherit +all other methods and attributes from the underlying stream. + +The next two base classes are included for convenience. They are not needed by +the codec registry, but may prove useful in practice. + + +.. _stream-reader-writer: + +StreamReaderWriter Objects +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The :class:`StreamReaderWriter` allows wrapping streams which work in both read +and write modes. + +The design is such that one can use the factory functions returned by the +:func:`lookup` function to construct the instance. + + +.. class:: StreamReaderWriter(stream, Reader, Writer, errors) + + Creates a :class:`StreamReaderWriter` instance. *stream* must be a file-like + object. *Reader* and *Writer* must be factory functions or classes providing the + :class:`StreamReader` and :class:`StreamWriter` interfaces, respectively. Error handling + is done in the same way as defined for the stream readers and writers.
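The construction described above can be sketched as follows. This is a minimal illustration written for current Python 3, assuming an in-memory :class:`io.BytesIO` object as the underlying stream; the variable names are illustrative only and not part of the codec API.

```python
import codecs
import io

# Look up the UTF-8 codec; the returned object carries the
# stream reader and stream writer factories.
info = codecs.lookup("utf-8")

# Hypothetical in-memory byte stream standing in for a real file.
raw = io.BytesIO()

wrapped = codecs.StreamReaderWriter(
    raw, info.streamreader, info.streamwriter, "strict"
)

# Writing goes through the StreamWriter, which encodes to bytes.
wrapped.write("caf\u00e9\n")

# Rewind, then read back through the StreamReader, which decodes.
wrapped.seek(0)
text = wrapped.read()
encoded = raw.getvalue()
```

Here ``text`` holds the decoded string again, while ``encoded`` holds the UTF-8 bytes that reached the underlying stream.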
+ +:class:`StreamReaderWriter` instances define the combined interfaces of the +:class:`StreamReader` and :class:`StreamWriter` classes. They inherit all other +methods and attributes from the underlying stream. + + +.. _stream-recoder-objects: + +StreamRecoder Objects +^^^^^^^^^^^^^^^^^^^^^ + +The :class:`StreamRecoder` provides a frontend-backend view of encoding data +which is sometimes useful when dealing with different encoding environments. + +The design is such that one can use the factory functions returned by the +:func:`lookup` function to construct the instance. + + +.. class:: StreamRecoder(stream, encode, decode, Reader, Writer, errors) + + Creates a :class:`StreamRecoder` instance which implements a two-way conversion: + *encode* and *decode* work on the frontend (the input to :meth:`read` and output + of :meth:`write`) while *Reader* and *Writer* work on the backend (reading and + writing to the stream). + + You can use these objects to do transparent direct recodings from e.g. Latin-1 + to UTF-8 and back. + + *stream* must be a file-like object. + + *encode* and *decode* must adhere to the :class:`Codec` interface. *Reader* and + *Writer* must be factory functions or classes providing objects of the + :class:`StreamReader` and :class:`StreamWriter` interfaces, respectively. + + *encode* and *decode* are needed for the frontend translation, *Reader* and + *Writer* for the backend translation. The intermediate format used is + determined by the two sets of codecs, e.g. the Unicode codecs will use Unicode + as the intermediate encoding. + + Error handling is done in the same way as defined for the stream readers and + writers. + +:class:`StreamRecoder` instances define the combined interfaces of the +:class:`StreamReader` and :class:`StreamWriter` classes. They inherit all other +methods and attributes from the underlying stream. + + +.. 
_encodings-overview: + +Encodings and Unicode +--------------------- + +Unicode strings are stored internally as sequences of codepoints (to be precise +as :ctype:`Py_UNICODE` arrays). Depending on the way Python is compiled (either +via :option:`--enable-unicode=ucs2` or :option:`--enable-unicode=ucs4`, with the +former being the default) :ctype:`Py_UNICODE` is either a 16-bit or 32-bit data +type. Once a Unicode object is used outside of CPU and memory, CPU endianness +and how these arrays are stored as bytes become an issue. Transforming a +Unicode object into a sequence of bytes is called encoding and recreating the +Unicode object from the sequence of bytes is known as decoding. There are many +different methods for how this transformation can be done (these methods are +also called encodings). The simplest method is to map the codepoints 0-255 to +the bytes ``0x0``-``0xff``. This means that a Unicode object that contains +codepoints above ``U+00FF`` can't be encoded with this method (which is called +``'latin-1'`` or ``'iso-8859-1'``). :func:`unicode.encode` will raise a +:exc:`UnicodeEncodeError` that looks like this: ``UnicodeEncodeError: 'latin-1' +codec can't encode character u'\u1234' in position 3: ordinal not in +range(256)``. + +There's another group of encodings (the so-called charmap encodings) that choose +a different subset of all Unicode code points and how these codepoints are +mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open +e.g. :file:`encodings/cp1252.py` (which is an encoding that is used primarily on +Windows). There's a string constant with 256 characters that shows you which +character is mapped to which byte value. + +All of these encodings can only encode 256 of the 65536 (or 1114112) codepoints +defined in Unicode. A simple and straightforward way to store each Unicode +code point is to store each codepoint as two consecutive bytes.
There are two +possibilities: Store the bytes in big-endian or in little-endian order. These +two encodings are called UTF-16-BE and UTF-16-LE respectively. Their +disadvantage is that if e.g. you use UTF-16-BE on a little-endian machine you +will always have to swap bytes on encoding and decoding. UTF-16 avoids this +problem: Bytes will always be in natural endianness. When these bytes are read +by a CPU with a different endianness, then bytes have to be swapped though. To +be able to detect the endianness of a UTF-16 byte sequence, there's the +so-called BOM (the "Byte Order Mark"). This is the Unicode character ``U+FEFF``. +This character will be prepended to every UTF-16 byte sequence. The byte-swapped +version of this character (``0xFFFE``) is an illegal character that may not +appear in a Unicode text. So when the first character in a UTF-16 byte sequence +appears to be a ``U+FFFE`` the bytes have to be swapped on decoding. +Unfortunately, up to Unicode 4.0 the character ``U+FEFF`` had a second purpose as +a ``ZERO WIDTH NO-BREAK SPACE``: A character that has no width and doesn't allow +a word to be split. It can e.g. be used to give hints to a ligature algorithm. +With Unicode 4.0 using ``U+FEFF`` as a ``ZERO WIDTH NO-BREAK SPACE`` has been +deprecated (with ``U+2060`` (``WORD JOINER``) assuming this role). Nevertheless +Unicode software still must be able to handle ``U+FEFF`` in both roles: As a BOM +it's a device to determine the storage layout of the encoded bytes, and vanishes +once the byte sequence has been decoded into a Unicode string; as a ``ZERO WIDTH +NO-BREAK SPACE`` it's a normal character that will be decoded like any other. + +There's another encoding that is able to encode the full range of Unicode +characters: UTF-8. UTF-8 is an 8-bit encoding, which means there are no issues +with byte order in UTF-8. Each byte in a UTF-8 byte sequence consists of two +parts: Marker bits (the most significant bits) and payload bits.
The marker bits +are a sequence of zero to six 1 bits followed by a 0 bit. Unicode characters are +encoded like this (with x being payload bits, which when concatenated give the +Unicode character): + ++-----------------------------------+----------------------------------------------+ +| Range | Encoding | ++===================================+==============================================+ +| ``U-00000000`` ... ``U-0000007F`` | 0xxxxxxx | ++-----------------------------------+----------------------------------------------+ +| ``U-00000080`` ... ``U-000007FF`` | 110xxxxx 10xxxxxx | ++-----------------------------------+----------------------------------------------+ +| ``U-00000800`` ... ``U-0000FFFF`` | 1110xxxx 10xxxxxx 10xxxxxx | ++-----------------------------------+----------------------------------------------+ +| ``U-00010000`` ... ``U-001FFFFF`` | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | ++-----------------------------------+----------------------------------------------+ +| ``U-00200000`` ... ``U-03FFFFFF`` | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | ++-----------------------------------+----------------------------------------------+ +| ``U-04000000`` ... ``U-7FFFFFFF`` | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | +| | 10xxxxxx | ++-----------------------------------+----------------------------------------------+ + +The least significant bit of the Unicode character is the rightmost x bit. + +As UTF-8 is an 8-bit encoding no BOM is required and any ``U+FEFF`` character in +the decoded Unicode string (even if it's the first character) is treated as a +``ZERO WIDTH NO-BREAK SPACE``. + +Without external information it's impossible to reliably determine which +encoding was used for encoding a Unicode string. Each charmap encoding can +decode any random byte sequence. However that's not possible with UTF-8, as +UTF-8 byte sequences have a structure that doesn't allow arbitrary byte +sequence. 
To increase the reliability with which a UTF-8 encoding can be +detected, Microsoft invented a variant of UTF-8 (that Python 2.5 calls +``"utf-8-sig"``) for its Notepad program: Before any of the Unicode characters +is written to the file, a UTF-8 encoded BOM (which looks like this as a byte +sequence: ``0xef``, ``0xbb``, ``0xbf``) is written. As it's rather improbable +that any charmap encoded file starts with these byte values (which would e.g. +map to + + | LATIN SMALL LETTER I WITH DIAERESIS + | RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK + | INVERTED QUESTION MARK + +in iso-8859-1), this increases the probability that a utf-8-sig encoding can be +correctly guessed from the byte sequence. So here the BOM is not used to be able +to determine the byte order used for generating the byte sequence, but as a +signature that helps in guessing the encoding. On encoding the utf-8-sig codec +will write ``0xef``, ``0xbb``, ``0xbf`` as the first three bytes to the file. On +decoding utf-8-sig will skip those three bytes if they appear as the first three +bytes in the file. + + +.. _standard-encodings: + +Standard Encodings +------------------ + +Python comes with a number of codecs built-in, either implemented as C functions +or with dictionaries as mapping tables. The following table lists the codecs by +name, together with a few common aliases, and the languages for which the +encoding is likely used. Neither the list of aliases nor the list of languages +is meant to be exhaustive. Notice that spelling alternatives that only differ in +case or use a hyphen instead of an underscore are also valid aliases. + +Many of the character sets support the same languages. They vary in individual +characters (e.g. whether the EURO SIGN is supported or not), and in the +assignment of characters to code positions. 
For the European languages in +particular, the following variants typically exist: + +* an ISO 8859 codeset + +* a Microsoft Windows code page, which is typically derived from an 8859 codeset, + but replaces control characters with additional graphic characters + +* an IBM EBCDIC code page + +* an IBM PC code page, which is ASCII-compatible + ++-----------------+--------------------------------+--------------------------------+ +| Codec | Aliases | Languages | ++=================+================================+================================+ +| ascii | 646, us-ascii | English | ++-----------------+--------------------------------+--------------------------------+ +| big5 | big5-tw, csbig5 | Traditional Chinese | ++-----------------+--------------------------------+--------------------------------+ +| big5hkscs | big5-hkscs, hkscs | Traditional Chinese | ++-----------------+--------------------------------+--------------------------------+ +| cp037 | IBM037, IBM039 | English | ++-----------------+--------------------------------+--------------------------------+ +| cp424 | EBCDIC-CP-HE, IBM424 | Hebrew | ++-----------------+--------------------------------+--------------------------------+ +| cp437 | 437, IBM437 | English | ++-----------------+--------------------------------+--------------------------------+ +| cp500 | EBCDIC-CP-BE, EBCDIC-CP-CH, | Western Europe | +| | IBM500 | | ++-----------------+--------------------------------+--------------------------------+ +| cp737 | | Greek | ++-----------------+--------------------------------+--------------------------------+ +| cp775 | IBM775 | Baltic languages | ++-----------------+--------------------------------+--------------------------------+ +| cp850 | 850, IBM850 | Western Europe | ++-----------------+--------------------------------+--------------------------------+ +| cp852 | 852, IBM852 | Central and Eastern Europe | ++-----------------+--------------------------------+--------------------------------+ 
+| cp855 | 855, IBM855 | Bulgarian, Byelorussian, | +| | | Macedonian, Russian, Serbian | ++-----------------+--------------------------------+--------------------------------+ +| cp856 | | Hebrew | ++-----------------+--------------------------------+--------------------------------+ +| cp857 | 857, IBM857 | Turkish | ++-----------------+--------------------------------+--------------------------------+ +| cp860 | 860, IBM860 | Portuguese | ++-----------------+--------------------------------+--------------------------------+ +| cp861 | 861, CP-IS, IBM861 | Icelandic | ++-----------------+--------------------------------+--------------------------------+ +| cp862 | 862, IBM862 | Hebrew | ++-----------------+--------------------------------+--------------------------------+ +| cp863 | 863, IBM863 | Canadian | ++-----------------+--------------------------------+--------------------------------+ +| cp864 | IBM864 | Arabic | ++-----------------+--------------------------------+--------------------------------+ +| cp865 | 865, IBM865 | Danish, Norwegian | ++-----------------+--------------------------------+--------------------------------+ +| cp866 | 866, IBM866 | Russian | ++-----------------+--------------------------------+--------------------------------+ +| cp869 | 869, CP-GR, IBM869 | Greek | ++-----------------+--------------------------------+--------------------------------+ +| cp874 | | Thai | ++-----------------+--------------------------------+--------------------------------+ +| cp875 | | Greek | ++-----------------+--------------------------------+--------------------------------+ +| cp932 | 932, ms932, mskanji, ms-kanji | Japanese | ++-----------------+--------------------------------+--------------------------------+ +| cp949 | 949, ms949, uhc | Korean | ++-----------------+--------------------------------+--------------------------------+ +| cp950 | 950, ms950 | Traditional Chinese | 
++-----------------+--------------------------------+--------------------------------+ +| cp1006 | | Urdu | ++-----------------+--------------------------------+--------------------------------+ +| cp1026 | ibm1026 | Turkish | ++-----------------+--------------------------------+--------------------------------+ +| cp1140 | ibm1140 | Western Europe | ++-----------------+--------------------------------+--------------------------------+ +| cp1250 | windows-1250 | Central and Eastern Europe | ++-----------------+--------------------------------+--------------------------------+ +| cp1251 | windows-1251 | Bulgarian, Byelorussian, | +| | | Macedonian, Russian, Serbian | ++-----------------+--------------------------------+--------------------------------+ +| cp1252 | windows-1252 | Western Europe | ++-----------------+--------------------------------+--------------------------------+ +| cp1253 | windows-1253 | Greek | ++-----------------+--------------------------------+--------------------------------+ +| cp1254 | windows-1254 | Turkish | ++-----------------+--------------------------------+--------------------------------+ +| cp1255 | windows-1255 | Hebrew | ++-----------------+--------------------------------+--------------------------------+ +| cp1256 | windows1256 | Arabic | ++-----------------+--------------------------------+--------------------------------+ +| cp1257 | windows-1257 | Baltic languages | ++-----------------+--------------------------------+--------------------------------+ +| cp1258 | windows-1258 | Vietnamese | ++-----------------+--------------------------------+--------------------------------+ +| euc_jp | eucjp, ujis, u-jis | Japanese | ++-----------------+--------------------------------+--------------------------------+ +| euc_jis_2004 | jisx0213, eucjis2004 | Japanese | ++-----------------+--------------------------------+--------------------------------+ +| euc_jisx0213 | eucjisx0213 | Japanese | 
++-----------------+--------------------------------+--------------------------------+ +| euc_kr | euckr, korean, ksc5601, | Korean | +| | ks_c-5601, ks_c-5601-1987, | | +| | ksx1001, ks_x-1001 | | ++-----------------+--------------------------------+--------------------------------+ +| gb2312 | chinese, csiso58gb231280, euc- | Simplified Chinese | +| | cn, euccn, eucgb2312-cn, | | +| | gb2312-1980, gb2312-80, iso- | | +| | ir-58 | | ++-----------------+--------------------------------+--------------------------------+ +| gbk | 936, cp936, ms936 | Unified Chinese | ++-----------------+--------------------------------+--------------------------------+ +| gb18030 | gb18030-2000 | Unified Chinese | ++-----------------+--------------------------------+--------------------------------+ +| hz | hzgb, hz-gb, hz-gb-2312 | Simplified Chinese | ++-----------------+--------------------------------+--------------------------------+ +| iso2022_jp | csiso2022jp, iso2022jp, | Japanese | +| | iso-2022-jp | | ++-----------------+--------------------------------+--------------------------------+ +| iso2022_jp_1 | iso2022jp-1, iso-2022-jp-1 | Japanese | ++-----------------+--------------------------------+--------------------------------+ +| iso2022_jp_2 | iso2022jp-2, iso-2022-jp-2 | Japanese, Korean, Simplified | +| | | Chinese, Western Europe, Greek | ++-----------------+--------------------------------+--------------------------------+ +| iso2022_jp_2004 | iso2022jp-2004, | Japanese | +| | iso-2022-jp-2004 | | ++-----------------+--------------------------------+--------------------------------+ +| iso2022_jp_3 | iso2022jp-3, iso-2022-jp-3 | Japanese | ++-----------------+--------------------------------+--------------------------------+ +| iso2022_jp_ext | iso2022jp-ext, iso-2022-jp-ext | Japanese | ++-----------------+--------------------------------+--------------------------------+ +| iso2022_kr | csiso2022kr, iso2022kr, | Korean | +| | iso-2022-kr | | 
++-----------------+--------------------------------+--------------------------------+ +| latin_1 | iso-8859-1, iso8859-1, 8859, | Western Europe | +| | cp819, latin, latin1, L1 | | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_2 | iso-8859-2, latin2, L2 | Central and Eastern Europe | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_3 | iso-8859-3, latin3, L3 | Esperanto, Maltese | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_4 | iso-8859-4, latin4, L4 | Baltic languages | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_5 | iso-8859-5, cyrillic | Bulgarian, Byelorussian, | +| | | Macedonian, Russian, Serbian | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_6 | iso-8859-6, arabic | Arabic | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_7 | iso-8859-7, greek, greek8 | Greek | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_8 | iso-8859-8, hebrew | Hebrew | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_9 | iso-8859-9, latin5, L5 | Turkish | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_10 | iso-8859-10, latin6, L6 | Nordic languages | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_13 | iso-8859-13 | Baltic languages | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_14 | iso-8859-14, latin8, L8 | Celtic languages | ++-----------------+--------------------------------+--------------------------------+ +| iso8859_15 | iso-8859-15 | Western Europe | 
++-----------------+--------------------------------+--------------------------------+ +| johab | cp1361, ms1361 | Korean | ++-----------------+--------------------------------+--------------------------------+ +| koi8_r | | Russian | ++-----------------+--------------------------------+--------------------------------+ +| koi8_u | | Ukrainian | ++-----------------+--------------------------------+--------------------------------+ +| mac_cyrillic | maccyrillic | Bulgarian, Byelorussian, | +| | | Macedonian, Russian, Serbian | ++-----------------+--------------------------------+--------------------------------+ +| mac_greek | macgreek | Greek | ++-----------------+--------------------------------+--------------------------------+ +| mac_iceland | maciceland | Icelandic | ++-----------------+--------------------------------+--------------------------------+ +| mac_latin2 | maclatin2, maccentraleurope | Central and Eastern Europe | ++-----------------+--------------------------------+--------------------------------+ +| mac_roman | macroman | Western Europe | ++-----------------+--------------------------------+--------------------------------+ +| mac_turkish | macturkish | Turkish | ++-----------------+--------------------------------+--------------------------------+ +| ptcp154 | csptcp154, pt154, cp154, | Kazakh | +| | cyrillic-asian | | ++-----------------+--------------------------------+--------------------------------+ +| shift_jis | csshiftjis, shiftjis, sjis, | Japanese | +| | s_jis | | ++-----------------+--------------------------------+--------------------------------+ +| shift_jis_2004 | shiftjis2004, sjis_2004, | Japanese | +| | sjis2004 | | ++-----------------+--------------------------------+--------------------------------+ +| shift_jisx0213 | shiftjisx0213, sjisx0213, | Japanese | +| | s_jisx0213 | | ++-----------------+--------------------------------+--------------------------------+ +| utf_16 | U16, utf16 | all languages | 
++-----------------+--------------------------------+--------------------------------+ +| utf_16_be | UTF-16BE | all languages (BMP only) | ++-----------------+--------------------------------+--------------------------------+ +| utf_16_le | UTF-16LE | all languages (BMP only) | ++-----------------+--------------------------------+--------------------------------+ +| utf_7 | U7, unicode-1-1-utf-7 | all languages | ++-----------------+--------------------------------+--------------------------------+ +| utf_8 | U8, UTF, utf8 | all languages | ++-----------------+--------------------------------+--------------------------------+ +| utf_8_sig | | all languages | ++-----------------+--------------------------------+--------------------------------+ + +A number of codecs are specific to Python, so their codec names have no meaning +outside Python. Some of them don't convert from Unicode strings to byte strings, +but instead use the property of the Python codecs machinery that any bijective +function with one argument can be considered as an encoding. + +For the codecs listed below, the result in the "encoding" direction is always a +byte string. The result of the "decoding" direction is listed as operand type in +the table. 
+ ++--------------------+---------+----------------+---------------------------+ +| Codec | Aliases | Operand type | Purpose | ++====================+=========+================+===========================+ +| idna | | Unicode string | Implements :rfc:`3490`, | +| | | | see also | +| | | | :mod:`encodings.idna` | ++--------------------+---------+----------------+---------------------------+ +| mbcs | dbcs | Unicode string | Windows only: Encode | +| | | | operand according to the | +| | | | ANSI codepage (CP_ACP) | ++--------------------+---------+----------------+---------------------------+ +| palmos | | Unicode string | Encoding of PalmOS 3.5 | ++--------------------+---------+----------------+---------------------------+ +| punycode | | Unicode string | Implements :rfc:`3492` | ++--------------------+---------+----------------+---------------------------+ +| raw_unicode_escape | | Unicode string | Produce a string that is | +| | | | suitable as raw Unicode | +| | | | literal in Python source | +| | | | code | ++--------------------+---------+----------------+---------------------------+ +| undefined | | any | Raise an exception for | +| | | | all conversions. Can be | +| | | | used as the system | +| | | | encoding if no automatic | +| | | | coercion between byte and | +| | | | Unicode strings is | +| | | | desired. | ++--------------------+---------+----------------+---------------------------+ +| unicode_escape | | Unicode string | Produce a string that is | +| | | | suitable as Unicode | +| | | | literal in Python source | +| | | | code | ++--------------------+---------+----------------+---------------------------+ +| unicode_internal | | Unicode string | Return the internal | +| | | | representation of the | +| | | | operand | ++--------------------+---------+----------------+---------------------------+ + +.. versionadded:: 2.3 + The ``idna`` and ``punycode`` encodings. 
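A few of the codecs from the two tables above can be exercised directly. The following is a small sketch for current Python 3, where ``str.encode`` returns a byte string and ``bytes.decode`` returns a Unicode string; the sample strings are arbitrary.

```python
import codecs

# UTF-16 without an explicit byte order prepends a BOM in native order.
utf16 = "a".encode("utf-16")
has_bom = utf16[:2] == codecs.BOM_UTF16

# The explicit-endianness variants write no BOM.
utf16_be = "a".encode("utf-16-be")
utf16_le = "a".encode("utf-16-le")

# utf-8-sig writes the three-byte signature on encoding
# and skips it on decoding; plain utf-8 keeps U+FEFF.
signed = "a".encode("utf-8-sig")
plain = signed.decode("utf-8-sig")
kept = signed.decode("utf-8")

# Two of the Python-specific codecs.
puny = "b\u00fccher".encode("punycode")
escaped = "\u1234".encode("unicode_escape")
```

Comparing ``plain`` with ``kept`` shows the difference between the two UTF-8 variants: only ``utf-8-sig`` treats a leading ``U+FEFF`` as a signature and drops it.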
+ + +:mod:`encodings.idna` --- Internationalized Domain Names in Applications +------------------------------------------------------------------------ + +.. module:: encodings.idna + :synopsis: Internationalized Domain Names implementation +.. moduleauthor:: Martin v. Löwis + +.. versionadded:: 2.3 + +This module implements :rfc:`3490` (Internationalized Domain Names in +Applications) and :rfc:`3491` (Nameprep: A Stringprep Profile for +Internationalized Domain Names (IDN)). It builds upon the ``punycode`` encoding +and :mod:`stringprep`. + +These RFCs together define a protocol to support non-ASCII characters in domain +names. A domain name containing non-ASCII characters (such as +``www.Alliancefrançaise.nu``) is converted into an ASCII-compatible encoding +(ACE, such as ``www.xn--alliancefranaise-npb.nu``). The ACE form of the domain +name is then used in all places where arbitrary characters are not allowed by +the protocol, such as DNS queries, HTTP :mailheader:`Host` fields, and so +on. This conversion is carried out in the application, if possible invisibly to +the user: The application should transparently convert Unicode domain labels to +IDNA on the wire, and convert ACE labels back to Unicode before presenting them +to the user. + +Python supports this conversion in several ways: The ``idna`` codec converts +between Unicode and the ACE. Furthermore, the :mod:`socket` module +transparently converts Unicode host names to ACE, so that applications need not +be concerned about converting host names themselves when they pass them to the +socket module. On top of that, modules that have host names as function +parameters, such as :mod:`httplib` and :mod:`ftplib`, accept Unicode host names +(:mod:`httplib` then also transparently sends an IDNA hostname in the +:mailheader:`Host` field if it sends that field at all).
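The codec-level conversion can be illustrated with a short sketch. It is written for current Python 3, and the host name is a made-up example rather than one taken from the RFCs.

```python
import encodings.idna  # provides nameprep(), ToASCII() and ToUnicode()

# Hypothetical host name used only for illustration.
host = "b\u00fccher.example"

# Unicode to ACE: each non-ASCII label gets the "xn--" prefix.
ace = host.encode("idna")

# ACE back to Unicode, as an application would do before display.
round_trip = ace.decode("idna")

# nameprep() on its own normalizes a single label (here: case folding).
prepped = encodings.idna.nameprep("B\u00fccher")
```

Labels that are already plain ASCII, such as ``example`` here, pass through the codec unchanged.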
+ +When receiving host names from the wire (such as in reverse name lookup), no +automatic conversion to Unicode is performed: Applications wishing to present +such host names to the user should decode them to Unicode. + +The module :mod:`encodings.idna` also implements the nameprep procedure, which +performs certain normalizations on host names, to achieve case-insensitivity of +international domain names, and to unify similar characters. The nameprep +functions can be used directly if desired. + + +.. function:: nameprep(label) + + Return the nameprepped version of *label*. The implementation currently assumes + query strings, so ``AllowUnassigned`` is true. + + +.. function:: ToASCII(label) + + Convert a label to ASCII, as specified in :rfc:`3490`. ``UseSTD3ASCIIRules`` is + assumed to be false. + + +.. function:: ToUnicode(label) + + Convert a label to Unicode, as specified in :rfc:`3490`. + + +:mod:`encodings.utf_8_sig` --- UTF-8 codec with BOM signature +------------------------------------------------------------- + +.. module:: encodings.utf_8_sig + :synopsis: UTF-8 codec with BOM signature +.. moduleauthor:: Walter Dörwald + +.. versionadded:: 2.5 + +This module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded +BOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this +is only done once (on the first write to the byte stream). For decoding an +optional UTF-8 encoded BOM at the start of the data will be skipped. + diff --git a/Doc/library/codeop.rst b/Doc/library/codeop.rst new file mode 100644 index 0000000..8a730ec --- /dev/null +++ b/Doc/library/codeop.rst @@ -0,0 +1,95 @@ + +:mod:`codeop` --- Compile Python code +===================================== + +.. module:: codeop + :synopsis: Compile (possibly incomplete) Python code. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> +.. sectionauthor:: Michael Hudson <mwh@python.net> + + +.. % LaTeXed from excellent doc-string. 
+ +The :mod:`codeop` module provides utilities upon which the Python +read-eval-print loop can be emulated, as is done in the :mod:`code` module. As +a result, you probably don't want to use the module directly; if you want to +include such a loop in your program you probably want to use the :mod:`code` +module instead. + +There are two parts to this job: + +#. Being able to tell if a line of input completes a Python statement: in + short, telling whether to print '``>>>``' or '``...``' next. + +#. Remembering which future statements the user has entered, so subsequent + input can be compiled with these in effect. + +The :mod:`codeop` module provides a way of doing each of these things, and a way +of doing them both. + +To do just the former: + + +.. function:: compile_command(source[, filename[, symbol]]) + + Tries to compile *source*, which should be a string of Python code and return a + code object if *source* is valid Python code. In that case, the filename + attribute of the code object will be *filename*, which defaults to + ``'<input>'``. Returns ``None`` if *source* is *not* valid Python code, but is a + prefix of valid Python code. + + If there is a problem with *source*, an exception will be raised. + :exc:`SyntaxError` is raised if there is invalid Python syntax, and + :exc:`OverflowError` or :exc:`ValueError` if there is an invalid literal. + + The *symbol* argument determines whether *source* is compiled as a statement + (``'single'``, the default) or as an expression (``'eval'``). Any other value + will cause :exc:`ValueError` to be raised. + + **Caveat:** It is possible (but not likely) that the parser stops parsing with a + successful outcome before reaching the end of the source; in this case, trailing + symbols may be ignored instead of causing an error. For example, a backslash + followed by two newlines may be followed by arbitrary garbage. This will be + fixed once the API for the parser is better. + + +.. 
class:: Compile() + + Instances of this class have :meth:`__call__` methods identical in signature to + the built-in function :func:`compile`, but with the difference that if the + instance compiles program text containing a :mod:`__future__` statement, the + instance 'remembers' and compiles all subsequent program texts with the + statement in force. + + +.. class:: CommandCompiler() + + Instances of this class have :meth:`__call__` methods identical in signature to + :func:`compile_command`; the difference is that if the instance compiles program + text containing a ``__future__`` statement, the instance 'remembers' and + compiles all subsequent program texts with the statement in force. + +A note on version compatibility: the :class:`Compile` and +:class:`CommandCompiler` are new in Python 2.2. If you want to enable the +future-tracking features of 2.2 but also retain compatibility with 2.1 and +earlier versions of Python you can either write :: + + try: + from codeop import CommandCompiler + compile_command = CommandCompiler() + del CommandCompiler + except ImportError: + from codeop import compile_command + +which is a low-impact change, but introduces possibly unwanted global state into +your program, or you can write:: + + try: + from codeop import CommandCompiler + except ImportError: + def CommandCompiler(): + from codeop import compile_command + return compile_command + +and then call ``CommandCompiler`` every time you need a fresh compiler object. + diff --git a/Doc/library/collections.rst b/Doc/library/collections.rst new file mode 100644 index 0000000..c2c9262 --- /dev/null +++ b/Doc/library/collections.rst @@ -0,0 +1,414 @@ + +:mod:`collections` --- High-performance container datatypes +=========================================================== + +.. module:: collections + :synopsis: High-performance datatypes +.. moduleauthor:: Raymond Hettinger <python@rcn.com> +.. sectionauthor:: Raymond Hettinger <python@rcn.com> + + +.. 
versionadded:: 2.4 + +This module implements high-performance container datatypes. Currently, +there are two datatypes, :class:`deque` and :class:`defaultdict`, and +one datatype factory function, :func:`NamedTuple`. Python already +includes built-in containers, :class:`dict`, :class:`list`, +:class:`set`, and :class:`tuple`. In addition, the optional :mod:`bsddb` +module has a :meth:`bsddb.btopen` method that can be used to create in-memory +or file based ordered dictionaries with string keys. + +Future editions of the standard library may include balanced trees and +ordered dictionaries. + +.. versionchanged:: 2.5 + Added :class:`defaultdict`. + +.. versionchanged:: 2.6 + Added :class:`NamedTuple`. + + +.. _deque-objects: + +:class:`deque` objects +---------------------- + + +.. class:: deque([iterable]) + + Returns a new deque object initialized left-to-right (using :meth:`append`) with + data from *iterable*. If *iterable* is not specified, the new deque is empty. + + Deques are a generalization of stacks and queues (the name is pronounced "deck" + and is short for "double-ended queue"). Deques support thread-safe, memory + efficient appends and pops from either side of the deque with approximately the + same O(1) performance in either direction. + + Though :class:`list` objects support similar operations, they are optimized for + fast fixed-length operations and incur O(n) memory movement costs for + ``pop(0)`` and ``insert(0, v)`` operations which change both the size and + position of the underlying data representation. + + .. versionadded:: 2.4 + +Deque objects support the following methods: + + +.. method:: deque.append(x) + + Add *x* to the right side of the deque. + + +.. method:: deque.appendleft(x) + + Add *x* to the left side of the deque. + + +.. method:: deque.clear() + + Remove all elements from the deque leaving it with length 0. + + +.. 
method:: deque.extend(iterable) + + Extend the right side of the deque by appending elements from the iterable + argument. + + +.. method:: deque.extendleft(iterable) + + Extend the left side of the deque by appending elements from *iterable*. Note + that the series of left appends results in reversing the order of elements in the + iterable argument. + + +.. method:: deque.pop() + + Remove and return an element from the right side of the deque. If no elements + are present, raises an :exc:`IndexError`. + + +.. method:: deque.popleft() + + Remove and return an element from the left side of the deque. If no elements are + present, raises an :exc:`IndexError`. + + +.. method:: deque.remove(value) + + Remove the first occurrence of *value*. If not found, raises a + :exc:`ValueError`. + + .. versionadded:: 2.5 + + +.. method:: deque.rotate(n) + + Rotate the deque *n* steps to the right. If *n* is negative, rotate to the + left. Rotating one step to the right is equivalent to: + ``d.appendleft(d.pop())``. + +In addition to the above, deques support iteration, pickling, ``len(d)``, +``reversed(d)``, ``copy.copy(d)``, ``copy.deepcopy(d)``, membership testing with +the :keyword:`in` operator, and subscript references such as ``d[-1]``. + +Example:: + + >>> from collections import deque + >>> d = deque('ghi') # make a new deque with three items + >>> for elem in d: # iterate over the deque's elements + ... 
print elem.upper() + G + H + I + + >>> d.append('j') # add a new entry to the right side + >>> d.appendleft('f') # add a new entry to the left side + >>> d # show the representation of the deque + deque(['f', 'g', 'h', 'i', 'j']) + + >>> d.pop() # return and remove the rightmost item + 'j' + >>> d.popleft() # return and remove the leftmost item + 'f' + >>> list(d) # list the contents of the deque + ['g', 'h', 'i'] + >>> d[0] # peek at leftmost item + 'g' + >>> d[-1] # peek at rightmost item + 'i' + + >>> list(reversed(d)) # list the contents of a deque in reverse + ['i', 'h', 'g'] + >>> 'h' in d # search the deque + True + >>> d.extend('jkl') # add multiple elements at once + >>> d + deque(['g', 'h', 'i', 'j', 'k', 'l']) + >>> d.rotate(1) # right rotation + >>> d + deque(['l', 'g', 'h', 'i', 'j', 'k']) + >>> d.rotate(-1) # left rotation + >>> d + deque(['g', 'h', 'i', 'j', 'k', 'l']) + + >>> deque(reversed(d)) # make a new deque in reverse order + deque(['l', 'k', 'j', 'i', 'h', 'g']) + >>> d.clear() # empty the deque + >>> d.pop() # cannot pop from an empty deque + Traceback (most recent call last): + File "<pyshell#6>", line 1, in -toplevel- + d.pop() + IndexError: pop from an empty deque + + >>> d.extendleft('abc') # extendleft() reverses the input order + >>> d + deque(['c', 'b', 'a']) + + +.. _deque-recipes: + +Recipes +^^^^^^^ + +This section shows various approaches to working with deques. + +The :meth:`rotate` method provides a way to implement :class:`deque` slicing and +deletion. For example, a pure python implementation of ``del d[n]`` relies on +the :meth:`rotate` method to position elements to be popped:: + + def delete_nth(d, n): + d.rotate(-n) + d.popleft() + d.rotate(n) + +To implement :class:`deque` slicing, use a similar approach applying +:meth:`rotate` to bring a target element to the left side of the deque. Remove +old entries with :meth:`popleft`, add new entries with :meth:`extend`, and then +reverse the rotation. 
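The slicing recipe just described could be sketched as follows. The helper name ``replace_slice`` and its argument names are hypothetical, and the sketch only handles the simple case ``0 <= start <= stop <= len(d)``; it is one possible reading of the recipe, not part of the :class:`deque` API.

```python
from collections import deque

def replace_slice(d, start, stop, new_items):
    """Emulate ``d[start:stop] = new_items`` for 0 <= start <= stop <= len(d)."""
    new_items = list(new_items)
    d.rotate(-start)                  # bring d[start] to the left side
    for _ in range(stop - start):
        d.popleft()                   # remove the old entries
    d.extend(new_items)               # add the new entries on the right
    d.rotate(start + len(new_items))  # reverse the rotation

d = deque('abcdef')
replace_slice(d, 1, 3, 'XYZ')
print(list(d))
```

The final rotation is by ``start + len(new_items)`` rather than ``start`` because the new entries were appended on the right and must travel back to the front together with the first ``start`` elements.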
+ +With minor variations on that approach, it is easy to implement Forth style +stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``, +``rot``, and ``roll``. + +A roundrobin task server can be built from a :class:`deque` using +:meth:`popleft` to select the current task and :meth:`append` to add it back to +the tasklist if the input stream is not exhausted:: + + >>> def roundrobin(*iterables): + ... pending = deque(iter(i) for i in iterables) + ... while pending: + ... task = pending.popleft() + ... try: + ... yield next(task) + ... except StopIteration: + ... continue + ... pending.append(task) + ... + >>> for value in roundrobin('abc', 'd', 'efgh'): + ... print value + + a + d + e + b + f + c + g + h + + +Multi-pass data reduction algorithms can be succinctly expressed and efficiently +coded by extracting elements with multiple calls to :meth:`popleft`, applying +the reduction function, and calling :meth:`append` to add the result back to the +queue. + +For example, building a balanced binary tree of nested lists entails reducing +two adjacent nodes into one by grouping them in a list:: + + >>> def maketree(iterable): + ... d = deque(iterable) + ... while len(d) > 1: + ... pair = [d.popleft(), d.popleft()] + ... d.append(pair) + ... return list(d) + ... + >>> print maketree('abcdefgh') + [[[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]] + + + +.. _defaultdict-objects: + +:class:`defaultdict` objects +---------------------------- + + +.. class:: defaultdict([default_factory[, ...]]) + + Returns a new dictionary-like object. :class:`defaultdict` is a subclass of the + builtin :class:`dict` class. It overrides one method and adds one writable + instance variable. The remaining functionality is the same as for the + :class:`dict` class and is not documented here. + + The first argument provides the initial value for the :attr:`default_factory` + attribute; it defaults to ``None``. 
All remaining arguments are treated the same + as if they were passed to the :class:`dict` constructor, including keyword + arguments. + + .. versionadded:: 2.5 + +:class:`defaultdict` objects support the following method in addition to the +standard :class:`dict` operations: + + +.. method:: defaultdict.__missing__(key) + + If the :attr:`default_factory` attribute is ``None``, this raises an + :exc:`KeyError` exception with the *key* as argument. + + If :attr:`default_factory` is not ``None``, it is called without arguments to + provide a default value for the given *key*, this value is inserted in the + dictionary for the *key*, and returned. + + If calling :attr:`default_factory` raises an exception this exception is + propagated unchanged. + + This method is called by the :meth:`__getitem__` method of the :class:`dict` + class when the requested key is not found; whatever it returns or raises is then + returned or raised by :meth:`__getitem__`. + +:class:`defaultdict` objects support the following instance variable: + + +.. attribute:: defaultdict.default_factory + + This attribute is used by the :meth:`__missing__` method; it is initialized from + the first argument to the constructor, if present, or to ``None``, if absent. + + +.. _defaultdict-examples: + +:class:`defaultdict` Examples +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Using :class:`list` as the :attr:`default_factory`, it is easy to group a +sequence of key-value pairs into a dictionary of lists:: + + >>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)] + >>> d = defaultdict(list) + >>> for k, v in s: + ... d[k].append(v) + ... + >>> d.items() + [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])] + +When each key is encountered for the first time, it is not already in the +mapping; so an entry is automatically created using the :attr:`default_factory` +function which returns an empty :class:`list`. The :meth:`list.append` +operation then attaches the value to the new list. 
When keys are encountered +again, the look-up proceeds normally (returning the list for that key) and the +:meth:`list.append` operation adds another value to the list. This technique is +simpler and faster than an equivalent technique using :meth:`dict.setdefault`:: + + >>> d = {} + >>> for k, v in s: + ... d.setdefault(k, []).append(v) + ... + >>> d.items() + [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])] + +Setting the :attr:`default_factory` to :class:`int` makes the +:class:`defaultdict` useful for counting (like a bag or multiset in other +languages):: + + >>> s = 'mississippi' + >>> d = defaultdict(int) + >>> for k in s: + ... d[k] += 1 + ... + >>> d.items() + [('i', 4), ('p', 2), ('s', 4), ('m', 1)] + +When a letter is first encountered, it is missing from the mapping, so the +:attr:`default_factory` function calls :func:`int` to supply a default count of +zero. The increment operation then builds up the count for each letter. + +The function :func:`int` which always returns zero is just a special case of +constant functions. A faster and more flexible way to create constant functions +is to use a lambda function which can supply any constant value (not just +zero):: + + >>> def constant_factory(value): + ... return lambda: value + >>> d = defaultdict(constant_factory('<missing>')) + >>> d.update(name='John', action='ran') + >>> '%(name)s %(action)s to %(object)s' % d + 'John ran to <missing>' + +Setting the :attr:`default_factory` to :class:`set` makes the +:class:`defaultdict` useful for building a dictionary of sets:: + + >>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)] + >>> d = defaultdict(set) + >>> for k, v in s: + ... d[k].add(v) + ... + >>> d.items() + [('blue', set([2, 4])), ('red', set([1, 3]))] + + +.. _named-tuple-factory: + +:func:`NamedTuple` datatype factory function +-------------------------------------------- + + +.. 
function:: NamedTuple(typename, fieldnames) + + Returns a new tuple subclass named *typename*. The new subclass is used to + create tuple-like objects that have fields accessible by attribute lookup as + well as being indexable and iterable. Instances of the subclass also have a + helpful docstring (with typename and fieldnames) and a helpful :meth:`__repr__` + method which lists the tuple contents in a ``name=value`` format. + + .. versionadded:: 2.6 + + The *fieldnames* are specified in a single string and are separated by spaces. + Any valid Python identifier may be used for a field name. + + Example:: + + >>> Point = NamedTuple('Point', 'x y') + >>> Point.__doc__ # docstring for the new datatype + 'Point(x, y)' + >>> p = Point(11, y=22) # instantiate with positional or keyword arguments + >>> p[0] + p[1] # works just like the tuple (11, 22) + 33 + >>> x, y = p # unpacks just like a tuple + >>> x, y + (11, 22) + >>> p.x + p.y # fields also accessible by name + 33 + >>> p # readable __repr__ with name=value style + Point(x=11, y=22) + + The use cases are the same as those for tuples. The named factories assign + meaning to each tuple position and allow for more readable, self-documenting + code. Named tuples can also be used to assign field names to tuples returned + by the :mod:`csv` or :mod:`sqlite3` modules. 
For example:: + + from itertools import starmap + import csv + EmployeeRecord = NamedTuple('EmployeeRecord', 'name age title department paygrade') + for record in starmap(EmployeeRecord, csv.reader(open("employees.csv", "rb"))): + print record + + To cast an individual record stored as :class:`list`, :class:`tuple`, or some + other iterable type, use the star-operator to unpack the values:: + + >>> Color = NamedTuple('Color', 'name code') + >>> m = dict(red=1, green=2, blue=3) + >>> print Color(*m.popitem()) + Color(name='blue', code=3) + diff --git a/Doc/library/colorpicker.rst b/Doc/library/colorpicker.rst new file mode 100644 index 0000000..4244104 --- /dev/null +++ b/Doc/library/colorpicker.rst @@ -0,0 +1,23 @@ + +:mod:`ColorPicker` --- Color selection dialog +============================================= + +.. module:: ColorPicker + :platform: Mac + :synopsis: Interface to the standard color selection dialog. +.. moduleauthor:: Just van Rossum <just@letterror.com> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +The :mod:`ColorPicker` module provides access to the standard color picker +dialog. + + +.. function:: GetColor(prompt, rgb) + + Show a standard color selection dialog and allow the user to select a color. + The user is given instruction by the *prompt* string, and the default color is + set to *rgb*. *rgb* must be a tuple giving the red, green, and blue components + of the color. :func:`GetColor` returns a tuple giving the user's selected color + and a flag indicating whether they accepted the selection or cancelled it. + diff --git a/Doc/library/colorsys.rst b/Doc/library/colorsys.rst new file mode 100644 index 0000000..2e7f3b7 --- /dev/null +++ b/Doc/library/colorsys.rst @@ -0,0 +1,60 @@ + +:mod:`colorsys` --- Conversions between color systems +===================================================== + +.. module:: colorsys + :synopsis: Conversion functions between RGB and other color systems. +.. 
sectionauthor:: David Ascher <da@python.net> + + +The :mod:`colorsys` module defines bidirectional conversions of color values +between colors expressed in the RGB (Red Green Blue) color space used in +computer monitors and three other coordinate systems: YIQ, HLS (Hue Lightness +Saturation) and HSV (Hue Saturation Value). Coordinates in all of these color +spaces are floating point values. In the YIQ space, the Y coordinate is between +0 and 1, but the I and Q coordinates can be positive or negative. In all other +spaces, the coordinates are all between 0 and 1. + +More information about color spaces can be found at +http://www.poynton.com/ColorFAQ.html. + +The :mod:`colorsys` module defines the following functions: + + +.. function:: rgb_to_yiq(r, g, b) + + Convert the color from RGB coordinates to YIQ coordinates. + + +.. function:: yiq_to_rgb(y, i, q) + + Convert the color from YIQ coordinates to RGB coordinates. + + +.. function:: rgb_to_hls(r, g, b) + + Convert the color from RGB coordinates to HLS coordinates. + + +.. function:: hls_to_rgb(h, l, s) + + Convert the color from HLS coordinates to RGB coordinates. + + +.. function:: rgb_to_hsv(r, g, b) + + Convert the color from RGB coordinates to HSV coordinates. + + +.. function:: hsv_to_rgb(h, s, v) + + Convert the color from HSV coordinates to RGB coordinates. + +Example:: + + >>> import colorsys + >>> colorsys.rgb_to_hsv(.3, .4, .2) + (0.25, 0.5, 0.4) + >>> colorsys.hsv_to_rgb(0.25, 0.5, 0.4) + (0.3, 0.4, 0.2) + diff --git a/Doc/library/commands.rst b/Doc/library/commands.rst new file mode 100644 index 0000000..79e3d73 --- /dev/null +++ b/Doc/library/commands.rst @@ -0,0 +1,53 @@ + +:mod:`commands` --- Utilities for running commands +================================================== + +.. module:: commands + :platform: Unix + :synopsis: Utility functions for running external commands. +.. 
sectionauthor:: Sue Williams <sbw@provis.com> + + +The :mod:`commands` module contains wrapper functions for :func:`os.popen` which +take a system command as a string and return any output generated by the command +and, optionally, the exit status. + +The :mod:`subprocess` module provides more powerful facilities for spawning new +processes and retrieving their results. Using the :mod:`subprocess` module is +preferable to using the :mod:`commands` module. + +The :mod:`commands` module defines the following functions: + + +.. function:: getstatusoutput(cmd) + + Execute the string *cmd* in a shell with :func:`os.popen` and return a 2-tuple + ``(status, output)``. *cmd* is actually run as ``{ cmd ; } 2>&1``, so that the + returned output will contain output or error messages. A trailing newline is + stripped from the output. The exit status for the command can be interpreted + according to the rules for the C function :cfunc:`wait`. + + +.. function:: getoutput(cmd) + + Like :func:`getstatusoutput`, except the exit status is ignored and the return + value is a string containing the command's output. + +Example:: + + >>> import commands + >>> commands.getstatusoutput('ls /bin/ls') + (0, '/bin/ls') + >>> commands.getstatusoutput('cat /bin/junk') + (256, 'cat: /bin/junk: No such file or directory') + >>> commands.getstatusoutput('/bin/junk') + (256, 'sh: /bin/junk: not found') + >>> commands.getoutput('ls /bin/ls') + '/bin/ls' + + +.. seealso:: + + Module :mod:`subprocess` + Module for spawning and managing subprocesses. + diff --git a/Doc/library/compileall.rst b/Doc/library/compileall.rst new file mode 100644 index 0000000..d62b785 --- /dev/null +++ b/Doc/library/compileall.rst @@ -0,0 +1,57 @@ + +:mod:`compileall` --- Byte-compile Python libraries +=================================================== + +.. module:: compileall + :synopsis: Tools for byte-compiling all Python source files in a directory tree. 
+ + +This module provides some utility functions to support installing Python +libraries. These functions compile Python source files in a directory tree, +allowing users without permission to write to the libraries to take advantage of +cached byte-code files. + +The source file for this module may also be used as a script to compile Python +sources in directories named on the command line or in ``sys.path``. + + +.. function:: compile_dir(dir[, maxlevels[, ddir[, force[, rx[, quiet]]]]]) + + Recursively descend the directory tree named by *dir*, compiling all :file:`.py` + files along the way. The *maxlevels* parameter is used to limit the depth of + the recursion; it defaults to ``10``. If *ddir* is given, it is used as the + base path from which the filenames used in error messages will be generated. + If *force* is true, modules are re-compiled even if the timestamps are up to + date. + + If *rx* is given, it specifies a regular expression of file names to exclude + from the search; that expression is searched for in the full path. + + If *quiet* is true, nothing is printed to the standard output in normal + operation. + + +.. function:: compile_path([skip_curdir[, maxlevels[, force]]]) + + Byte-compile all the :file:`.py` files found along ``sys.path``. If + *skip_curdir* is true (the default), the current directory is not included in + the search. The *maxlevels* and *force* parameters default to ``0`` and are + passed to the :func:`compile_dir` function. + +To force a recompile of all the :file:`.py` files in the :file:`Lib/` +subdirectory and all its subdirectories:: + + import compileall + + compileall.compile_dir('Lib/', force=True) + + # Perform same compilation, excluding files in .svn directories. + import re + compileall.compile_dir('Lib/', rx=re.compile('/[.]svn'), force=True) + + +.. seealso:: + + Module :mod:`py_compile` + Byte-compile a single source file. 
+ diff --git a/Doc/library/configparser.rst b/Doc/library/configparser.rst new file mode 100644 index 0000000..dd91d59 --- /dev/null +++ b/Doc/library/configparser.rst @@ -0,0 +1,361 @@ + +:mod:`ConfigParser` --- Configuration file parser +================================================= + +.. module:: ConfigParser + :synopsis: Configuration file parser. +.. moduleauthor:: Ken Manheimer <klm@zope.com> +.. moduleauthor:: Barry Warsaw <bwarsaw@python.org> +.. moduleauthor:: Eric S. Raymond <esr@thyrsus.com> +.. sectionauthor:: Christopher G. Petrilli <petrilli@amber.org> + + +.. index:: + pair: .ini; file + pair: configuration; file + single: ini file + single: Windows ini file + +This module defines the class :class:`ConfigParser`. The :class:`ConfigParser` +class implements a basic configuration file parser language which provides a +structure similar to what you would find on Microsoft Windows INI files. You +can use this to write Python programs which can be customized by end users +easily. + +.. warning:: + + This library does *not* interpret or write the value-type prefixes used in the + Windows Registry extended version of INI syntax. + +The configuration file consists of sections, led by a ``[section]`` header and +followed by ``name: value`` entries, with continuations in the style of +:rfc:`822`; ``name=value`` is also accepted. Note that leading whitespace is +removed from values. The optional values can contain format strings which refer +to other values in the same section, or values in a special ``DEFAULT`` section. +Additional defaults can be provided on initialization and retrieval. Lines +beginning with ``'#'`` or ``';'`` are ignored and may be used to provide +comments. + +For example:: + + [My Section] + foodir: %(dir)s/whatever + dir=frob + +would resolve the ``%(dir)s`` to the value of ``dir`` (``frob`` in this case). +All reference expansions are done on demand. 
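The on-demand expansion can be sketched as follows. This assumes Python 3, where the module is importable as ``configparser`` and ``read_string`` is available; the sample text mirrors the example above, and the variable names are illustrative only.

```python
import configparser  # assumption: Python 3 name of the ConfigParser module

SAMPLE = "[My Section]\nfoodir: %(dir)s/whatever\ndir=frob\n"

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# Interpolation happens when the value is retrieved, not when it is parsed.
expanded = config.get("My Section", "foodir")
print(expanded)

# With raw=True the stored text is returned without expansion.
unexpanded = config.get("My Section", "foodir", raw=True)
print(unexpanded)

# Changing the referenced option affects later lookups of "foodir".
config.set("My Section", "dir", "other")
changed = config.get("My Section", "foodir")
print(changed)
```

Because the stored value keeps its ``%(dir)s`` reference, updating ``dir`` is enough to change what ``foodir`` resolves to on the next lookup.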
+ +Default values can be specified by passing them into the :class:`ConfigParser` +constructor as a dictionary. Additional defaults may be passed into the +:meth:`get` method which will override all others. + +Sections are normally stored in a builtin dictionary. An alternative dictionary +type can be passed to the :class:`ConfigParser` constructor. For example, if a +dictionary type is passed that sorts its keys, the sections will be sorted on +write-back, as will be the keys within each section. + + +.. class:: RawConfigParser([defaults[, dict_type]]) + + The basic configuration object. When *defaults* is given, it is initialized + into the dictionary of intrinsic defaults. When *dict_type* is given, it will + be used to create the dictionary objects for the list of sections, for the + options within a section, and for the default values. This class does not + support the magical interpolation behavior. + + .. versionadded:: 2.3 + + .. versionchanged:: 2.6 + *dict_type* was added. + + +.. class:: ConfigParser([defaults]) + + Derived class of :class:`RawConfigParser` that implements the magical + interpolation feature and adds optional arguments to the :meth:`get` and + :meth:`items` methods. The values in *defaults* must be appropriate for the + ``%()s`` string interpolation. Note that *__name__* is an intrinsic default; + its value is the section name, and will override any value provided in + *defaults*. + + All option names used in interpolation will be passed through the + :meth:`optionxform` method just like any other option name reference. For + example, using the default implementation of :meth:`optionxform` (which converts + option names to lower case), the values ``foo %(bar)s`` and ``foo %(BAR)s`` are + equivalent. + + +.. class:: SafeConfigParser([defaults]) + + Derived class of :class:`ConfigParser` that implements a more-sane variant of + the magical interpolation feature. This implementation is more predictable as + well. 
New applications should prefer this version if they don't need to be + compatible with older versions of Python. + + .. % XXX Need to explain what's safer/more predictable about it. + + .. versionadded:: 2.3 + + +.. exception:: NoSectionError + + Exception raised when a specified section is not found. + + +.. exception:: DuplicateSectionError + + Exception raised if :meth:`add_section` is called with the name of a section + that is already present. + + +.. exception:: NoOptionError + + Exception raised when a specified option is not found in the specified section. + + +.. exception:: InterpolationError + + Base class for exceptions raised when problems occur performing string + interpolation. + + +.. exception:: InterpolationDepthError + + Exception raised when string interpolation cannot be completed because the + number of iterations exceeds :const:`MAX_INTERPOLATION_DEPTH`. Subclass of + :exc:`InterpolationError`. + + +.. exception:: InterpolationMissingOptionError + + Exception raised when an option referenced from a value does not exist. Subclass + of :exc:`InterpolationError`. + + .. versionadded:: 2.3 + + +.. exception:: InterpolationSyntaxError + + Exception raised when the source text into which substitutions are made does not + conform to the required syntax. Subclass of :exc:`InterpolationError`. + + .. versionadded:: 2.3 + + +.. exception:: MissingSectionHeaderError + + Exception raised when attempting to parse a file which has no section headers. + + +.. exception:: ParsingError + + Exception raised when errors occur attempting to parse a file. + + +.. data:: MAX_INTERPOLATION_DEPTH + + The maximum depth for recursive interpolation for :meth:`get` when the *raw* + parameter is false. This is relevant only for the :class:`ConfigParser` class. + + +.. seealso:: + + Module :mod:`shlex` + Support for creating Unix shell-like mini-languages which can be used as an + alternate format for application configuration files. + + +.. 
_rawconfigparser-objects: + +RawConfigParser Objects +----------------------- + +:class:`RawConfigParser` instances have the following methods: + + +.. method:: RawConfigParser.defaults() + + Return a dictionary containing the instance-wide defaults. + + +.. method:: RawConfigParser.sections() + + Return a list of the sections available; ``DEFAULT`` is not included in the + list. + + +.. method:: RawConfigParser.add_section(section) + + Add a section named *section* to the instance. If a section by the given name + already exists, :exc:`DuplicateSectionError` is raised. + + +.. method:: RawConfigParser.has_section(section) + + Indicates whether the named section is present in the configuration. The + ``DEFAULT`` section is not acknowledged. + + +.. method:: RawConfigParser.options(section) + + Returns a list of options available in the specified *section*. + + +.. method:: RawConfigParser.has_option(section, option) + + If the given section exists, and contains the given option, return + :const:`True`; otherwise return :const:`False`. + + .. versionadded:: 1.6 + + +.. method:: RawConfigParser.read(filenames) + + Attempt to read and parse a list of filenames, returning a list of filenames + which were successfully parsed. If *filenames* is a string or Unicode string, + it is treated as a single filename. If a file named in *filenames* cannot be + opened, that file will be ignored. This is designed so that you can specify a + list of potential configuration file locations (for example, the current + directory, the user's home directory, and some system-wide directory), and all + existing configuration files in the list will be read. If none of the named + files exist, the :class:`ConfigParser` instance will contain an empty dataset. 
+ An application which requires initial values to be loaded from a file should + load the required file or files using :meth:`readfp` before calling :meth:`read` + for any optional files:: + + import ConfigParser, os + + config = ConfigParser.ConfigParser() + config.readfp(open('defaults.cfg')) + config.read(['site.cfg', os.path.expanduser('~/.myapp.cfg')]) + + .. versionchanged:: 2.4 + Returns list of successfully parsed filenames. + + +.. method:: RawConfigParser.readfp(fp[, filename]) + + Read and parse configuration data from the file or file-like object in *fp* + (only the :meth:`readline` method is used). If *filename* is omitted and *fp* + has a :attr:`name` attribute, that is used for *filename*; the default is + ``<???>``. + + +.. method:: RawConfigParser.get(section, option) + + Get an *option* value for the named *section*. + + +.. method:: RawConfigParser.getint(section, option) + + A convenience method which coerces the *option* in the specified *section* to an + integer. + + +.. method:: RawConfigParser.getfloat(section, option) + + A convenience method which coerces the *option* in the specified *section* to a + floating point number. + + +.. method:: RawConfigParser.getboolean(section, option) + + A convenience method which coerces the *option* in the specified *section* to a + Boolean value. Note that the accepted values for the option are ``"1"``, + ``"yes"``, ``"true"``, and ``"on"``, which cause this method to return ``True``, + and ``"0"``, ``"no"``, ``"false"``, and ``"off"``, which cause it to return + ``False``. These string values are checked in a case-insensitive manner. Any + other value will cause it to raise :exc:`ValueError`. + + +.. method:: RawConfigParser.items(section) + + Return a list of ``(name, value)`` pairs for each option in the given *section*. + + +.. method:: RawConfigParser.set(section, option, value) + + If the given section exists, set the given option to the specified value; + otherwise raise :exc:`NoSectionError`. 
While it is possible to use + :class:`RawConfigParser` (or :class:`ConfigParser` with *raw* parameters set to + true) for *internal* storage of non-string values, full functionality (including + interpolation and output to files) can only be achieved using string values. + + .. versionadded:: 1.6 + + +.. method:: RawConfigParser.write(fileobject) + + Write a representation of the configuration to the specified file object. This + representation can be parsed by a future :meth:`read` call. + + .. versionadded:: 1.6 + + +.. method:: RawConfigParser.remove_option(section, option) + + Remove the specified *option* from the specified *section*. If the section does + not exist, raise :exc:`NoSectionError`. If the option existed to be removed, + return :const:`True`; otherwise return :const:`False`. + + .. versionadded:: 1.6 + + +.. method:: RawConfigParser.remove_section(section) + + Remove the specified *section* from the configuration. If the section in fact + existed, return ``True``. Otherwise return ``False``. + + +.. method:: RawConfigParser.optionxform(option) + + Transforms the option name *option* as found in an input file or as passed in by + client code to the form that should be used in the internal structures. The + default implementation returns a lower-case version of *option*; subclasses may + override this or client code can set an attribute of this name on instances to + affect this behavior. Setting this to :func:`str`, for example, would make + option names case sensitive. + + +.. _configparser-objects: + +ConfigParser Objects +-------------------- + +The :class:`ConfigParser` class extends some methods of the +:class:`RawConfigParser` interface, adding some optional arguments. + + +.. method:: ConfigParser.get(section, option[, raw[, vars]]) + + Get an *option* value for the named *section*. 
All the ``'%'`` interpolations + are expanded in the return values, based on the defaults passed into the + constructor, as well as the options *vars* provided, unless the *raw* argument + is true. + + +.. method:: ConfigParser.items(section[, raw[, vars]]) + + Return a list of ``(name, value)`` pairs for each option in the given *section*. + Optional arguments have the same meaning as for the :meth:`get` method. + + .. versionadded:: 2.3 + + +.. _safeconfigparser-objects: + +SafeConfigParser Objects +------------------------ + +The :class:`SafeConfigParser` class implements the same extended interface as +:class:`ConfigParser`, with the following addition: + + +.. method:: SafeConfigParser.set(section, option, value) + + If the given section exists, set the given option to the specified value; + otherwise raise :exc:`NoSectionError`. *value* must be a string (:class:`str` + or :class:`unicode`); if not, :exc:`TypeError` is raised. + + .. versionadded:: 2.4 + diff --git a/Doc/library/constants.rst b/Doc/library/constants.rst new file mode 100644 index 0000000..fecd836 --- /dev/null +++ b/Doc/library/constants.rst @@ -0,0 +1,42 @@ + +Built-in Constants +================== + +A small number of constants live in the built-in namespace. They are: + + +.. data:: False + + The false value of the :class:`bool` type. + + .. versionadded:: 2.3 + + +.. data:: True + + The true value of the :class:`bool` type. + + .. versionadded:: 2.3 + + +.. data:: None + + The sole value of :attr:`types.NoneType`. ``None`` is frequently used to + represent the absence of a value, as when default arguments are not passed to a + function. + + +.. data:: NotImplemented + + Special value which can be returned by the "rich comparison" special methods + (:meth:`__eq__`, :meth:`__lt__`, and friends), to indicate that the comparison + is not implemented with respect to the other type. + + +.. data:: Ellipsis + + The same as ``...``. 
Special value used mostly in conjunction with extended + slicing syntax for user-defined container data types. + + .. % XXX Someone who understands extended slicing should fill in here. + diff --git a/Doc/library/contextlib.rst b/Doc/library/contextlib.rst new file mode 100644 index 0000000..fffb99c --- /dev/null +++ b/Doc/library/contextlib.rst @@ -0,0 +1,120 @@ + +:mod:`contextlib` --- Utilities for :keyword:`with`\ -statement contexts. +========================================================================= + +.. module:: contextlib + :synopsis: Utilities for with-statement contexts. + + +.. versionadded:: 2.5 + +This module provides utilities for common tasks involving the :keyword:`with` +statement. For more information see also :ref:`typecontextmanager` and +:ref:`context-managers`. + +Functions provided: + + +.. function:: contextmanager(func) + + This function is a decorator that can be used to define a factory function for + :keyword:`with` statement context managers, without needing to create a class or + separate :meth:`__enter__` and :meth:`__exit__` methods. + + A simple example (this is not recommended as a real way of generating HTML!):: + + from __future__ import with_statement + from contextlib import contextmanager + + @contextmanager + def tag(name): + print "<%s>" % name + yield + print "</%s>" % name + + >>> with tag("h1"): + ... print "foo" + ... + <h1> + foo + </h1> + + The function being decorated must return a generator-iterator when called. This + iterator must yield exactly one value, which will be bound to the targets in the + :keyword:`with` statement's :keyword:`as` clause, if any. + + At the point where the generator yields, the block nested in the :keyword:`with` + statement is executed. The generator is then resumed after the block is exited. + If an unhandled exception occurs in the block, it is reraised inside the + generator at the point where the yield occurred. 
Thus, you can use a + :keyword:`try`...\ :keyword:`except`...\ :keyword:`finally` statement to trap + the error (if any), or ensure that some cleanup takes place. If an exception is + trapped merely in order to log it or to perform some action (rather than to + suppress it entirely), the generator must reraise that exception. Otherwise the + generator context manager will indicate to the :keyword:`with` statement that + the exception has been handled, and execution will resume with the statement + immediately following the :keyword:`with` statement. + + +.. function:: nested(mgr1[, mgr2[, ...]]) + + Combine multiple context managers into a single nested context manager. + + Code like this:: + + from contextlib import nested + + with nested(A, B, C) as (X, Y, Z): + do_something() + + is equivalent to this:: + + with A as X: + with B as Y: + with C as Z: + do_something() + + Note that if the :meth:`__exit__` method of one of the nested context managers + indicates an exception should be suppressed, no exception information will be + passed to any remaining outer context managers. Similarly, if the + :meth:`__exit__` method of one of the nested managers raises an exception, any + previous exception state will be lost; the new exception will be passed to the + :meth:`__exit__` methods of any remaining outer context managers. In general, + :meth:`__exit__` methods should avoid raising exceptions, and in particular they + should not re-raise a passed-in exception. + + +.. function:: closing(thing) + + Return a context manager that closes *thing* upon completion of the block. 
This + is basically equivalent to:: + + from contextlib import contextmanager + + @contextmanager + def closing(thing): + try: + yield thing + finally: + thing.close() + + And lets you write code like this:: + + from __future__ import with_statement + from contextlib import closing + import urllib + + with closing(urllib.urlopen('http://www.python.org')) as page: + for line in page: + print line + + without needing to explicitly close ``page``. Even if an error occurs, + ``page.close()`` will be called when the :keyword:`with` block is exited. + + +.. seealso:: + + :pep:`0343` - The "with" statement + The specification, background, and examples for the Python :keyword:`with` + statement. + diff --git a/Doc/library/cookie.rst b/Doc/library/cookie.rst new file mode 100644 index 0000000..5a5808f --- /dev/null +++ b/Doc/library/cookie.rst @@ -0,0 +1,282 @@ + +:mod:`Cookie` --- HTTP state management +======================================= + +.. module:: Cookie + :synopsis: Support for HTTP state management (cookies). +.. moduleauthor:: Timothy O'Malley <timo@alum.mit.edu> +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`Cookie` module defines classes for abstracting the concept of +cookies, an HTTP state management mechanism. It supports both simple string-only +cookies, and provides an abstraction for having any serializable data-type as +cookie value. + +The module formerly strictly applied the parsing rules described in the +:rfc:`2109` and :rfc:`2068` specifications. It has since been discovered that +MSIE 3.0x doesn't follow the character rules outlined in those specs. As a +result, the parsing rules used are a bit less strict. + + +.. exception:: CookieError + + Exception failing because of :rfc:`2109` invalidity: incorrect attributes, + incorrect :mailheader:`Set-Cookie` header, etc. + + +.. class:: BaseCookie([input]) + + This class is a dictionary-like object whose keys are strings and whose values + are :class:`Morsel` instances. 
Note that upon setting a key to a value, the + value is first converted to a :class:`Morsel` containing the key and the value. + + If *input* is given, it is passed to the :meth:`load` method. + + +.. class:: SimpleCookie([input]) + + This class derives from :class:`BaseCookie` and overrides :meth:`value_decode` + and :meth:`value_encode` to be the identity and :func:`str` respectively. + + +.. class:: SerialCookie([input]) + + This class derives from :class:`BaseCookie` and overrides :meth:`value_decode` + and :meth:`value_encode` to be the :func:`pickle.loads` and + :func:`pickle.dumps`. + + .. deprecated:: 2.3 + Reading pickled values from untrusted cookie data is a huge security hole, as + pickle strings can be crafted to cause arbitrary code to execute on your server. + It is supported for backwards compatibility only, and may eventually go away. + + +.. class:: SmartCookie([input]) + + This class derives from :class:`BaseCookie`. It overrides :meth:`value_decode` + to be :func:`pickle.loads` if it is a valid pickle, and otherwise the value + itself. It overrides :meth:`value_encode` to be :func:`pickle.dumps` unless it + is a string, in which case it returns the value itself. + + .. deprecated:: 2.3 + The same security warning from :class:`SerialCookie` applies here. + +A further security note is warranted. For backwards compatibility, the +:mod:`Cookie` module exports a class named :class:`Cookie` which is just an +alias for :class:`SmartCookie`. This is probably a mistake and will likely be +removed in a future version. You should not use the :class:`Cookie` class in +your applications, for the same reason why you should not use the +:class:`SerialCookie` class. + + +.. seealso:: + + Module :mod:`cookielib` + HTTP cookie handling for web *clients*. The :mod:`cookielib` and :mod:`Cookie` + modules do not depend on each other. + + :rfc:`2109` - HTTP State Management Mechanism + This is the state management specification implemented by this module. + + +.. 
+ _cookie-objects: + +Cookie Objects +-------------- + + +.. method:: BaseCookie.value_decode(val) + + Return a decoded value from a string representation. Return value can be any + type. This method does nothing in :class:`BaseCookie` --- it exists so it can be + overridden. + + +.. method:: BaseCookie.value_encode(val) + + Return an encoded value. *val* can be any type, but return value must be a + string. This method does nothing in :class:`BaseCookie` --- it exists so it can + be overridden. + + In general, it should be the case that :meth:`value_encode` and + :meth:`value_decode` are inverses on the range of *value_decode*. + + +.. method:: BaseCookie.output([attrs[, header[, sep]]]) + + Return a string representation suitable to be sent as HTTP headers. *attrs* and + *header* are sent to each :class:`Morsel`'s :meth:`output` method. *sep* is used + to join the headers together, and is by default the combination ``'\r\n'`` + (CRLF). + + .. versionchanged:: 2.5 + The default separator has been changed from ``'\n'`` to match the cookie + specification. + + +.. method:: BaseCookie.js_output([attrs]) + + Return an embeddable JavaScript snippet, which, if run on a browser which + supports JavaScript, will act the same as if the HTTP headers were sent. + + The meaning for *attrs* is the same as in :meth:`output`. + + +.. method:: BaseCookie.load(rawdata) + + If *rawdata* is a string, parse it as an ``HTTP_COOKIE`` and add the values + found there as :class:`Morsel`\ s. If it is a dictionary, it is equivalent to:: + + for k, v in rawdata.items(): + cookie[k] = v + + +.. _morsel-objects: + +Morsel Objects +-------------- + + +.. class:: Morsel() + + Abstract a key/value pair, which has some :rfc:`2109` attributes. 
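The :meth:`load` and :meth:`output` methods described above can be tried out with a short sketch. The cookie names and values are invented for illustration, and the import fallback is an assumption so that the snippet also runs where the module is available as ``http.cookies``.

```python
try:
    import Cookie as cookies  # module name used in this document
except ImportError:
    import http.cookies as cookies  # assumption: newer interpreters use this name

jar = cookies.SimpleCookie()

# Parse a raw cookie string; each name=value pair becomes a Morsel.
jar.load("chips=ahoy; vienna=finger")

# Assigning to a key wraps the value in a Morsel as well.
jar["rocky"] = "road"
jar["rocky"]["path"] = "/cookie"

morsel = jar["rocky"]

# Pass sep explicitly instead of relying on the default separator.
header_block = jar.output(sep="\n")
header_lines = header_block.split("\n")
```

Each element of ``header_lines`` is one complete ``Set-Cookie:`` header, so the list can be handed to whatever code writes the HTTP response.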
+ + Morsels are dictionary-like objects, whose set of keys is constant --- the valid + :rfc:`2109` attributes, which are + + * ``expires`` + * ``path`` + * ``comment`` + * ``domain`` + * ``max-age`` + * ``secure`` + * ``version`` + + The keys are case-insensitive. + + +.. attribute:: Morsel.value + + The value of the cookie. + + +.. attribute:: Morsel.coded_value + + The encoded value of the cookie --- this is what should be sent. + + +.. attribute:: Morsel.key + + The name of the cookie. + + +.. method:: Morsel.set(key, value, coded_value) + + Set the *key*, *value* and *coded_value* members. + + +.. method:: Morsel.isReservedKey(K) + + Whether *K* is a member of the set of keys of a :class:`Morsel`. + + +.. method:: Morsel.output([attrs[, header]]) + + Return a string representation of the Morsel, suitable to be sent as an HTTP + header. By default, all the attributes are included, unless *attrs* is given, in + which case it should be a list of attributes to use. *header* is by default + ``"Set-Cookie:"``. + + +.. method:: Morsel.js_output([attrs]) + + Return an embeddable JavaScript snippet, which, if run on a browser which + supports JavaScript, will act the same as if the HTTP header was sent. + + The meaning for *attrs* is the same as in :meth:`output`. + + +.. method:: Morsel.OutputString([attrs]) + + Return a string representing the Morsel, without any surrounding HTTP or + JavaScript. + + The meaning for *attrs* is the same as in :meth:`output`. + + +.. _cookie-example: + +Example +------- + +The following example demonstrates how to use the :mod:`Cookie` module. 
:: + + >>> import Cookie + >>> C = Cookie.SimpleCookie() + >>> C = Cookie.SerialCookie() + >>> C = Cookie.SmartCookie() + >>> C["fig"] = "newton" + >>> C["sugar"] = "wafer" + >>> print C # generate HTTP headers + Set-Cookie: sugar=wafer + Set-Cookie: fig=newton + >>> print C.output() # same thing + Set-Cookie: sugar=wafer + Set-Cookie: fig=newton + >>> C = Cookie.SmartCookie() + >>> C["rocky"] = "road" + >>> C["rocky"]["path"] = "/cookie" + >>> print C.output(header="Cookie:") + Cookie: rocky=road; Path=/cookie + >>> print C.output(attrs=[], header="Cookie:") + Cookie: rocky=road + >>> C = Cookie.SmartCookie() + >>> C.load("chips=ahoy; vienna=finger") # load from a string (HTTP header) + >>> print C + Set-Cookie: vienna=finger + Set-Cookie: chips=ahoy + >>> C = Cookie.SmartCookie() + >>> C.load('keebler="E=everybody; L=\\"Loves\\"; fudge=\\012;";') + >>> print C + Set-Cookie: keebler="E=everybody; L=\"Loves\"; fudge=\012;" + >>> C = Cookie.SmartCookie() + >>> C["oreo"] = "doublestuff" + >>> C["oreo"]["path"] = "/" + >>> print C + Set-Cookie: oreo=doublestuff; Path=/ + >>> C = Cookie.SmartCookie() + >>> C["twix"] = "none for you" + >>> C["twix"].value + 'none for you' + >>> C = Cookie.SimpleCookie() + >>> C["number"] = 7 # equivalent to C["number"] = str(7) + >>> C["string"] = "seven" + >>> C["number"].value + '7' + >>> C["string"].value + 'seven' + >>> print C + Set-Cookie: number=7 + Set-Cookie: string=seven + >>> C = Cookie.SerialCookie() + >>> C["number"] = 7 + >>> C["string"] = "seven" + >>> C["number"].value + 7 + >>> C["string"].value + 'seven' + >>> print C + Set-Cookie: number="I7\012." + Set-Cookie: string="S'seven'\012p1\012." + >>> C = Cookie.SmartCookie() + >>> C["number"] = 7 + >>> C["string"] = "seven" + >>> C["number"].value + 7 + >>> C["string"].value + 'seven' + >>> print C + Set-Cookie: number="I7\012." 
+ Set-Cookie: string=seven + diff --git a/Doc/library/cookielib.rst b/Doc/library/cookielib.rst new file mode 100644 index 0000000..44045d3 --- /dev/null +++ b/Doc/library/cookielib.rst @@ -0,0 +1,768 @@ + +:mod:`cookielib` --- Cookie handling for HTTP clients +===================================================== + +.. module:: cookielib + :synopsis: Classes for automatic handling of HTTP cookies. +.. moduleauthor:: John J. Lee <jjl@pobox.com> +.. sectionauthor:: John J. Lee <jjl@pobox.com> + + +.. versionadded:: 2.4 + + + +The :mod:`cookielib` module defines classes for automatic handling of HTTP +cookies. It is useful for accessing web sites that require small pieces of data +-- :dfn:`cookies` -- to be set on the client machine by an HTTP response from a +web server, and then returned to the server in later HTTP requests. + +Both the regular Netscape cookie protocol and the protocol defined by +:rfc:`2965` are handled. RFC 2965 handling is switched off by default. +:rfc:`2109` cookies are parsed as Netscape cookies and subsequently treated +either as Netscape or RFC 2965 cookies according to the 'policy' in effect. +Note that the great majority of cookies on the Internet are Netscape cookies. +:mod:`cookielib` attempts to follow the de-facto Netscape cookie protocol (which +differs substantially from that set out in the original Netscape specification), +including taking note of the ``max-age`` and ``port`` cookie-attributes +introduced with RFC 2965. + +.. note:: + + The various named parameters found in :mailheader:`Set-Cookie` and + :mailheader:`Set-Cookie2` headers (eg. ``domain`` and ``expires``) are + conventionally referred to as :dfn:`attributes`. To distinguish them from + Python attributes, the documentation for this module uses the term + :dfn:`cookie-attribute` instead. + + +The module defines the following exception: + + +.. exception:: LoadError + + Instances of :class:`FileCookieJar` raise this exception on failure to load + cookies from a file. 
+ + .. note:: + + For backwards-compatibility with Python 2.4 (which raised an :exc:`IOError`), + :exc:`LoadError` is a subclass of :exc:`IOError`. + + +The following classes are provided: + + +.. class:: CookieJar(policy=None) + + *policy* is an object implementing the :class:`CookiePolicy` interface. + + The :class:`CookieJar` class stores HTTP cookies. It extracts cookies from HTTP + responses, and returns them in HTTP requests. :class:`CookieJar` instances + automatically expire contained cookies when necessary. Subclasses are also + responsible for storing and retrieving cookies from a file or database. + + +.. class:: FileCookieJar(filename, delayload=None, policy=None) + + *policy* is an object implementing the :class:`CookiePolicy` interface. For the + other arguments, see the documentation for the corresponding attributes. + + A :class:`CookieJar` which can load cookies from, and perhaps save cookies to, a + file on disk. Cookies are **NOT** loaded from the named file until either the + :meth:`load` or :meth:`revert` method is called. Subclasses of this class are + documented in section :ref:`file-cookie-jar-classes`. + + +.. class:: CookiePolicy() + + This class is responsible for deciding whether each cookie should be accepted + from / returned to the server. + + +.. class:: DefaultCookiePolicy( blocked_domains=None, allowed_domains=None, netscape=True, rfc2965=False, rfc2109_as_netscape=None, hide_cookie2=False, strict_domain=False, strict_rfc2965_unverifiable=True, strict_ns_unverifiable=False, strict_ns_domain=DefaultCookiePolicy.DomainLiberal, strict_ns_set_initial_dollar=False, strict_ns_set_path=False ) + + Constructor arguments should be passed as keyword arguments only. + *blocked_domains* is a sequence of domain names that we never accept cookies + from, nor return cookies to. If *allowed_domains* is not :const:`None`, it is a + sequence of the only domains for which we accept and return cookies. 
For all + other arguments, see the documentation for :class:`CookiePolicy` and + :class:`DefaultCookiePolicy` objects. + + :class:`DefaultCookiePolicy` implements the standard accept / reject rules for + Netscape and RFC 2965 cookies. By default, RFC 2109 cookies (ie. cookies + received in a :mailheader:`Set-Cookie` header with a version cookie-attribute of + 1) are treated according to the RFC 2965 rules. However, if RFC 2965 handling + is turned off or :attr:`rfc2109_as_netscape` is True, RFC 2109 cookies are + 'downgraded' by the :class:`CookieJar` instance to Netscape cookies, by + setting the :attr:`version` attribute of the :class:`Cookie` instance to 0. + :class:`DefaultCookiePolicy` also provides some parameters to allow some + fine-tuning of policy. + + +.. class:: Cookie() + + This class represents Netscape, RFC 2109 and RFC 2965 cookies. It is not + expected that users of :mod:`cookielib` construct their own :class:`Cookie` + instances. Instead, if necessary, call :meth:`make_cookies` on a + :class:`CookieJar` instance. + + +.. seealso:: + + Module :mod:`urllib2` + URL opening with automatic cookie handling. + + Module :mod:`Cookie` + HTTP cookie classes, principally useful for server-side code. The + :mod:`cookielib` and :mod:`Cookie` modules do not depend on each other. + + http://wwwsearch.sf.net/ClientCookie/ + Extensions to this module, including a class for reading Microsoft Internet + Explorer cookies on Windows. + + http://www.netscape.com/newsref/std/cookie_spec.html + The specification of the original Netscape cookie protocol. Though this is + still the dominant protocol, the 'Netscape cookie protocol' implemented by all + the major browsers (and :mod:`cookielib`) only bears a passing resemblance to + the one sketched out in ``cookie_spec.html``. + + :rfc:`2109` - HTTP State Management Mechanism + Obsoleted by RFC 2965. Uses :mailheader:`Set-Cookie` with version=1. 
+ + :rfc:`2965` - HTTP State Management Mechanism + The Netscape protocol with the bugs fixed. Uses :mailheader:`Set-Cookie2` in + place of :mailheader:`Set-Cookie`. Not widely used. + + http://kristol.org/cookie/errata.html + Unfinished errata to RFC 2965. + + :rfc:`2964` - Use of HTTP State Management + +.. _cookie-jar-objects: + +CookieJar and FileCookieJar Objects +----------------------------------- + +:class:`CookieJar` objects support the iterator protocol for iterating over +contained :class:`Cookie` objects. + +:class:`CookieJar` has the following methods: + + +.. method:: CookieJar.add_cookie_header(request) + + Add the correct :mailheader:`Cookie` header to *request*. + + If policy allows (ie. the :attr:`rfc2965` and :attr:`hide_cookie2` attributes of + the :class:`CookieJar`'s :class:`CookiePolicy` instance are true and false + respectively), the :mailheader:`Cookie2` header is also added when appropriate. + + The *request* object (usually a :class:`urllib2.Request` instance) must support + the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`get_type`, + :meth:`unverifiable`, :meth:`get_origin_req_host`, :meth:`has_header`, + :meth:`get_header`, :meth:`header_items`, and :meth:`add_unredirected_header`, as + documented by :mod:`urllib2`. + + +.. method:: CookieJar.extract_cookies(response, request) + + Extract cookies from HTTP *response* and store them in the :class:`CookieJar`, + where allowed by policy. + + The :class:`CookieJar` will look for allowable :mailheader:`Set-Cookie` and + :mailheader:`Set-Cookie2` headers in the *response* argument, and store cookies + as appropriate (subject to the :meth:`CookiePolicy.set_ok` method's approval). + + The *response* object (usually the result of a call to :meth:`urllib2.urlopen`, + or similar) should support an :meth:`info` method, which returns an object with + a :meth:`getallmatchingheaders` method (usually a :class:`mimetools.Message` + instance). 
+ + The *request* object (usually a :class:`urllib2.Request` instance) must support + the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`unverifiable`, and + :meth:`get_origin_req_host`, as documented by :mod:`urllib2`. The request is + used to set default values for cookie-attributes as well as for checking that + the cookie is allowed to be set. + + +.. method:: CookieJar.set_policy(policy) + + Set the :class:`CookiePolicy` instance to be used. + + +.. method:: CookieJar.make_cookies(response, request) + + Return sequence of :class:`Cookie` objects extracted from *response* object. + + See the documentation for :meth:`extract_cookies` for the interfaces required of + the *response* and *request* arguments. + + +.. method:: CookieJar.set_cookie_if_ok(cookie, request) + + Set a :class:`Cookie` if policy says it's OK to do so. + + +.. method:: CookieJar.set_cookie(cookie) + + Set a :class:`Cookie`, without checking with policy to see whether or not it + should be set. + + +.. method:: CookieJar.clear([domain[, path[, name]]]) + + Clear some cookies. + + If invoked without arguments, clear all cookies. If given a single argument, + only cookies belonging to that *domain* will be removed. If given two arguments, + cookies belonging to the specified *domain* and URL *path* are removed. If + given three arguments, then the cookie with the specified *domain*, *path* and + *name* is removed. + + Raises :exc:`KeyError` if no matching cookie exists. + + +.. method:: CookieJar.clear_session_cookies() + + Discard all session cookies. + + Discards all contained cookies that have a true :attr:`discard` attribute + (usually because they had either no ``max-age`` or ``expires`` cookie-attribute, + or an explicit ``discard`` cookie-attribute). For interactive browsers, the end + of a session usually corresponds to closing the browser window. 
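The argument forms of :meth:`clear` and the effect of :meth:`clear_session_cookies` can be sketched as below. The ``make_cookie`` helper and all cookie names and domains are hypothetical; building :class:`Cookie` instances by hand is done here only so the example is self-contained, and the import fallback is an assumption for interpreters where the module is named ``http.cookiejar``.

```python
try:
    import cookielib  # module name used in this document
except ImportError:
    import http.cookiejar as cookielib  # assumption: newer interpreters use this name


def make_cookie(name, value, domain, path="/", discard=False):
    """Hypothetical helper: build a version-0 (Netscape style) cookie by hand."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=discard,
        comment=None, comment_url=None, rest={})


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("a", "1", "example.com"))
jar.set_cookie(make_cookie("b", "2", "example.com", path="/docs"))
jar.set_cookie(make_cookie("s", "3", "example.org", discard=True))

# Three arguments: remove exactly one cookie (domain, path, name).
jar.clear("example.com", "/docs", "b")

# Remove every cookie whose discard attribute is true.
jar.clear_session_cookies()

# Iterating over the jar yields the remaining Cookie objects.
remaining = sorted(cookie.name for cookie in jar)
```

Calling :meth:`clear` with a domain that is not stored raises :exc:`KeyError`, so code that clears speculatively should be prepared to catch it.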
+ + Note that the :meth:`save` method won't save session cookies anyway, unless you + ask otherwise by passing a true *ignore_discard* argument. + +:class:`FileCookieJar` implements the following additional methods: + + +.. method:: FileCookieJar.save(filename=None, ignore_discard=False, ignore_expires=False) + + Save cookies to a file. + + This base class raises :exc:`NotImplementedError`. Subclasses may leave this + method unimplemented. + + *filename* is the name of the file in which to save cookies. If *filename* is not + specified, :attr:`self.filename` is used (whose default is the value passed to + the constructor, if any); if :attr:`self.filename` is :const:`None`, + :exc:`ValueError` is raised. + + *ignore_discard*: save even cookies set to be discarded. *ignore_expires*: save + even cookies that have expired. + + The file is overwritten if it already exists, thus wiping all the cookies it + contains. Saved cookies can be restored later using the :meth:`load` or + :meth:`revert` methods. + + +.. method:: FileCookieJar.load(filename=None, ignore_discard=False, ignore_expires=False) + + Load cookies from a file. + + Old cookies are kept unless overwritten by newly loaded ones. + + Arguments are as for :meth:`save`. + + The named file must be in the format understood by the class, or + :exc:`LoadError` will be raised. Also, :exc:`IOError` may be raised, for + example if the file does not exist. + + .. note:: + + For backwards-compatibility with Python 2.4 (which raised an :exc:`IOError`), + :exc:`LoadError` is a subclass of :exc:`IOError`. + + +.. method:: FileCookieJar.revert(filename=None, ignore_discard=False, ignore_expires=False) + + Clear all cookies and reload cookies from a saved file. + + :meth:`revert` can raise the same exceptions as :meth:`load`. If there is a + failure, the object's state will not be altered. + +:class:`FileCookieJar` instances have the following public attributes: + + +.. 
+ attribute:: FileCookieJar.filename + + Filename of default file in which to keep cookies. This attribute may be + assigned to. + + +.. attribute:: FileCookieJar.delayload + + If true, load cookies lazily from disk. This attribute should not be assigned + to. This is only a hint, since this only affects performance, not behaviour + (unless the cookies on disk are changing). A :class:`CookieJar` object may + ignore it. None of the :class:`FileCookieJar` classes included in the standard + library lazily loads cookies. + + +.. _file-cookie-jar-classes: + +FileCookieJar subclasses and co-operation with web browsers +----------------------------------------------------------- + +The following :class:`CookieJar` subclasses are provided for reading and writing. +Further :class:`CookieJar` subclasses, including one that reads Microsoft +Internet Explorer cookies, are available at +http://wwwsearch.sf.net/ClientCookie/. + + +.. class:: MozillaCookieJar(filename, delayload=None, policy=None) + + A :class:`FileCookieJar` that can load from and save cookies to disk in the + Mozilla ``cookies.txt`` file format (which is also used by the Lynx and Netscape + browsers). + + .. note:: + + This loses information about RFC 2965 cookies, and also about newer or + non-standard cookie-attributes such as ``port``. + + .. warning:: + + Back up your cookies before saving if you have cookies whose loss / corruption + would be inconvenient (there are some subtleties which may lead to slight + changes in the file over a load / save round-trip). + + Also note that cookies saved while Mozilla is running will get clobbered by + Mozilla. + + +.. class:: LWPCookieJar(filename, delayload=None, policy=None) + + A :class:`FileCookieJar` that can load from and save cookies to disk in a format + compatible with the libwww-perl library's ``Set-Cookie3`` file format. This is + convenient if you want to store cookies in a human-readable file. + + +.. 
_cookie-policy-objects: + +CookiePolicy Objects +-------------------- + +Objects implementing the :class:`CookiePolicy` interface have the following +methods: + + +.. method:: CookiePolicy.set_ok(cookie, request) + + Return boolean value indicating whether cookie should be accepted from server. + + *cookie* is a :class:`cookielib.Cookie` instance. *request* is an object + implementing the interface defined by the documentation for + :meth:`CookieJar.extract_cookies`. + + +.. method:: CookiePolicy.return_ok(cookie, request) + + Return boolean value indicating whether cookie should be returned to server. + + *cookie* is a :class:`cookielib.Cookie` instance. *request* is an object + implementing the interface defined by the documentation for + :meth:`CookieJar.add_cookie_header`. + + +.. method:: CookiePolicy.domain_return_ok(domain, request) + + Return false if cookies should not be returned, given cookie domain. + + This method is an optimization. It removes the need for checking every cookie + with a particular domain (which might involve reading many files). Returning + true from :meth:`domain_return_ok` and :meth:`path_return_ok` leaves all the + work to :meth:`return_ok`. + + If :meth:`domain_return_ok` returns true for the cookie domain, + :meth:`path_return_ok` is called for the cookie path. Otherwise, + :meth:`path_return_ok` and :meth:`return_ok` are never called for that cookie + domain. If :meth:`path_return_ok` returns true, :meth:`return_ok` is called + with the :class:`Cookie` object itself for a full check. Otherwise, + :meth:`return_ok` is never called for that cookie path. + + Note that :meth:`domain_return_ok` is called for every *cookie* domain, not just + for the *request* domain. For example, the function might be called with both + ``".example.com"`` and ``"www.example.com"`` if the request domain is + ``"www.example.com"``. The same goes for :meth:`path_return_ok`. + + The *request* argument is as documented for :meth:`return_ok`. + + +.. 
method:: CookiePolicy.path_return_ok(path, request) + + Return false if cookies should not be returned, given cookie path. + + See the documentation for :meth:`domain_return_ok`. + +In addition to implementing the methods above, implementations of the +:class:`CookiePolicy` interface must also supply the following attributes, +indicating which protocols should be used, and how. All of these attributes may +be assigned to. + + +.. attribute:: CookiePolicy.netscape + + Implement Netscape protocol. + + +.. attribute:: CookiePolicy.rfc2965 + + Implement RFC 2965 protocol. + + +.. attribute:: CookiePolicy.hide_cookie2 + + Don't add :mailheader:`Cookie2` header to requests (the presence of this header + indicates to the server that we understand RFC 2965 cookies). + +The most useful way to define a :class:`CookiePolicy` class is by subclassing +from :class:`DefaultCookiePolicy` and overriding some or all of the methods +above. :class:`CookiePolicy` itself may be used as a 'null policy' to allow +setting and receiving any and all cookies (this is unlikely to be useful). + + +.. _default-cookie-policy-objects: + +DefaultCookiePolicy Objects +--------------------------- + +Implements the standard rules for accepting and returning cookies. + +Both RFC 2965 and Netscape cookies are covered. RFC 2965 handling is switched +off by default. + +The easiest way to provide your own policy is to override this class and call +its methods in your overridden implementations before adding your own additional +checks:: + + import cookielib + class MyCookiePolicy(cookielib.DefaultCookiePolicy): + def set_ok(self, cookie, request): + if not cookielib.DefaultCookiePolicy.set_ok(self, cookie, request): + return False + if i_dont_want_to_store_this_cookie(cookie): + return False + return True + +In addition to the features required to implement the :class:`CookiePolicy` +interface, this class allows you to block and allow domains from setting and +receiving cookies. 
There are also some strictness switches that allow you to +tighten up the rather loose Netscape protocol rules a little bit (at the cost of +blocking some benign cookies). + +A domain blacklist and whitelist is provided (both off by default). Only domains +not in the blacklist and present in the whitelist (if the whitelist is active) +participate in cookie setting and returning. Use the *blocked_domains* +constructor argument, and :meth:`blocked_domains` and +:meth:`set_blocked_domains` methods (and the corresponding argument and methods +for *allowed_domains*). If you set a whitelist, you can turn it off again by +setting it to :const:`None`. + +Domains in block or allow lists that do not start with a dot must equal the +cookie domain to be matched. For example, ``"example.com"`` matches a blacklist +entry of ``"example.com"``, but ``"www.example.com"`` does not. Domains that do +start with a dot are matched by more specific domains too. For example, both +``"www.example.com"`` and ``"www.coyote.example.com"`` match ``".example.com"`` +(but ``"example.com"`` itself does not). IP addresses are an exception, and +must match exactly. For example, if blocked_domains contains ``"192.168.1.2"`` +and ``".168.1.2"``, 192.168.1.2 is blocked, but 193.168.1.2 is not. + +:class:`DefaultCookiePolicy` implements the following additional methods: + + +.. method:: DefaultCookiePolicy.blocked_domains() + + Return the sequence of blocked domains (as a tuple). + + +.. method:: DefaultCookiePolicy.set_blocked_domains(blocked_domains) + + Set the sequence of blocked domains. + + +.. method:: DefaultCookiePolicy.is_blocked(domain) + + Return whether *domain* is on the blacklist for setting or receiving cookies. + + +.. method:: DefaultCookiePolicy.allowed_domains() + + Return :const:`None`, or the sequence of allowed domains (as a tuple). + + +.. method:: DefaultCookiePolicy.set_allowed_domains(allowed_domains) + + Set the sequence of allowed domains, or :const:`None`. + + +.. 
method:: DefaultCookiePolicy.is_not_allowed(domain) + + Return whether *domain* is not on the whitelist for setting or receiving + cookies. + +:class:`DefaultCookiePolicy` instances have the following attributes, which are +all initialised from the constructor arguments of the same name, and which may +all be assigned to. + + +.. attribute:: DefaultCookiePolicy.rfc2109_as_netscape + + If true, request that the :class:`CookieJar` instance downgrade RFC 2109 cookies + (ie. cookies received in a :mailheader:`Set-Cookie` header with a version + cookie-attribute of 1) to Netscape cookies by setting the version attribute of + the :class:`Cookie` instance to 0. The default value is :const:`None`, in which + case RFC 2109 cookies are downgraded if and only if RFC 2965 handling is turned + off. Therefore, RFC 2109 cookies are downgraded by default. + + .. versionadded:: 2.5 + +General strictness switches: + + +.. attribute:: DefaultCookiePolicy.strict_domain + + Don't allow sites to set two-component domains with country-code top-level + domains like ``.co.uk``, ``.gov.uk``, ``.co.nz``, etc. This is far from perfect + and isn't guaranteed to work! + +RFC 2965 protocol strictness switches: + + +.. attribute:: DefaultCookiePolicy.strict_rfc2965_unverifiable + + Follow RFC 2965 rules on unverifiable transactions (usually, an unverifiable + transaction is one resulting from a redirect or a request for an image hosted on + another site). If this is false, cookies are *never* blocked on the basis of + verifiability. + +Netscape protocol strictness switches: + + +.. attribute:: DefaultCookiePolicy.strict_ns_unverifiable + + Apply RFC 2965 rules on unverifiable transactions even to Netscape cookies. + + +.. attribute:: DefaultCookiePolicy.strict_ns_domain + + Flags indicating how strict to be with domain-matching rules for Netscape + cookies. See below for acceptable values. + + +.. 
attribute:: DefaultCookiePolicy.strict_ns_set_initial_dollar + + Ignore cookies in Set-Cookie: headers that have names starting with ``'$'``. + + +.. attribute:: DefaultCookiePolicy.strict_ns_set_path + + Don't allow setting cookies whose path doesn't path-match request URI. + +:attr:`strict_ns_domain` is a collection of flags. Its value is constructed by +or-ing together (for example, ``DomainStrictNoDots|DomainStrictNonDomain`` means +both flags are set). + + +.. attribute:: DefaultCookiePolicy.DomainStrictNoDots + + When setting cookies, the 'host prefix' must not contain a dot (eg. + ``www.foo.bar.com`` can't set a cookie for ``.bar.com``, because ``www.foo`` + contains a dot). + + +.. attribute:: DefaultCookiePolicy.DomainStrictNonDomain + + Cookies that did not explicitly specify a ``domain`` cookie-attribute can only + be returned to a domain equal to the domain that set the cookie (eg. + ``spam.example.com`` won't be returned cookies from ``example.com`` that had no + ``domain`` cookie-attribute). + + +.. attribute:: DefaultCookiePolicy.DomainRFC2965Match + + When setting cookies, require a full RFC 2965 domain-match. + +The following attributes are provided for convenience, and are the most useful +combinations of the above flags: + + +.. attribute:: DefaultCookiePolicy.DomainLiberal + + Equivalent to 0 (ie. all of the above Netscape domain strictness flags switched + off). + + +.. attribute:: DefaultCookiePolicy.DomainStrict + + Equivalent to ``DomainStrictNoDots|DomainStrictNonDomain``. + + +.. _cookielib-cookie-objects: + +Cookie Objects +-------------- + +:class:`Cookie` instances have Python attributes roughly corresponding to the +standard cookie-attributes specified in the various cookie standards. 
The +correspondence is not one-to-one, because there are complicated rules for +assigning default values, because the ``max-age`` and ``expires`` +cookie-attributes contain equivalent information, and because RFC 2109 cookies +may be 'downgraded' by :mod:`cookielib` from version 1 to version 0 (Netscape) +cookies. + +Assignment to these attributes should not be necessary other than in rare +circumstances in a :class:`CookiePolicy` method. The class does not enforce +internal consistency, so you should know what you're doing if you do that. + + +.. attribute:: Cookie.version + + Integer or :const:`None`. Netscape cookies have :attr:`version` 0. RFC 2965 and + RFC 2109 cookies have a ``version`` cookie-attribute of 1. However, note that + :mod:`cookielib` may 'downgrade' RFC 2109 cookies to Netscape cookies, in which + case :attr:`version` is 0. + + +.. attribute:: Cookie.name + + Cookie name (a string). + + +.. attribute:: Cookie.value + + Cookie value (a string), or :const:`None`. + + +.. attribute:: Cookie.port + + String representing a port or a set of ports (eg. '80', or '80,8080'), or + :const:`None`. + + +.. attribute:: Cookie.path + + Cookie path (a string, eg. ``'/acme/rocket_launchers'``). + + +.. attribute:: Cookie.secure + + True if cookie should only be returned over a secure connection. + + +.. attribute:: Cookie.expires + + Integer expiry date in seconds since epoch, or :const:`None`. See also the + :meth:`is_expired` method. + + +.. attribute:: Cookie.discard + + True if this is a session cookie. + + +.. attribute:: Cookie.comment + + String comment from the server explaining the function of this cookie, or + :const:`None`. + + +.. attribute:: Cookie.comment_url + + URL linking to a comment from the server explaining the function of this cookie, + or :const:`None`. + + +.. attribute:: Cookie.rfc2109 + + True if this cookie was received as an RFC 2109 cookie (ie. 
the cookie + arrived in a :mailheader:`Set-Cookie` header, and the value of the Version + cookie-attribute in that header was 1). This attribute is provided because + :mod:`cookielib` may 'downgrade' RFC 2109 cookies to Netscape cookies, in + which case :attr:`version` is 0. + + .. versionadded:: 2.5 + + +.. attribute:: Cookie.port_specified + + True if a port or set of ports was explicitly specified by the server (in the + :mailheader:`Set-Cookie` / :mailheader:`Set-Cookie2` header). + + +.. attribute:: Cookie.domain_specified + + True if a domain was explicitly specified by the server. + + +.. attribute:: Cookie.domain_initial_dot + + True if the domain explicitly specified by the server began with a dot + (``'.'``). + +Cookies may have additional non-standard cookie-attributes. These may be +accessed using the following methods: + + +.. method:: Cookie.has_nonstandard_attr(name) + + Return true if cookie has the named cookie-attribute. + + +.. method:: Cookie.get_nonstandard_attr(name, default=None) + + If cookie has the named cookie-attribute, return its value. Otherwise, return + *default*. + + +.. method:: Cookie.set_nonstandard_attr(name, value) + + Set the value of the named cookie-attribute. + +The :class:`Cookie` class also defines the following method: + + +.. method:: Cookie.is_expired([now=None]) + + True if cookie has passed the time at which the server requested it should + expire. If *now* is given (in seconds since the epoch), return whether the + cookie has expired at the specified time. + + +.. 
_cookielib-examples: + +Examples +-------- + +The first example shows the most common usage of :mod:`cookielib`:: + + import cookielib, urllib2 + cj = cookielib.CookieJar() + opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj)) + r = opener.open("http://example.com/") + +This example illustrates how to open a URL using your Netscape, Mozilla, or Lynx +cookies (assumes Unix/Netscape convention for location of the cookies file):: + + import os, cookielib, urllib2 + cj = cookielib.MozillaCookieJar() + cj.load(os.path.join(os.environ["HOME"], ".netscape/cookies.txt")) + opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj)) + r = opener.open("http://example.com/") + +The next example illustrates the use of :class:`DefaultCookiePolicy`. It turns on +RFC 2965 cookies, is more strict about domains when setting and returning +Netscape cookies, and blocks some domains from setting cookies or having them +returned:: + + import urllib2 + from cookielib import CookieJar, DefaultCookiePolicy + policy = DefaultCookiePolicy( + rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict, + blocked_domains=["ads.net", ".ads.net"]) + cj = CookieJar(policy) + opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj)) + r = opener.open("http://example.com/") + diff --git a/Doc/library/copy.rst b/Doc/library/copy.rst new file mode 100644 index 0000000..6fb3100 --- /dev/null +++ b/Doc/library/copy.rst @@ -0,0 +1,85 @@ + +:mod:`copy` --- Shallow and deep copy operations +================================================ + +.. module:: copy + :synopsis: Shallow and deep copy operations. + + +.. index:: + single: copy() (in copy) + single: deepcopy() (in copy) + +This module provides generic (shallow and deep) copying operations. + +Interface summary:: + + import copy + + x = copy.copy(y) # make a shallow copy of y + x = copy.deepcopy(y) # make a deep copy of y + +For module specific errors, :exc:`copy.error` is raised. + +.. 
% + +The difference between shallow and deep copying is only relevant for compound +objects (objects that contain other objects, like lists or class instances): + +* A *shallow copy* constructs a new compound object and then (to the extent + possible) inserts *references* into it to the objects found in the original. + +* A *deep copy* constructs a new compound object and then, recursively, inserts + *copies* into it of the objects found in the original. + +Two problems often exist with deep copy operations that don't exist with shallow +copy operations: + +* Recursive objects (compound objects that, directly or indirectly, contain a + reference to themselves) may cause a recursive loop. + +* Because deep copy copies *everything* it may copy too much, e.g., + administrative data structures that should be shared even between copies. + +The :func:`deepcopy` function avoids these problems by: + +* keeping a "memo" dictionary of objects already copied during the current + copying pass; and + +* letting user-defined classes override the copying operation or the set of + components copied. + +This module does not copy types like module, method, stack trace, stack frame, +file, socket, window, array, or any similar types. It does "copy" functions and +classes (shallow and deeply), by returning the original object unchanged; this +is compatible with the way these are treated by the :mod:`pickle` module. + +.. versionchanged:: 2.5 + Added copying functions. + +.. index:: module: pickle + +Classes can use the same interfaces to control copying that they use to control +pickling. See the description of module :mod:`pickle` for information on these +methods. The :mod:`copy` module does not use the :mod:`copy_reg` registration +module. + +.. index:: + single: __copy__() (copy protocol) + single: __deepcopy__() (copy protocol) + +In order for a class to define its own copy implementation, it can define +special methods :meth:`__copy__` and :meth:`__deepcopy__`. 
The former is called +to implement the shallow copy operation; no additional arguments are passed. +The latter is called to implement the deep copy operation; it is passed one +argument, the memo dictionary. If the :meth:`__deepcopy__` implementation needs +to make a deep copy of a component, it should call the :func:`deepcopy` function +with the component as first argument and the memo dictionary as second argument. + + +.. seealso:: + + Module :mod:`pickle` + Discussion of the special methods used to support object state retrieval and + restoration. + diff --git a/Doc/library/copy_reg.rst b/Doc/library/copy_reg.rst new file mode 100644 index 0000000..9b82a31 --- /dev/null +++ b/Doc/library/copy_reg.rst @@ -0,0 +1,42 @@ + +:mod:`copy_reg` --- Register :mod:`pickle` support functions +============================================================ + +.. module:: copy_reg + :synopsis: Register pickle support functions. + + +.. index:: + module: pickle + module: cPickle + module: copy + +The :mod:`copy_reg` module provides support for the :mod:`pickle` and +:mod:`cPickle` modules. The :mod:`copy` module is likely to use this in the +future as well. It provides configuration information about object constructors +which are not classes. Such constructors may be factory functions or class +instances. + + +.. function:: constructor(object) + + Declares *object* to be a valid constructor. If *object* is not callable (and + hence not valid as a constructor), raises :exc:`TypeError`. + + +.. function:: pickle(type, function[, constructor]) + + Declares that *function* should be used as a "reduction" function for objects of + type *type*; *type* must not be a "classic" class object. (Classic classes are + handled differently; see the documentation for the :mod:`pickle` module for + details.) *function* should return either a string or a tuple containing two or + three elements. 
+ + The optional *constructor* parameter, if provided, is a callable object which + can be used to reconstruct the object when called with the tuple of arguments + returned by *function* at pickling time. :exc:`TypeError` will be raised if + *object* is a class or *constructor* is not callable. + + See the :mod:`pickle` module for more details on the interface expected of + *function* and *constructor*. + diff --git a/Doc/library/crypt.rst b/Doc/library/crypt.rst new file mode 100644 index 0000000..8840fc7 --- /dev/null +++ b/Doc/library/crypt.rst @@ -0,0 +1,66 @@ + +:mod:`crypt` --- Function to check Unix passwords +================================================= + +.. module:: crypt + :platform: Unix + :synopsis: The crypt() function used to check Unix passwords. +.. moduleauthor:: Steven D. Majewski <sdm7g@virginia.edu> +.. sectionauthor:: Steven D. Majewski <sdm7g@virginia.edu> +.. sectionauthor:: Peter Funk <pf@artcom-gmbh.de> + + +.. index:: + single: crypt(3) + pair: cipher; DES + +This module implements an interface to the :manpage:`crypt(3)` routine, which is +a one-way hash function based upon a modified DES algorithm; see the Unix man +page for further details. Possible uses include allowing Python scripts to +accept typed passwords from the user, or attempting to crack Unix passwords with +a dictionary. + +.. index:: single: crypt(3) + +Notice that the behavior of this module depends on the actual implementation of +the :manpage:`crypt(3)` routine in the running system. Therefore, any +extensions available on the current implementation will also be available on +this module. + + +.. function:: crypt(word, salt) + + *word* will usually be a user's password as typed at a prompt or in a graphical + interface. *salt* is usually a random two-character string which will be used + to perturb the DES algorithm in one of 4096 ways. The characters in *salt* must + be in the set ``[./a-zA-Z0-9]``. 
Returns the hashed password as a string, which + will be composed of characters from the same alphabet as the salt (the first two + characters represent the salt itself). + + .. index:: single: crypt(3) + + Since a few :manpage:`crypt(3)` extensions allow different values, with + different sizes in the *salt*, it is recommended to use the full crypted + password as salt when checking for a password. + +A simple example illustrating typical use:: + + import crypt, getpass, pwd + + def raw_input(prompt): + import sys + sys.stdout.write(prompt) + sys.stdout.flush() + # strip the trailing newline so the name can be looked up + return sys.stdin.readline().rstrip('\n') + + def login(): + username = raw_input('Python login:') + cryptedpasswd = pwd.getpwnam(username)[1] + if cryptedpasswd: + if cryptedpasswd == 'x' or cryptedpasswd == '*': + raise NotImplementedError( + "Sorry, currently no support for shadow passwords") + cleartext = getpass.getpass() + return crypt.crypt(cleartext, cryptedpasswd) == cryptedpasswd + else: + return True + diff --git a/Doc/library/crypto.rst b/Doc/library/crypto.rst new file mode 100644 index 0000000..dce5a01 --- /dev/null +++ b/Doc/library/crypto.rst @@ -0,0 +1,30 @@ + +.. _crypto: + +********************** +Cryptographic Services +********************** + +.. index:: single: cryptography + +The modules described in this chapter implement various algorithms of a +cryptographic nature. They are available at the discretion of the installation. +Here's an overview: + + +.. toctree:: + + hashlib.rst + hmac.rst + +.. index:: + pair: AES; algorithm + single: cryptography + single: Kuchling, Andrew + +Hardcore cypherpunks will probably find the cryptographic modules written by +A.M. Kuchling of further interest; the package contains modules for various +encryption algorithms, most notably AES. These modules are not distributed with +Python but available separately. See the URL +http://www.amk.ca/python/code/crypto.html for more information. 
+ diff --git a/Doc/library/csv.rst b/Doc/library/csv.rst new file mode 100644 index 0000000..19123c6 --- /dev/null +++ b/Doc/library/csv.rst @@ -0,0 +1,530 @@ + +:mod:`csv` --- CSV File Reading and Writing +=========================================== + +.. module:: csv + :synopsis: Write and read tabular data to and from delimited files. +.. sectionauthor:: Skip Montanaro <skip@pobox.com> + + +.. versionadded:: 2.3 + +.. index:: + single: csv + pair: data; tabular + +The so-called CSV (Comma Separated Values) format is the most common import and +export format for spreadsheets and databases. There is no "CSV standard", so +the format is operationally defined by the many applications which read and +write it. The lack of a standard means that subtle differences often exist in +the data produced and consumed by different applications. These differences can +make it annoying to process CSV files from multiple sources. Still, while the +delimiters and quoting characters vary, the overall format is similar enough +that it is possible to write a single module which can efficiently manipulate +such data, hiding the details of reading and writing the data from the +programmer. + +The :mod:`csv` module implements classes to read and write tabular data in CSV +format. It allows programmers to say, "write this data in the format preferred +by Excel," or "read data from this file which was generated by Excel," without +knowing the precise details of the CSV format used by Excel. Programmers can +also describe the CSV formats understood by other applications or define their +own special-purpose CSV formats. + +The :mod:`csv` module's :class:`reader` and :class:`writer` objects read and +write sequences. Programmers can also read and write data in dictionary form +using the :class:`DictReader` and :class:`DictWriter` classes. + +.. note:: + + This version of the :mod:`csv` module doesn't support Unicode input. 
Also, + there are currently some issues regarding ASCII NUL characters. Accordingly, + all input should be UTF-8 or printable ASCII to be safe; see the examples in + section :ref:`csv-examples`. These restrictions will be removed in the future. + + +.. seealso:: + + .. % \seemodule{array}{Arrays of uniformly types numeric values.} + + :pep:`305` - CSV File API + The Python Enhancement Proposal which proposed this addition to Python. + + +.. _csv-contents: + +Module Contents +--------------- + +The :mod:`csv` module defines the following functions: + + +.. function:: reader(csvfile[, dialect='excel'][, fmtparam]) + + Return a reader object which will iterate over lines in the given *csvfile*. + *csvfile* can be any object which supports the iterator protocol and returns a + string each time its :meth:`next` method is called --- file objects and list + objects are both suitable. If *csvfile* is a file object, it must be opened + with the 'b' flag on platforms where that makes a difference. An optional + *dialect* parameter can be given which is used to define a set of parameters + specific to a particular CSV dialect. It may be an instance of a subclass of + the :class:`Dialect` class or one of the strings returned by the + :func:`list_dialects` function. The other optional *fmtparam* keyword arguments + can be given to override individual formatting parameters in the current + dialect. For full details about the dialect and formatting parameters, see + section :ref:`csv-fmt-params`. + + All data read are returned as strings. No automatic data type conversion is + performed. + + .. versionchanged:: 2.5 + The parser is now stricter with respect to multi-line quoted fields. Previously, + if a line ended within a quoted field without a terminating newline character, a + newline would be inserted into the returned field. This behavior caused problems + when reading files which contained carriage return characters within fields. 
+ The behavior was changed to return the field without inserting newlines. As a + consequence, if newlines embedded within fields are important, the input should + be split into lines in a manner which preserves the newline characters. + + +.. function:: writer(csvfile[, dialect='excel'][, fmtparam]) + + Return a writer object responsible for converting the user's data into delimited + strings on the given file-like object. *csvfile* can be any object with a + :func:`write` method. If *csvfile* is a file object, it must be opened with the + 'b' flag on platforms where that makes a difference. An optional *dialect* + parameter can be given which is used to define a set of parameters specific to a + particular CSV dialect. It may be an instance of a subclass of the + :class:`Dialect` class or one of the strings returned by the + :func:`list_dialects` function. The other optional *fmtparam* keyword arguments + can be given to override individual formatting parameters in the current + dialect. For full details about the dialect and formatting parameters, see + section :ref:`csv-fmt-params`. To make it + as easy as possible to interface with modules which implement the DB API, the + value :const:`None` is written as the empty string. While this isn't a + reversible transformation, it makes it easier to dump SQL NULL data values to + CSV files without preprocessing the data returned from a ``cursor.fetch*`` call. + All other non-string data are stringified with :func:`str` before being written. + + +.. function:: register_dialect(name[, dialect][, fmtparam]) + + Associate *dialect* with *name*. *name* must be a string or Unicode object. The + dialect can be specified either by passing a sub-class of :class:`Dialect`, or + by *fmtparam* keyword arguments, or both, with keyword arguments overriding + parameters of the dialect. For full details about the dialect and formatting + parameters, see section :ref:`csv-fmt-params`. + + +.. 
function:: unregister_dialect(name) + + Delete the dialect associated with *name* from the dialect registry. An + :exc:`Error` is raised if *name* is not a registered dialect name. + + +.. function:: get_dialect(name) + + Return the dialect associated with *name*. An :exc:`Error` is raised if *name* + is not a registered dialect name. + + +.. function:: list_dialects() + + Return the names of all registered dialects. + + +.. function:: field_size_limit([new_limit]) + + Returns the current maximum field size allowed by the parser. If *new_limit* is + given, this becomes the new limit. + + .. versionadded:: 2.5 + +The :mod:`csv` module defines the following classes: + + +.. class:: DictReader(csvfile[, fieldnames=None[, restkey=None[, restval=None[, dialect='excel'[, *args, **kwds]]]]]) + + Create an object which operates like a regular reader but maps the information + read into a dict whose keys are given by the optional *fieldnames* parameter. + If the *fieldnames* parameter is omitted, the values in the first row of the + *csvfile* will be used as the fieldnames. If the row read has more fields than + the fieldnames sequence, the remaining data is added as a sequence keyed by the + value of *restkey*. If the row read has fewer fields than the fieldnames + sequence, the remaining keys take the value of the optional *restval* + parameter. Any other optional or keyword arguments are passed to the + underlying :class:`reader` instance. + + +.. class:: DictWriter(csvfile, fieldnames[, restval=''[, extrasaction='raise'[, dialect='excel'[, *args, **kwds]]]]) + + Create an object which operates like a regular writer but maps dictionaries onto + output rows. The *fieldnames* parameter identifies the order in which values in + the dictionary passed to the :meth:`writerow` method are written to the + *csvfile*. 
The optional *restval* parameter specifies the value to be written + if the dictionary is missing a key in *fieldnames*. If the dictionary passed to + the :meth:`writerow` method contains a key not found in *fieldnames*, the + optional *extrasaction* parameter indicates what action to take. If it is set + to ``'raise'`` a :exc:`ValueError` is raised. If it is set to ``'ignore'``, + extra values in the dictionary are ignored. Any other optional or keyword + arguments are passed to the underlying :class:`writer` instance. + + Note that unlike the :class:`DictReader` class, the *fieldnames* parameter of + the :class:`DictWriter` is not optional. Since Python's :class:`dict` objects + are not ordered, there is not enough information available to deduce the order + in which the row should be written to the *csvfile*. + + +.. class:: Dialect + + The :class:`Dialect` class is a container class relied on primarily for its + attributes, which are used to define the parameters for a specific + :class:`reader` or :class:`writer` instance. + + +.. class:: excel() + + The :class:`excel` class defines the usual properties of an Excel-generated CSV + file. It is registered with the dialect name ``'excel'``. + + +.. class:: excel_tab() + + The :class:`excel_tab` class defines the usual properties of an Excel-generated + TAB-delimited file. It is registered with the dialect name ``'excel-tab'``. + + +.. class:: Sniffer() + + The :class:`Sniffer` class is used to deduce the format of a CSV file. + +The :class:`Sniffer` class provides two methods: + + +.. method:: Sniffer.sniff(sample[, delimiters=None]) + + Analyze the given *sample* and return a :class:`Dialect` subclass reflecting the + parameters found. If the optional *delimiters* parameter is given, it is + interpreted as a string containing possible valid delimiter characters. + + +.. 
method:: Sniffer.has_header(sample) + + Analyze the sample text (presumed to be in CSV format) and return :const:`True` + if the first row appears to be a series of column headers. + +The :mod:`csv` module defines the following constants: + + +.. data:: QUOTE_ALL + + Instructs :class:`writer` objects to quote all fields. + + +.. data:: QUOTE_MINIMAL + + Instructs :class:`writer` objects to only quote those fields which contain + special characters such as *delimiter*, *quotechar* or any of the characters in + *lineterminator*. + + +.. data:: QUOTE_NONNUMERIC + + Instructs :class:`writer` objects to quote all non-numeric fields. + + Instructs the reader to convert all non-quoted fields to type *float*. + + +.. data:: QUOTE_NONE + + Instructs :class:`writer` objects to never quote fields. When the current + *delimiter* occurs in output data it is preceded by the current *escapechar* + character. If *escapechar* is not set, the writer will raise :exc:`Error` if + any characters that require escaping are encountered. + + Instructs :class:`reader` to perform no special processing of quote characters. + +The :mod:`csv` module defines the following exception: + + +.. exception:: Error + + Raised by any of the functions when an error is detected. + + +.. _csv-fmt-params: + +Dialects and Formatting Parameters +---------------------------------- + +To make it easier to specify the format of input and output records, specific +formatting parameters are grouped together into dialects. A dialect is a +subclass of the :class:`Dialect` class having a set of specific methods and a +single :meth:`validate` method. When creating :class:`reader` or +:class:`writer` objects, the programmer can specify a string or a subclass of +the :class:`Dialect` class as the dialect parameter. 
In addition to, or instead +of, the *dialect* parameter, the programmer can also specify individual +formatting parameters, which have the same names as the attributes defined below +for the :class:`Dialect` class. + +Dialects support the following attributes: + + +.. attribute:: Dialect.delimiter + + A one-character string used to separate fields. It defaults to ``','``. + + +.. attribute:: Dialect.doublequote + + Controls how instances of *quotechar* appearing inside a field should + themselves be quoted. When :const:`True`, the character is doubled. When + :const:`False`, the *escapechar* is used as a prefix to the *quotechar*. It + defaults to :const:`True`. + + On output, if *doublequote* is :const:`False` and no *escapechar* is set, + :exc:`Error` is raised if a *quotechar* is found in a field. + + +.. attribute:: Dialect.escapechar + + A one-character string used by the writer to escape the *delimiter* if *quoting* + is set to :const:`QUOTE_NONE` and the *quotechar* if *doublequote* is + :const:`False`. On reading, the *escapechar* removes any special meaning from + the following character. It defaults to :const:`None`, which disables escaping. + + +.. attribute:: Dialect.lineterminator + + The string used to terminate lines produced by the :class:`writer`. It defaults + to ``'\r\n'``. + + .. note:: + + The :class:`reader` is hard-coded to recognise either ``'\r'`` or ``'\n'`` as + end-of-line, and ignores *lineterminator*. This behavior may change in the + future. + + +.. attribute:: Dialect.quotechar + + A one-character string used to quote fields containing special characters, such + as the *delimiter* or *quotechar*, or which contain new-line characters. It + defaults to ``'"'``. + + +.. attribute:: Dialect.quoting + + Controls when quotes should be generated by the writer and recognised by the + reader. It can take on any of the :const:`QUOTE_\*` constants (see section + :ref:`csv-contents`) and defaults to :const:`QUOTE_MINIMAL`. + + +.. 
attribute:: Dialect.skipinitialspace + + When :const:`True`, whitespace immediately following the *delimiter* is ignored. + The default is :const:`False`. + + +Reader Objects +-------------- + +Reader objects (:class:`DictReader` instances and objects returned by the +:func:`reader` function) have the following public methods: + + +.. method:: csvreader.next() + + Return the next row of the reader's iterable object as a list, parsed according + to the current dialect. + +Reader objects have the following public attributes: + + +.. attribute:: csvreader.dialect + + A read-only description of the dialect in use by the parser. + + +.. attribute:: csvreader.line_num + + The number of lines read from the source iterator. This is not the same as the + number of records returned, as records can span multiple lines. + + .. versionadded:: 2.5 + + +Writer Objects +-------------- + +:class:`Writer` objects (:class:`DictWriter` instances and objects returned by +the :func:`writer` function) have the following public methods. A *row* must be +a sequence of strings or numbers for :class:`Writer` objects and a dictionary +mapping fieldnames to strings or numbers (by passing them through :func:`str` +first) for :class:`DictWriter` objects. Note that complex numbers are written +out surrounded by parens. This may cause some problems for other programs which +read CSV files (assuming they support complex numbers at all). + + +.. method:: csvwriter.writerow(row) + + Write the *row* parameter to the writer's file object, formatted according to + the current dialect. + + +.. method:: csvwriter.writerows(rows) + + Write all the *rows* parameters (a list of *row* objects as described above) to + the writer's file object, formatted according to the current dialect. + +Writer objects have the following public attribute: + + +.. attribute:: csvwriter.dialect + + A read-only description of the dialect in use by the writer. + + +.. 
_csv-examples: + +Examples +-------- + +The simplest example of reading a CSV file:: + + import csv + reader = csv.reader(open("some.csv", "rb")) + for row in reader: + print row + +Reading a file with an alternate format:: + + import csv + reader = csv.reader(open("passwd", "rb"), delimiter=':', quoting=csv.QUOTE_NONE) + for row in reader: + print row + +The corresponding simplest possible writing example is:: + + import csv + writer = csv.writer(open("some.csv", "wb")) + writer.writerows(someiterable) + +Registering a new dialect:: + + import csv + + csv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE) + + reader = csv.reader(open("passwd", "rb"), 'unixpwd') + +A slightly more advanced use of the reader --- catching and reporting errors:: + + import csv, sys + filename = "some.csv" + reader = csv.reader(open(filename, "rb")) + try: + for row in reader: + print row + except csv.Error as e: + sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e)) + +And while the module doesn't directly support parsing strings, it can easily be +done:: + + import csv + for row in csv.reader(['one,two,three']): + print row + +The :mod:`csv` module doesn't directly support reading and writing Unicode, but +it is 8-bit-clean save for some problems with ASCII NUL characters. So you can +write functions or classes that handle the encoding and decoding for you as long +as you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended. + +:func:`unicode_csv_reader` below is a generator that wraps :class:`csv.reader` +to handle Unicode CSV data (a list of Unicode strings). :func:`utf_8_encoder` +is a generator that encodes the Unicode strings as UTF-8, one string (or row) at +a time. 
The encoded strings are parsed by the CSV reader, and +:func:`unicode_csv_reader` decodes the UTF-8-encoded cells back into Unicode:: + + import csv + + def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs): + # csv.py doesn't do Unicode; encode temporarily as UTF-8: + csv_reader = csv.reader(utf_8_encoder(unicode_csv_data), + dialect=dialect, **kwargs) + for row in csv_reader: + # decode UTF-8 back to Unicode, cell by cell: + yield [unicode(cell, 'utf-8') for cell in row] + + def utf_8_encoder(unicode_csv_data): + for line in unicode_csv_data: + yield line.encode('utf-8') + +For all other encodings the following :class:`UnicodeReader` and +:class:`UnicodeWriter` classes can be used. They take an additional *encoding* +parameter in their constructor and make sure that the data passes the real +reader or writer encoded as UTF-8:: + + import csv, codecs, cStringIO + + class UTF8Recoder: + """ + Iterator that reads an encoded stream and reencodes the input to UTF-8 + """ + def __init__(self, f, encoding): + self.reader = codecs.getreader(encoding)(f) + + def __iter__(self): + return self + + def __next__(self): + return next(self.reader).encode("utf-8") + + class UnicodeReader: + """ + A CSV reader which will iterate over lines in the CSV file "f", + which is encoded in the given encoding. + """ + + def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds): + f = UTF8Recoder(f, encoding) + self.reader = csv.reader(f, dialect=dialect, **kwds) + + def __next__(self): + row = next(self.reader) + return [unicode(s, "utf-8") for s in row] + + def __iter__(self): + return self + + class UnicodeWriter: + """ + A CSV writer which will write rows to CSV file "f", + which is encoded in the given encoding. 
+ """ + + def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds): + # Redirect output to a queue + self.queue = cStringIO.StringIO() + self.writer = csv.writer(self.queue, dialect=dialect, **kwds) + self.stream = f + self.encoder = codecs.getincrementalencoder(encoding)() + + def writerow(self, row): + self.writer.writerow([s.encode("utf-8") for s in row]) + # Fetch UTF-8 output from the queue ... + data = self.queue.getvalue() + data = data.decode("utf-8") + # ... and reencode it into the target encoding + data = self.encoder.encode(data) + # write to the target stream + self.stream.write(data) + # empty queue + self.queue.truncate(0) + + def writerows(self, rows): + for row in rows: + self.writerow(row) + diff --git a/Doc/library/ctypes.rst b/Doc/library/ctypes.rst new file mode 100644 index 0000000..dc37565 --- /dev/null +++ b/Doc/library/ctypes.rst @@ -0,0 +1,2364 @@ + +:mod:`ctypes` --- A foreign function library for Python. +======================================================== + +.. module:: ctypes + :synopsis: A foreign function library for Python. +.. moduleauthor:: Thomas Heller <theller@python.net> + + +.. versionadded:: 2.5 + +``ctypes`` is a foreign function library for Python. It provides C compatible +data types, and allows calling functions in dlls/shared libraries. It can be +used to wrap these libraries in pure Python. + + +.. _ctypes-ctypes-tutorial: + +ctypes tutorial +--------------- + +Note: The code samples in this tutorial use ``doctest`` to make sure that they +actually work. Since some code samples behave differently under Linux, Windows, +or Mac OS X, they contain doctest directives in comments. + +Note: Some code sample references the ctypes :class:`c_int` type. This type is +an alias to the :class:`c_long` type on 32-bit systems. So, you should not be +confused if :class:`c_long` is printed if you would expect :class:`c_int` --- +they are actually the same type. + + +.. 
_ctypes-loading-dynamic-link-libraries: + +Loading dynamic link libraries +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``ctypes`` exports the *cdll*, and on Windows also *windll* and *oledll* objects +to load dynamic link libraries. + +You load libraries by accessing them as attributes of these objects. *cdll* +loads libraries which export functions using the standard ``cdecl`` calling +convention, while *windll* libraries call functions using the ``stdcall`` +calling convention. *oledll* also uses the ``stdcall`` calling convention, and +assumes the functions return a Windows :class:`HRESULT` error code. The error +code is used to automatically raise :class:`WindowsError` Python exceptions when +the function call fails. + +Here are some examples for Windows. Note that ``msvcrt`` is the MS standard C +library containing most standard C functions, and uses the cdecl calling +convention:: + + >>> from ctypes import * + >>> print windll.kernel32 # doctest: +WINDOWS + <WinDLL 'kernel32', handle ... at ...> + >>> print cdll.msvcrt # doctest: +WINDOWS + <CDLL 'msvcrt', handle ... at ...> + >>> libc = cdll.msvcrt # doctest: +WINDOWS + >>> + +Windows appends the usual '.dll' file suffix automatically. + +On Linux, it is required to specify the filename *including* the extension to +load a library, so attribute access does not work. Either the +:meth:`LoadLibrary` method of the dll loaders should be used, or you should load +the library by creating an instance of CDLL by calling the constructor:: + + >>> cdll.LoadLibrary("libc.so.6") # doctest: +LINUX + <CDLL 'libc.so.6', handle ... at ...> + >>> libc = CDLL("libc.so.6") # doctest: +LINUX + >>> libc # doctest: +LINUX + <CDLL 'libc.so.6', handle ... at ...> + >>> + +.. % XXX Add section for Mac OS X. + + +.. 
_ctypes-accessing-functions-from-loaded-dlls: + +Accessing functions from loaded dlls +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Functions are accessed as attributes of dll objects:: + + >>> from ctypes import * + >>> libc.printf + <_FuncPtr object at 0x...> + >>> print windll.kernel32.GetModuleHandleA # doctest: +WINDOWS + <_FuncPtr object at 0x...> + >>> print windll.kernel32.MyOwnFunction # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? + File "ctypes.py", line 239, in __getattr__ + func = _StdcallFuncPtr(name, self) + AttributeError: function 'MyOwnFunction' not found + >>> + +Note that win32 system dlls like ``kernel32`` and ``user32`` often export ANSI +as well as UNICODE versions of a function. The UNICODE version is exported with +a ``W`` appended to the name, while the ANSI version is exported with an ``A`` +appended to the name. The win32 ``GetModuleHandle`` function, which returns a +*module handle* for a given module name, has the following C prototypes, and a +macro is used to expose one of them as ``GetModuleHandle`` depending on whether +UNICODE is defined or not:: + + /* ANSI version */ + HMODULE GetModuleHandleA(LPCSTR lpModuleName); + /* UNICODE version */ + HMODULE GetModuleHandleW(LPCWSTR lpModuleName); + +*windll* does not try to select one of them by magic; you must access the +version you need by specifying ``GetModuleHandleA`` or ``GetModuleHandleW`` +explicitly, and then call it with normal strings or unicode strings +respectively. + +Sometimes, dlls export functions with names which aren't valid Python +identifiers, like ``"??2@YAPAXI@Z"``. In this case you have to use ``getattr`` +to retrieve the function:: + + >>> getattr(cdll.msvcrt, "??2@YAPAXI@Z") # doctest: +WINDOWS + <_FuncPtr object at 0x...> + >>> + +On Windows, some dlls export functions not by name but by ordinal. 
These +functions can be accessed by indexing the dll object with the ordinal number:: + + >>> cdll.kernel32[1] # doctest: +WINDOWS + <_FuncPtr object at 0x...> + >>> cdll.kernel32[0] # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? + File "ctypes.py", line 310, in __getitem__ + func = _StdcallFuncPtr(name, self) + AttributeError: function ordinal 0 not found + >>> + + +.. _ctypes-calling-functions: + +Calling functions +^^^^^^^^^^^^^^^^^ + +You can call these functions like any other Python callable. This example uses +the ``time()`` function, which returns system time in seconds since the Unix +epoch, and the ``GetModuleHandleA()`` function, which returns a win32 module +handle. + +This example calls both functions with a NULL pointer (``None`` should be used +as the NULL pointer):: + + >>> print libc.time(None) # doctest: +SKIP + 1150640792 + >>> print hex(windll.kernel32.GetModuleHandleA(None)) # doctest: +WINDOWS + 0x1d000000 + >>> + +``ctypes`` tries to protect you from calling functions with the wrong number of +arguments or the wrong calling convention. Unfortunately this only works on +Windows. It does this by examining the stack after the function returns, so +although an error is raised the function *has* been called:: + + >>> windll.kernel32.GetModuleHandleA() # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ValueError: Procedure probably called with not enough arguments (4 bytes missing) + >>> windll.kernel32.GetModuleHandleA(0, 0) # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ValueError: Procedure probably called with too many arguments (4 bytes in excess) + >>> + +The same exception is raised when you call an ``stdcall`` function with the +``cdecl`` calling convention, or vice versa:: + + >>> cdll.kernel32.GetModuleHandleA(None) # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? 
+ ValueError: Procedure probably called with not enough arguments (4 bytes missing) + >>> + + >>> windll.msvcrt.printf("spam") # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ValueError: Procedure probably called with too many arguments (4 bytes in excess) + >>> + +To find out the correct calling convention you have to look into the C header +file or the documentation for the function you want to call. + +On Windows, ``ctypes`` uses win32 structured exception handling to prevent +crashes from general protection faults when functions are called with invalid +argument values:: + + >>> windll.kernel32.GetModuleHandleA(32) # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? + WindowsError: exception: access violation reading 0x00000020 + >>> + +There are, however, enough ways to crash Python with ``ctypes``, so you should +be careful anyway. + +``None``, integers, longs, byte strings and unicode strings are the only native +Python objects that can directly be used as parameters in these function calls. +``None`` is passed as a C ``NULL`` pointer, byte strings and unicode strings are +passed as a pointer to the memory block that contains their data (``char *`` or +``wchar_t *``). Python integers and Python longs are passed as the platform's +default C ``int`` type; their value is masked to fit into the C type. + +Before we move on to calling functions with other parameter types, we have to learn +more about ``ctypes`` data types. + + +.. 
_ctypes-fundamental-data-types: + +Fundamental data types +^^^^^^^^^^^^^^^^^^^^^^ + +``ctypes`` defines a number of primitive C compatible data types : + + +----------------------+--------------------------------+----------------------------+ + | ctypes type | C type | Python type | + +======================+================================+============================+ + | :class:`c_char` | ``char`` | 1-character string | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_wchar` | ``wchar_t`` | 1-character unicode string | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_byte` | ``char`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_ubyte` | ``unsigned char`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_short` | ``short`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_ushort` | ``unsigned short`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_int` | ``int`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_uint` | ``unsigned int`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_long` | ``long`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_ulong` | ``unsigned long`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_longlong` | ``__int64`` or ``long long`` | int/long | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_ulonglong` | ``unsigned __int64`` or | int/long | + | 
| ``unsigned long long`` | | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_float` | ``float`` | float | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_double` | ``double`` | float | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_char_p` | ``char *`` (NUL terminated) | string or ``None`` | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_wchar_p` | ``wchar_t *`` (NUL terminated) | unicode or ``None`` | + +----------------------+--------------------------------+----------------------------+ + | :class:`c_void_p` | ``void *`` | int/long or ``None`` | + +----------------------+--------------------------------+----------------------------+ + + +All these types can be created by calling them with an optional initializer of +the correct type and value:: + + >>> c_int() + c_long(0) + >>> c_char_p("Hello, World") + c_char_p('Hello, World') + >>> c_ushort(-3) + c_ushort(65533) + >>> + +Since these types are mutable, their value can also be changed afterwards:: + + >>> i = c_int(42) + >>> print i + c_long(42) + >>> print i.value + 42 + >>> i.value = -99 + >>> print i.value + -99 + >>> + +Assigning a new value to instances of the pointer types :class:`c_char_p`, +:class:`c_wchar_p`, and :class:`c_void_p` changes the *memory location* they +point to, *not the contents* of the memory block (of course not, because Python +strings are immutable):: + + >>> s = "Hello, World" + >>> c_s = c_char_p(s) + >>> print c_s + c_char_p('Hello, World') + >>> c_s.value = "Hi, there" + >>> print c_s + c_char_p('Hi, there') + >>> print s # first string is unchanged + Hello, World + >>> + +You should be careful, however, not to pass them to functions expecting pointers +to mutable memory. 
If you need mutable memory blocks, ctypes has a +``create_string_buffer`` function which creates these in various ways. The +current memory block contents can be accessed (or changed) with the ``raw`` +property; if you want to access it as NUL terminated string, use the ``value`` +property:: + + >>> from ctypes import * + >>> p = create_string_buffer(3) # create a 3 byte buffer, initialized to NUL bytes + >>> print sizeof(p), repr(p.raw) + 3 '\x00\x00\x00' + >>> p = create_string_buffer("Hello") # create a buffer containing a NUL terminated string + >>> print sizeof(p), repr(p.raw) + 6 'Hello\x00' + >>> print repr(p.value) + 'Hello' + >>> p = create_string_buffer("Hello", 10) # create a 10 byte buffer + >>> print sizeof(p), repr(p.raw) + 10 'Hello\x00\x00\x00\x00\x00' + >>> p.value = "Hi" + >>> print sizeof(p), repr(p.raw) + 10 'Hi\x00lo\x00\x00\x00\x00\x00' + >>> + +The ``create_string_buffer`` function replaces the ``c_buffer`` function (which +is still available as an alias), as well as the ``c_string`` function from +earlier ctypes releases. To create a mutable memory block containing unicode +characters of the C type ``wchar_t`` use the ``create_unicode_buffer`` function. + + +.. _ctypes-calling-functions-continued: + +Calling functions, continued +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Note that printf prints to the real standard output channel, *not* to +``sys.stdout``, so these examples will only work at the console prompt, not from +within *IDLE* or *PythonWin*:: + + >>> printf = libc.printf + >>> printf("Hello, %s\n", "World!") + Hello, World! + 14 + >>> printf("Hello, %S", u"World!") + Hello, World! + 13 + >>> printf("%d bottles of beer\n", 42) + 42 bottles of beer + 19 + >>> printf("%f bottles of beer\n", 42.5) + Traceback (most recent call last): + File "<stdin>", line 1, in ? 
+ ArgumentError: argument 2: exceptions.TypeError: Don't know how to convert parameter 2 + >>> + +As has been mentioned before, all Python types except integers, strings, and +unicode strings have to be wrapped in their corresponding ``ctypes`` type, so +that they can be converted to the required C data type:: + + >>> printf("An int %d, a double %f\n", 1234, c_double(3.14)) + An int 1234, a double 3.140000 + 31 + >>> + + +.. _ctypes-calling-functions-with-own-custom-data-types: + +Calling functions with your own custom data types +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +You can also customize ``ctypes`` argument conversion to allow instances of your +own classes to be used as function arguments. ``ctypes`` looks for an +:attr:`_as_parameter_` attribute and uses this as the function argument. Of +course, it must be one of integer, string, or unicode:: + + >>> class Bottles(object): + ... def __init__(self, number): + ... self._as_parameter_ = number + ... + >>> bottles = Bottles(42) + >>> printf("%d bottles of beer\n", bottles) + 42 bottles of beer + 19 + >>> + +If you don't want to store the instance's data in the :attr:`_as_parameter_` +instance variable, you could define a ``property`` which makes the data +available. + + +.. _ctypes-specifying-required-argument-types: + +Specifying the required argument types (function prototypes) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It is possible to specify the required argument types of functions exported from +DLLs by setting the :attr:`argtypes` attribute. 
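A function with a fixed signature shows the idea most directly. The following is a minimal sketch and not part of the original tutorial: it assumes a POSIX system where ``CDLL(None)`` exposes the C library's ``abs`` function, and it is written in Python 3 syntax.

```python
import ctypes

# Assumption: POSIX platform, where CDLL(None) gives access to the symbols
# already loaded into the process, including the C library.
libc = ctypes.CDLL(None)

c_abs = libc.abs                  # C prototype: int abs(int)
c_abs.argtypes = [ctypes.c_int]   # exactly one C int argument

result = c_abs(-5)
print(result)

# An argument that cannot be converted to c_int is rejected by ctypes
# before the C function is called.
try:
    c_abs("five")
except ctypes.ArgumentError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```

Because the check happens on the Python side, a mistyped argument surfaces as an exception instead of being passed to C as garbage.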
+ +:attr:`argtypes` must be a sequence of C data types (the ``printf`` function is +probably not a good example here, because it takes a variable number and +different types of parameters depending on the format string; on the other hand +it is quite handy to experiment with this feature):: + + >>> printf.argtypes = [c_char_p, c_char_p, c_int, c_double] + >>> printf("String '%s', Int %d, Double %f\n", "Hi", 10, 2.2) + String 'Hi', Int 10, Double 2.200000 + 37 + >>> + +Specifying a format protects against incompatible argument types (just as a +prototype for a C function), and tries to convert the arguments to valid types:: + + >>> printf("%d %d %d", 1, 2, 3) + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ArgumentError: argument 2: exceptions.TypeError: wrong type + >>> printf("%s %d %f", "X", 2, 3) + X 2 3.00000012 + 12 + >>> + +If you have defined your own classes which you pass to function calls, you have +to implement a :meth:`from_param` class method for them to be able to use them +in the :attr:`argtypes` sequence. The :meth:`from_param` class method receives +the Python object passed to the function call; it should do a typecheck or +whatever is needed to make sure this object is acceptable, and then return the +object itself, its :attr:`_as_parameter_` attribute, or whatever you want to +pass as the C function argument in this case. Again, the result should be an +integer, string, unicode, a ``ctypes`` instance, or something having the +:attr:`_as_parameter_` attribute. + + +.. _ctypes-return-types: + +Return types +^^^^^^^^^^^^ + +By default functions are assumed to return the C ``int`` type. Other return +types can be specified by setting the :attr:`restype` attribute of the function +object. 
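As a small, portable illustration of :attr:`restype`, the sketch below calls ``Py_GetVersion`` from the interpreter's own C API. It is an assumption-laden example rather than part of the original tutorial: it presumes CPython (where ``ctypes.pythonapi`` is available) and uses Python 3 syntax.

```python
import ctypes
import sys

# Assumption: running on CPython, where ctypes.pythonapi exposes the
# interpreter's C API as an already loaded library.
get_version = ctypes.pythonapi.Py_GetVersion   # const char *Py_GetVersion(void)
get_version.argtypes = []

# With the default restype the returned pointer would be read as a C int.
# Declaring c_char_p makes ctypes return the bytes the pointer refers to.
get_version.restype = ctypes.c_char_p

version = get_version()
print(version)
```

The call returns a ``bytes`` object describing the running interpreter; leaving ``restype`` at its default would yield a truncated integer instead.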
+ +Here is a more advanced example, it uses the ``strchr`` function, which expects +a string pointer and a char, and returns a pointer to a string:: + + >>> strchr = libc.strchr + >>> strchr("abcdef", ord("d")) # doctest: +SKIP + 8059983 + >>> strchr.restype = c_char_p # c_char_p is a pointer to a string + >>> strchr("abcdef", ord("d")) + 'def' + >>> print strchr("abcdef", ord("x")) + None + >>> + +If you want to avoid the ``ord("x")`` calls above, you can set the +:attr:`argtypes` attribute, and the second argument will be converted from a +single character Python string into a C char:: + + >>> strchr.restype = c_char_p + >>> strchr.argtypes = [c_char_p, c_char] + >>> strchr("abcdef", "d") + 'def' + >>> strchr("abcdef", "def") + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ArgumentError: argument 2: exceptions.TypeError: one character string expected + >>> print strchr("abcdef", "x") + None + >>> strchr("abcdef", "d") + 'def' + >>> + +You can also use a callable Python object (a function or a class for example) as +the :attr:`restype` attribute, if the foreign function returns an integer. The +callable will be called with the ``integer`` the C function returns, and the +result of this call will be used as the result of your function call. This is +useful to check for error return values and automatically raise an exception:: + + >>> GetModuleHandle = windll.kernel32.GetModuleHandleA # doctest: +WINDOWS + >>> def ValidHandle(value): + ... if value == 0: + ... raise WinError() + ... return value + ... + >>> + >>> GetModuleHandle.restype = ValidHandle # doctest: +WINDOWS + >>> GetModuleHandle(None) # doctest: +WINDOWS + 486539264 + >>> GetModuleHandle("something silly") # doctest: +WINDOWS + Traceback (most recent call last): + File "<stdin>", line 1, in ? + File "<stdin>", line 3, in ValidHandle + WindowsError: [Errno 126] The specified module could not be found. 
+ >>> + +``WinError`` is a function which calls the Windows ``FormatMessage()`` API to +get the string representation of an error code, and *returns* an exception. +``WinError`` takes an optional error code parameter; if none is given, it calls +:func:`GetLastError` to retrieve it. + +Please note that a much more powerful error checking mechanism is available +through the :attr:`errcheck` attribute; see the reference manual for details. + + +.. _ctypes-passing-pointers: + +Passing pointers (or: passing parameters by reference) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes a C API function expects a *pointer* to a data type as a parameter, +probably to write into the corresponding location, or if the data is too large +to be passed by value. This is also known as *passing parameters by reference*. + +``ctypes`` exports the :func:`byref` function which is used to pass parameters +by reference. The same effect can be achieved with the ``pointer`` function, +although ``pointer`` does a lot more work since it constructs a real pointer +object, so it is faster to use :func:`byref` if you don't need the pointer +object in Python itself:: + + >>> i = c_int() + >>> f = c_float() + >>> s = create_string_buffer('\000' * 32) + >>> print i.value, f.value, repr(s.value) + 0 0.0 '' + >>> libc.sscanf("1 3.14 Hello", "%d %f %s", + ... byref(i), byref(f), s) + 3 + >>> print i.value, f.value, repr(s.value) + 1 3.1400001049 'Hello' + >>> + + +.. _ctypes-structures-unions: + +Structures and unions +^^^^^^^^^^^^^^^^^^^^^ + +Structures and unions must derive from the :class:`Structure` and :class:`Union` +base classes which are defined in the ``ctypes`` module. Each subclass must +define a :attr:`_fields_` attribute. :attr:`_fields_` must be a list of +*2-tuples*, containing a *field name* and a *field type*. + +The field type must be a ``ctypes`` type like :class:`c_int`, or any other +derived ``ctypes`` type: structure, union, array, pointer. 
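Unions are declared the same way as structures. The sketch below is an illustration only: ``FloatBits`` is a hypothetical type invented for this example, the code uses Python 3 syntax, and it assumes the platform stores ``float`` as a 4-byte IEEE 754 value.

```python
from ctypes import Union, c_float, c_uint32, sizeof

class FloatBits(Union):
    # Hypothetical example type: both fields occupy the same 4 bytes.
    _fields_ = [("as_float", c_float),
                ("as_uint", c_uint32)]

u = FloatBits()
u.as_float = 1.0
bits = u.as_uint           # same storage, read back as an unsigned integer
size = sizeof(FloatBits)   # size of the largest member, not the sum
print(hex(bits), size)
```

Writing one field and reading the other reinterprets the shared bytes, which is the behaviour a C ``union`` provides.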
+ +Here is a simple example of a POINT structure, which contains two integers named +``x`` and ``y``, and also shows how to initialize a structure in the +constructor:: + + >>> from ctypes import * + >>> class POINT(Structure): + ... _fields_ = [("x", c_int), + ... ("y", c_int)] + ... + >>> point = POINT(10, 20) + >>> print point.x, point.y + 10 20 + >>> point = POINT(y=5) + >>> print point.x, point.y + 0 5 + >>> POINT(1, 2, 3) + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ValueError: too many initializers + >>> + +You can, however, build much more complicated structures. A structure can itself +contain other structures by using a structure as a field type. + +Here is a RECT structure which contains two POINTs named ``upperleft`` and +``lowerright`` :: + + >>> class RECT(Structure): + ... _fields_ = [("upperleft", POINT), + ... ("lowerright", POINT)] + ... + >>> rc = RECT(point) + >>> print rc.upperleft.x, rc.upperleft.y + 0 5 + >>> print rc.lowerright.x, rc.lowerright.y + 0 0 + >>> + +Nested structures can also be initialized in the constructor in several ways:: + + >>> r = RECT(POINT(1, 2), POINT(3, 4)) + >>> r = RECT((1, 2), (3, 4)) + +Field descriptors can be retrieved from the *class*; they are useful for +debugging because they can provide useful information:: + + >>> print POINT.x + <Field type=c_long, ofs=0, size=4> + >>> print POINT.y + <Field type=c_long, ofs=4, size=4> + >>> + + +.. _ctypes-structureunion-alignment-byte-order: + +Structure/union alignment and byte order +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +By default, Structure and Union fields are aligned in the same way the C +compiler does it. It is possible to override this behaviour by specifying a +:attr:`_pack_` class attribute in the subclass definition. This must be set to a +positive integer and specifies the maximum alignment for the fields. This is +what ``#pragma pack(n)`` also does in MSVC. + +``ctypes`` uses the native byte order for Structures and Unions. 
To build +structures with non-native byte order, you can use one of the +BigEndianStructure, LittleEndianStructure, BigEndianUnion, and LittleEndianUnion +base classes. These classes cannot contain pointer fields. + + +.. _ctypes-bit-fields-in-structures-unions: + +Bit fields in structures and unions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It is possible to create structures and unions containing bit fields. Bit fields +are only possible for integer fields; the bit width is specified as the third +item in the :attr:`_fields_` tuples:: + + >>> class Int(Structure): + ... _fields_ = [("first_16", c_int, 16), + ... ("second_16", c_int, 16)] + ... + >>> print Int.first_16 + <Field type=c_long, ofs=0:0, bits=16> + >>> print Int.second_16 + <Field type=c_long, ofs=0:16, bits=16> + >>> + + +.. _ctypes-arrays: + +Arrays +^^^^^^ + +Arrays are sequences, containing a fixed number of instances of the same type. + +The recommended way to create array types is by multiplying a data type with a +positive integer:: + + TenPointsArrayType = POINT * 10 + +Here is an example of a somewhat artificial data type, a structure containing 4 +POINTs among other stuff:: + + >>> from ctypes import * + >>> class POINT(Structure): + ... _fields_ = ("x", c_int), ("y", c_int) + ... + >>> class MyStruct(Structure): + ... _fields_ = [("a", c_int), + ... ("b", c_float), + ... ("point_array", POINT * 4)] + >>> + >>> print len(MyStruct().point_array) + 4 + >>> + +Instances are created in the usual way, by calling the class:: + + arr = TenPointsArrayType() + for pt in arr: + print pt.x, pt.y + +The above code prints a series of ``0 0`` lines, because the array contents are +initialized to zeros. + +Initializers of the correct type can also be specified:: + + >>> from ctypes import * + >>> TenIntegers = c_int * 10 + >>> ii = TenIntegers(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) + >>> print ii + <c_long_Array_10 object at 0x...> + >>> for i in ii: print i, + ... + 1 2 3 4 5 6 7 8 9 10 + >>> + + +.. 
_ctypes-pointers: + +Pointers +^^^^^^^^ + +Pointer instances are created by calling the ``pointer`` function on a +``ctypes`` type:: + + >>> from ctypes import * + >>> i = c_int(42) + >>> pi = pointer(i) + >>> + +Pointer instances have a ``contents`` attribute which returns the object to +which the pointer points, the ``i`` object above:: + + >>> pi.contents + c_long(42) + >>> + +Note that ``ctypes`` does not have OOR (original object return), it constructs a +new, equivalent object each time you retrieve an attribute:: + + >>> pi.contents is i + False + >>> pi.contents is pi.contents + False + >>> + +Assigning another :class:`c_int` instance to the pointer's contents attribute +would cause the pointer to point to the memory location where this is stored:: + + >>> i = c_int(99) + >>> pi.contents = i + >>> pi.contents + c_long(99) + >>> + +Pointer instances can also be indexed with integers:: + + >>> pi[0] + 99 + >>> + +Assigning to an integer index changes the pointed to value:: + + >>> print i + c_long(99) + >>> pi[0] = 22 + >>> print i + c_long(22) + >>> + +It is also possible to use indexes different from 0, but you must know what +you're doing, just as in C: You can access or change arbitrary memory locations. +Generally you only use this feature if you receive a pointer from a C function, +and you *know* that the pointer actually points to an array instead of a single +item. + +Behind the scenes, the ``pointer`` function does more than simply create pointer +instances, it has to create pointer *types* first. This is done with the +``POINTER`` function, which accepts any ``ctypes`` type, and returns a new +type:: + + >>> PI = POINTER(c_int) + >>> PI + <class 'ctypes.LP_c_long'> + >>> PI(42) + Traceback (most recent call last): + File "<stdin>", line 1, in ? + TypeError: expected c_long instead of int + >>> PI(c_int(42)) + <ctypes.LP_c_long object at 0x...> + >>> + +Calling the pointer type without an argument creates a ``NULL`` pointer. 
+``NULL`` pointers have a ``False`` boolean value:: + + >>> null_ptr = POINTER(c_int)() + >>> print bool(null_ptr) + False + >>> + +``ctypes`` checks for ``NULL`` when dereferencing pointers (but dereferencing +invalid non-\ ``NULL`` pointers would crash Python):: + + >>> null_ptr[0] + Traceback (most recent call last): + .... + ValueError: NULL pointer access + >>> + + >>> null_ptr[0] = 1234 + Traceback (most recent call last): + .... + ValueError: NULL pointer access + >>> + + +.. _ctypes-type-conversions: + +Type conversions +^^^^^^^^^^^^^^^^ + +Usually, ctypes does strict type checking. This means that if you have +``POINTER(c_int)`` in the :attr:`argtypes` list of a function or as the type of +a member field in a structure definition, only instances of exactly the same +type are accepted. There are some exceptions to this rule, where ctypes accepts +other objects. For example, you can pass compatible array instances instead of +pointer types. So, for ``POINTER(c_int)``, ctypes accepts an array of c_int:: + + >>> class Bar(Structure): + ... _fields_ = [("count", c_int), ("values", POINTER(c_int))] + ... + >>> bar = Bar() + >>> bar.values = (c_int * 3)(1, 2, 3) + >>> bar.count = 3 + >>> for i in range(bar.count): + ... print bar.values[i] + ... + 1 + 2 + 3 + >>> + +To set a POINTER type field to ``NULL``, you can assign ``None``:: + + >>> bar.values = None + >>> + +XXX list other conversions... + +Sometimes you have instances of incompatible types. In ``C``, you can cast one +type into another type. ``ctypes`` provides a ``cast`` function which can be +used in the same way. The ``Bar`` structure defined above accepts +``POINTER(c_int)`` pointers or :class:`c_int` arrays for its ``values`` field, +but not instances of other types:: + + >>> bar.values = (c_byte * 4)() + Traceback (most recent call last): + File "<stdin>", line 1, in ?
+ TypeError: incompatible types, c_byte_Array_4 instance instead of LP_c_long instance + >>> + +For these cases, the ``cast`` function is handy. + +The ``cast`` function can be used to cast a ctypes instance into a pointer to a +different ctypes data type. ``cast`` takes two parameters, a ctypes object that +is or can be converted to a pointer of some kind, and a ctypes pointer type. It +returns an instance of the second argument, which references the same memory +block as the first argument:: + + >>> a = (c_byte * 4)() + >>> cast(a, POINTER(c_int)) + <ctypes.LP_c_long object at ...> + >>> + +So, ``cast`` can be used to assign to the ``values`` field of the ``Bar`` +structure:: + + >>> bar = Bar() + >>> bar.values = cast((c_byte * 4)(), POINTER(c_int)) + >>> print bar.values[0] + 0 + >>> + + +.. _ctypes-incomplete-types: + +Incomplete Types +^^^^^^^^^^^^^^^^ + +*Incomplete Types* are structures, unions or arrays whose members are not yet +specified. In C, they are specified by forward declarations, which are defined +later:: + + struct cell; /* forward declaration */ + + struct cell { + char *name; + struct cell *next; + }; + +The straightforward translation into ctypes code would be this, but it does not +work:: + + >>> class cell(Structure): + ... _fields_ = [("name", c_char_p), + ... ("next", POINTER(cell))] + ... + Traceback (most recent call last): + File "<stdin>", line 1, in ? + File "<stdin>", line 2, in cell + NameError: name 'cell' is not defined + >>> + +because the new ``class cell`` is not available in the class statement itself. +In ``ctypes``, we can define the ``cell`` class and set the :attr:`_fields_` +attribute later, after the class statement:: + + >>> from ctypes import * + >>> class cell(Structure): + ... pass + ... + >>> cell._fields_ = [("name", c_char_p), + ... ("next", POINTER(cell))] + >>> + +Let's try it.
We create two instances of ``cell``, and let them point to each +other, and finally follow the pointer chain a few times:: + + >>> c1 = cell() + >>> c1.name = "foo" + >>> c2 = cell() + >>> c2.name = "bar" + >>> c1.next = pointer(c2) + >>> c2.next = pointer(c1) + >>> p = c1 + >>> for i in range(8): + ... print p.name, + ... p = p.next[0] + ... + foo bar foo bar foo bar foo bar + >>> + + +.. _ctypes-callback-functions: + +Callback functions +^^^^^^^^^^^^^^^^^^ + +``ctypes`` allows creating C callable function pointers from Python callables. +These are sometimes called *callback functions*. + +First, you must create a class for the callback function; the class knows the +calling convention, the return type, and the number and types of arguments this +function will receive. + +The CFUNCTYPE factory function creates types for callback functions using the +normal cdecl calling convention, and, on Windows, the WINFUNCTYPE factory +function creates types for callback functions using the stdcall calling +convention. + +Both of these factory functions are called with the result type as first +argument, and the callback function's expected argument types as the remaining +arguments. + +I will present an example here which uses the standard C library's :func:`qsort` +function, which is used to sort items with the help of a callback function. +:func:`qsort` will be used to sort an array of integers:: + + >>> IntArray5 = c_int * 5 + >>> ia = IntArray5(5, 1, 7, 33, 99) + >>> qsort = libc.qsort + >>> qsort.restype = None + >>> + +:func:`qsort` must be called with a pointer to the data to sort, the number of +items in the data array, the size of one item, and a pointer to the comparison +function, the callback. The callback will then be called with two pointers to +items, and it must return a negative integer if the first item is smaller than +the second, a zero if they are equal, and a positive integer otherwise.
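The :func:`qsort` contract described above can be condensed into one self-contained sketch. It is not part of the original walk-through: it is written in Python 3 syntax, assumes a POSIX system where the C library can be loaded through ``ctypes.util.find_library("c")``, and sets an explicit ``argtypes`` list as an illustrative choice. The comparison avoids subtraction so that it cannot overflow a C ``int``:

```python
import ctypes
import ctypes.util

# Assumption: POSIX. If find_library() returns None, CDLL(None) still
# exposes the symbols already loaded into the process, including qsort.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Callback type: returns int, receives two pointers to int.
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def py_cmp_func(a, b):
    # a and b point at the two items qsort wants compared.
    x, y = a[0], b[0]
    # Negative, zero, or positive, as qsort requires.
    return (x > y) - (x < y)

# Keep a reference to the callback object for as long as C may call it.
cmp_func = CMPFUNC(py_cmp_func)

qsort = libc.qsort
qsort.restype = None
qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]

ia = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
qsort(ia, len(ia), ctypes.sizeof(ctypes.c_int), cmp_func)

sorted_values = list(ia)
print(sorted_values)
```

The step-by-step version that follows builds the same callback incrementally and prints each comparison as it happens.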
+ +So our callback function receives pointers to integers, and must return an +integer. First we create the ``type`` for the callback function:: + + >>> CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int)) + >>> + +For the first implementation of the callback function, we simply print the +arguments we get, and return 0 (incremental development ;-):: + + >>> def py_cmp_func(a, b): + ... print "py_cmp_func", a, b + ... return 0 + ... + >>> + +Create the C callable callback:: + + >>> cmp_func = CMPFUNC(py_cmp_func) + >>> + +And we're ready to go:: + + >>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> + >>> + +We know how to access the contents of a pointer, so lets redefine our callback:: + + >>> def py_cmp_func(a, b): + ... print "py_cmp_func", a[0], b[0] + ... return 0 + ... 
+ >>> cmp_func = CMPFUNC(py_cmp_func) + >>> + +Here is what we get on Windows:: + + >>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS + py_cmp_func 7 1 + py_cmp_func 33 1 + py_cmp_func 99 1 + py_cmp_func 5 1 + py_cmp_func 7 5 + py_cmp_func 33 5 + py_cmp_func 99 5 + py_cmp_func 7 99 + py_cmp_func 33 99 + py_cmp_func 7 33 + >>> + +It is funny to see that on Linux the sort function seems to work much more +efficiently; it makes fewer comparisons:: + + >>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +LINUX + py_cmp_func 5 1 + py_cmp_func 33 99 + py_cmp_func 7 33 + py_cmp_func 5 7 + py_cmp_func 1 7 + >>> + +Ah, we're nearly done! The last step is to actually compare the two items and +return a useful result:: + + >>> def py_cmp_func(a, b): + ... print "py_cmp_func", a[0], b[0] + ... return a[0] - b[0] + ... + >>> + +Final run on Windows:: + + >>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +WINDOWS + py_cmp_func 33 7 + py_cmp_func 99 33 + py_cmp_func 5 99 + py_cmp_func 1 99 + py_cmp_func 33 7 + py_cmp_func 1 33 + py_cmp_func 5 33 + py_cmp_func 5 7 + py_cmp_func 1 7 + py_cmp_func 5 1 + >>> + +and on Linux:: + + >>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +LINUX + py_cmp_func 5 1 + py_cmp_func 33 99 + py_cmp_func 7 33 + py_cmp_func 1 7 + py_cmp_func 5 7 + >>> + +It is quite interesting to see that the Windows :func:`qsort` function needs +more comparisons than the Linux version! + +As we can easily check, our array is sorted now:: + + >>> for i in ia: print i, + ... + 1 5 7 33 99 + >>> + +**Important note for callback functions:** + +Make sure you keep references to CFUNCTYPE objects as long as they are used from +C code. ``ctypes`` doesn't, and if you don't, they may be garbage collected, +crashing your program when a callback is made. + + +..
_ctypes-accessing-values-exported-from-dlls: + +Accessing values exported from dlls +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes, a dll not only exports functions, it also exports variables. An +example in the Python library itself is the ``Py_OptimizeFlag``, an integer set +to 0, 1, or 2, depending on the :option:`-O` or :option:`-OO` flag given on +startup. + +``ctypes`` can access values like this with the :meth:`in_dll` class method of +the type. *pythonapi* is a predefined symbol giving access to the Python C +api:: + + >>> opt_flag = c_int.in_dll(pythonapi, "Py_OptimizeFlag") + >>> print opt_flag + c_long(0) + >>> + +If the interpreter had been started with :option:`-O`, the sample would +have printed ``c_long(1)``, or ``c_long(2)`` if :option:`-OO` had been +specified. + +An extended example which also demonstrates the use of pointers accesses the +``PyImport_FrozenModules`` pointer exported by Python. + +Quoting the Python docs: *This pointer is initialized to point to an array of +"struct _frozen" records, terminated by one whose members are all NULL or zero. +When a frozen module is imported, it is searched in this table. Third-party code +could play tricks with this to provide a dynamically created collection of +frozen modules.* + +So manipulating this pointer could even prove useful. To restrict the example +size, we show only how this table can be read with ``ctypes``:: + + >>> from ctypes import * + >>> + >>> class struct_frozen(Structure): + ... _fields_ = [("name", c_char_p), + ... ("code", POINTER(c_ubyte)), + ... ("size", c_int)] + ...
+ >>> + +We have defined the ``struct _frozen`` data type, so we can get the pointer to +the table:: + + >>> FrozenTable = POINTER(struct_frozen) + >>> table = FrozenTable.in_dll(pythonapi, "PyImport_FrozenModules") + >>> + +Since ``table`` is a ``pointer`` to the array of ``struct_frozen`` records, we +can iterate over it, but we just have to make sure that our loop terminates, +because pointers have no size. Sooner or later it would probably crash with an +access violation or whatever, so it's better to break out of the loop when we +hit the NULL entry:: + + >>> for item in table: + ... print item.name, item.size + ... if item.name is None: + ... break + ... + __hello__ 104 + __phello__ -104 + __phello__.spam 104 + None 0 + >>> + +The fact that standard Python has a frozen module and a frozen package +(indicated by the negative size member) is not well known; it is only used for +testing. Try it out with ``import __hello__`` for example. + + +.. _ctypes-surprises: + +Surprises +^^^^^^^^^ + +There are some edges in ``ctypes`` where you might expect something other than +what actually happens. + +Consider the following example:: + + >>> from ctypes import * + >>> class POINT(Structure): + ... _fields_ = ("x", c_int), ("y", c_int) + ... + >>> class RECT(Structure): + ... _fields_ = ("a", POINT), ("b", POINT) + ... + >>> p1 = POINT(1, 2) + >>> p2 = POINT(3, 4) + >>> rc = RECT(p1, p2) + >>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y + 1 2 3 4 + >>> # now swap the two points + >>> rc.a, rc.b = rc.b, rc.a + >>> print rc.a.x, rc.a.y, rc.b.x, rc.b.y + 3 4 3 4 + >>> + +Hm. We certainly expected the last statement to print ``3 4 1 2``. What +happened? Here are the steps of the ``rc.a, rc.b = rc.b, rc.a`` line above:: + + >>> temp0, temp1 = rc.b, rc.a + >>> rc.a = temp0 + >>> rc.b = temp1 + >>> + +Note that ``temp0`` and ``temp1`` are objects still using the internal buffer of +the ``rc`` object above.
So executing ``rc.a = temp0`` copies the buffer +contents of ``temp0`` into ``rc`` 's buffer. This, in turn, changes the +contents of ``temp1``. So, the last assignment, ``rc.b = temp1``, doesn't have +the expected effect. + +Keep in mind that retrieving subobjects from Structures, Unions, and Arrays +doesn't *copy* the subobject; instead it retrieves a wrapper object accessing +the root-object's underlying buffer. + +Another example that may behave differently from what one would expect is this:: + + >>> s = c_char_p() + >>> s.value = "abc def ghi" + >>> s.value + 'abc def ghi' + >>> s.value is s.value + False + >>> + +Why is it printing ``False``? ctypes instances are objects containing a memory +block plus some descriptors accessing the contents of the memory. Storing a +Python object in the memory block does not store the object itself; instead the +``contents`` of the object is stored. Accessing the contents again constructs a +new Python object each time! + + +.. _ctypes-variable-sized-data-types: + +Variable-sized data types +^^^^^^^^^^^^^^^^^^^^^^^^^ + +``ctypes`` provides some support for variable-sized arrays and structures (this +was added in version 0.9.9.7). + +The ``resize`` function can be used to resize the memory buffer of an existing +ctypes object. The function takes the object as first argument, and the +requested size in bytes as the second argument. The memory block cannot be made +smaller than the natural memory block specified by the object's type; a +``ValueError`` is raised if this is tried:: + + >>> short_array = (c_short * 4)() + >>> print sizeof(short_array) + 8 + >>> resize(short_array, 4) + Traceback (most recent call last): + ... + ValueError: minimum size is 8 + >>> resize(short_array, 32) + >>> sizeof(short_array) + 32 + >>> sizeof(type(short_array)) + 8 + >>> + +This is nice and fine, but how would one access the additional elements +contained in this array?
Since the type still only knows about 4 elements, we +get errors accessing other elements:: + + >>> short_array[:] + [0, 0, 0, 0] + >>> short_array[7] + Traceback (most recent call last): + ... + IndexError: invalid index + >>> + +Another way to use variable-sized data types with ``ctypes`` is to use the +dynamic nature of Python, and (re-)define the data type after the required size +is already known, on a case by case basis. + + +.. _ctypes-bugs-todo-non-implemented-things: + +Bugs, ToDo and non-implemented things +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Enumeration types are not implemented. You can do it easily yourself, using +:class:`c_int` as the base class. + +``long double`` is not implemented. + +.. % Local Variables: +.. % compile-command: "make.bat" +.. % End: + + +.. _ctypes-ctypes-reference: + +ctypes reference +---------------- + + +.. _ctypes-finding-shared-libraries: + +Finding shared libraries +^^^^^^^^^^^^^^^^^^^^^^^^ + +When programming in a compiled language, shared libraries are accessed when +compiling/linking a program, and when the program is run. + +The purpose of the ``find_library`` function is to locate a library in a way +similar to what the compiler does (on platforms with several versions of a +shared library the most recent should be loaded), while the ctypes library +loaders act like a program does when it is run, and call the runtime loader directly. + +The ``ctypes.util`` module provides a function which can help to determine the +library to load. + + +.. data:: find_library(name) + :noindex: + + Try to find a library and return a pathname. *name* is the library name without + any prefix like *lib*, suffix like ``.so``, ``.dylib`` or version number (this + is the form used for the posix linker option :option:`-l`). If no library can + be found, returns ``None``. + +The exact functionality is system dependent. + +On Linux, ``find_library`` tries to run external programs (/sbin/ldconfig, gcc, +and objdump) to find the library file.
It returns the filename of the library +file. Here are some examples:: + + >>> from ctypes.util import find_library + >>> find_library("m") + 'libm.so.6' + >>> find_library("c") + 'libc.so.6' + >>> find_library("bz2") + 'libbz2.so.1.0' + >>> + +On OS X, ``find_library`` tries several predefined naming schemes and paths to +locate the library, and returns a full pathname if successful:: + + >>> from ctypes.util import find_library + >>> find_library("c") + '/usr/lib/libc.dylib' + >>> find_library("m") + '/usr/lib/libm.dylib' + >>> find_library("bz2") + '/usr/lib/libbz2.dylib' + >>> find_library("AGL") + '/System/Library/Frameworks/AGL.framework/AGL' + >>> + +On Windows, ``find_library`` searches along the system search path, and returns +the full pathname, but since there is no predefined naming scheme a call like +``find_library("c")`` will fail and return ``None``. + +If wrapping a shared library with ``ctypes``, it *may* be better to determine +the shared library name at development time, and hardcode that into the wrapper +module instead of using ``find_library`` to locate the library at runtime. + + +.. _ctypes-loading-shared-libraries: + +Loading shared libraries +^^^^^^^^^^^^^^^^^^^^^^^^ + +There are several ways to load shared libraries into the Python process. One +way is to instantiate one of the following classes: + + +.. class:: CDLL(name, mode=DEFAULT_MODE, handle=None) + + Instances of this class represent loaded shared libraries. Functions in these + libraries use the standard C calling convention, and are assumed to return + ``int``. + + +.. class:: OleDLL(name, mode=DEFAULT_MODE, handle=None) + + Windows only: Instances of this class represent loaded shared libraries, + functions in these libraries use the ``stdcall`` calling convention, and are + assumed to return the Windows-specific :class:`HRESULT` code.
:class:`HRESULT` + values contain information specifying whether the function call failed or + succeeded, together with an additional error code. If the return value signals a + failure, a :class:`WindowsError` is automatically raised. + + +.. class:: WinDLL(name, mode=DEFAULT_MODE, handle=None) + + Windows only: Instances of this class represent loaded shared libraries, + functions in these libraries use the ``stdcall`` calling convention, and are + assumed to return ``int`` by default. + + On Windows CE only the standard calling convention is used; for convenience, the + :class:`WinDLL` and :class:`OleDLL` use the standard calling convention on this + platform. + +The Python GIL is released before calling any function exported by these +libraries, and reacquired afterwards. + + +.. class:: PyDLL(name, mode=DEFAULT_MODE, handle=None) + + Instances of this class behave like :class:`CDLL` instances, except that the + Python GIL is *not* released during the function call, and after the function + execution the Python error flag is checked. If the error flag is set, a Python + exception is raised. + + Thus, this is only useful to call Python C api functions directly. + +All these classes can be instantiated by calling them with at least one +argument, the pathname of the shared library. If you have an existing handle to +an already loaded shared library, it can be passed as the ``handle`` named +parameter; otherwise the underlying platform's ``dlopen`` or :meth:`LoadLibrary` +function is used to load the library into the process, and to get a handle to +it. + +The *mode* parameter can be used to specify how the library is loaded. For +details, consult the ``dlopen(3)`` manpage; on Windows, *mode* is ignored. + + +.. data:: RTLD_GLOBAL + :noindex: + + Flag to use as *mode* parameter. On platforms where this flag is not available, + it is defined as the integer zero. + + +.. data:: RTLD_LOCAL + :noindex: + + Flag to use as *mode* parameter.
On platforms where this is not available, it + is the same as *RTLD_GLOBAL*. + + +.. data:: DEFAULT_MODE + :noindex: + + The default mode which is used to load shared libraries. On OSX 10.3, this is + *RTLD_GLOBAL*, otherwise it is the same as *RTLD_LOCAL*. + +Instances of these classes have no public methods, however :meth:`__getattr__` +and :meth:`__getitem__` have special behaviour: functions exported by the shared +library can be accessed as attributes or by index. Please note that both +:meth:`__getattr__` and :meth:`__getitem__` cache their result, so calling them +repeatedly returns the same object each time. + +The following public attributes are available; their names start with an +underscore to not clash with exported function names: + + +.. attribute:: PyDLL._handle + + The system handle used to access the library. + + +.. attribute:: PyDLL._name + + The name of the library passed in the constructor. + +Shared libraries can also be loaded by using one of the prefabricated objects, +which are instances of the :class:`LibraryLoader` class, either by calling the +:meth:`LoadLibrary` method, or by retrieving the library as an attribute of the +loader instance. + + +.. class:: LibraryLoader(dlltype) + + Class which loads shared libraries. ``dlltype`` should be one of the + :class:`CDLL`, :class:`PyDLL`, :class:`WinDLL`, or :class:`OleDLL` types. + + :meth:`__getattr__` has special behaviour: It allows loading a shared library by + accessing it as an attribute of a library loader instance. The result is cached, + so repeated attribute accesses return the same library each time. + + +.. method:: LibraryLoader.LoadLibrary(name) + + Load a shared library into the process and return it. This method always + returns a new instance of the library. + +These prefabricated library loaders are available: + + +.. data:: cdll + :noindex: + + Creates :class:`CDLL` instances. + + +.. data:: windll + :noindex: + + Windows only: Creates :class:`WinDLL` instances. + + +..
data:: oledll + :noindex: + + Windows only: Creates :class:`OleDLL` instances. + + +.. data:: pydll + :noindex: + + Creates :class:`PyDLL` instances. + +For accessing the C Python api directly, a ready-to-use Python shared library +object is available: + + +.. data:: pythonapi + :noindex: + + An instance of :class:`PyDLL` that exposes Python C api functions as attributes. + Note that all these functions are assumed to return C ``int``, which is of + course not always true, so you have to assign the correct :attr:`restype` + attribute to use these functions. + + +.. _ctypes-foreign-functions: + +Foreign functions +^^^^^^^^^^^^^^^^^ + +As explained in the previous section, foreign functions can be accessed as +attributes of loaded shared libraries. The function objects created in this way +by default accept any number of arguments, accept any ctypes data instances as +arguments, and return the default result type specified by the library loader. +They are instances of a private class: + + +.. class:: _FuncPtr + + Base class for C callable foreign functions. + +Instances of foreign functions are also C compatible data types; they represent +C function pointers. + +This behaviour can be customized by assigning to special attributes of the +foreign function object. + + +.. attribute:: _FuncPtr.restype + + Assign a ctypes type to specify the result type of the foreign function. Use + ``None`` for ``void``, a function not returning anything. + + It is possible to assign a callable Python object that is not a ctypes type; in + this case the function is assumed to return a C ``int``, and the callable will + be called with this integer, allowing further processing or error + checking. Using this is deprecated; for more flexible postprocessing or error + checking use a ctypes data type as :attr:`restype` and assign a callable to the + :attr:`errcheck` attribute. + + +..
attribute:: _FuncPtr.argtypes + + Assign a tuple of ctypes types to specify the argument types that the function + accepts. Functions using the ``stdcall`` calling convention can only be called + with the same number of arguments as the length of this tuple; functions using + the C calling convention accept additional, unspecified arguments as well. + + When a foreign function is called, each actual argument is passed to the + :meth:`from_param` class method of the items in the :attr:`argtypes` tuple; this + method allows adapting the actual argument to an object that the foreign + function accepts. For example, a :class:`c_char_p` item in the :attr:`argtypes` + tuple will convert a unicode string passed as argument into a byte string using + ctypes conversion rules. + + New: It is now possible to put items in argtypes which are not ctypes types, but + each item must have a :meth:`from_param` method which returns a value usable as + argument (integer, string, ctypes instance). This allows defining adapters + that can adapt custom objects as function parameters. + + +.. attribute:: _FuncPtr.errcheck + + Assign a Python function or another callable to this attribute. The callable + will be called with three or more arguments: + + +.. function:: callable(result, func, arguments) + :noindex: + + ``result`` is what the foreign function returns, as specified by the + :attr:`restype` attribute. + + ``func`` is the foreign function object itself; this allows reusing the same + callable object to check or postprocess the results of several functions. + + ``arguments`` is a tuple containing the parameters originally passed to the + function call; this allows specializing the behaviour on the arguments used. + + The object that this function returns will be returned from the foreign function + call, but it can also check the result value and raise an exception if the + foreign function call failed. + + +..
exception:: ArgumentError() + + This exception is raised when a foreign function call cannot convert one of the + passed arguments. + + +.. _ctypes-function-prototypes: + +Function prototypes +^^^^^^^^^^^^^^^^^^^ + +Foreign functions can also be created by instantiating function prototypes. +Function prototypes are similar to function prototypes in C; they describe a +function (return type, argument types, calling convention) without defining an +implementation. The factory functions must be called with the desired result +type and the argument types of the function. + + +.. function:: CFUNCTYPE(restype, *argtypes) + + The returned function prototype creates functions that use the standard C + calling convention. The function will release the GIL during the call. + + +.. function:: WINFUNCTYPE(restype, *argtypes) + + Windows only: The returned function prototype creates functions that use the + ``stdcall`` calling convention, except on Windows CE where :func:`WINFUNCTYPE` + is the same as :func:`CFUNCTYPE`. The function will release the GIL during the + call. + + +.. function:: PYFUNCTYPE(restype, *argtypes) + + The returned function prototype creates functions that use the Python calling + convention. The function will *not* release the GIL during the call. + +Function prototypes created by the factory functions can be instantiated in +different ways, depending on the type and number of the parameters in the call. + + +.. function:: prototype(address) + :noindex: + + Returns a foreign function at the specified address. + + +.. function:: prototype(callable) + :noindex: + + Create a C callable function (a callback function) from a Python ``callable``. + + +.. function:: prototype(func_spec[, paramflags]) + :noindex: + + Returns a foreign function exported by a shared library. ``func_spec`` must be a + 2-tuple ``(name_or_ordinal, library)``. The first item is the name of the + exported function as string, or the ordinal of the exported function as small + integer. 
The second item is the shared library instance. + + +.. function:: prototype(vtbl_index, name[, paramflags[, iid]]) + :noindex: + + Returns a foreign function that will call a COM method. ``vtbl_index`` is the + index into the virtual function table, a small nonnegative integer. *name* is + the name of the COM method. *iid* is an optional pointer to the interface identifier + which is used in extended error reporting. + + COM methods use a special calling convention: They require a pointer to the COM + interface as first argument, in addition to those parameters that are specified + in the :attr:`argtypes` tuple. + +The optional *paramflags* parameter creates foreign function wrappers with much +more functionality than the features described above. + +*paramflags* must be a tuple of the same length as :attr:`argtypes`. + +Each item in this tuple contains further information about a parameter; it must +be a tuple containing 1, 2, or 3 items. + +The first item is an integer containing flags for the parameter: + + +.. data:: 1 + :noindex: + + Specifies an input parameter to the function. + + +.. data:: 2 + :noindex: + + Output parameter. The foreign function fills in a value. + + +.. data:: 4 + :noindex: + + Input parameter which defaults to the integer zero. + +The optional second item is the parameter name as a string. If this is specified, +the foreign function can be called with named parameters. + +The optional third item is the default value for this parameter. + +This example demonstrates how to wrap the Windows ``MessageBoxA`` function so +that it supports default parameters and named arguments.
The C declaration from +the Windows header file is this:: + + WINUSERAPI int WINAPI + MessageBoxA( + HWND hWnd, + LPCSTR lpText, + LPCSTR lpCaption, + UINT uType); + +Here is the wrapping with ``ctypes``: + + :: + + >>> from ctypes import c_int, WINFUNCTYPE, windll + >>> from ctypes.wintypes import HWND, LPCSTR, UINT + >>> prototype = WINFUNCTYPE(c_int, HWND, LPCSTR, LPCSTR, UINT) + >>> paramflags = (1, "hwnd", 0), (1, "text", "Hi"), (1, "caption", None), (1, "flags", 0) + >>> MessageBox = prototype(("MessageBoxA", windll.user32), paramflags) + >>> + +The MessageBox foreign function can now be called in these ways:: + + >>> MessageBox() + >>> MessageBox(text="Spam, spam, spam") + >>> MessageBox(flags=2, text="foo bar") + >>> + +A second example demonstrates output parameters. The win32 ``GetWindowRect`` +function retrieves the dimensions of a specified window by copying them into a +``RECT`` structure that the caller has to supply. Here is the C declaration:: + + WINUSERAPI BOOL WINAPI + GetWindowRect( + HWND hWnd, + LPRECT lpRect); + +Here is the wrapping with ``ctypes``: + + :: + + >>> from ctypes import POINTER, WINFUNCTYPE, windll, WinError + >>> from ctypes.wintypes import BOOL, HWND, RECT + >>> prototype = WINFUNCTYPE(BOOL, HWND, POINTER(RECT)) + >>> paramflags = (1, "hwnd"), (2, "lprect") + >>> GetWindowRect = prototype(("GetWindowRect", windll.user32), paramflags) + >>> + +Functions with output parameters will automatically return the output parameter +value if there is a single one, or a tuple containing the output parameter +values when there are more than one, so the GetWindowRect function now returns a +RECT instance when called. + +Output parameters can be combined with the :attr:`errcheck` protocol to do +further output processing and error checking.
The win32 ``GetWindowRect`` api +function returns a ``BOOL`` to signal success or failure, so this function could +do the error checking, and raises an exception when the api call failed:: + + >>> def errcheck(result, func, args): + ... if not result: + ... raise WinError() + ... return args + >>> GetWindowRect.errcheck = errcheck + >>> + +If the :attr:`errcheck` function returns the argument tuple it receives +unchanged, ``ctypes`` continues the normal processing it does on the output +parameters. If you want to return a tuple of window coordinates instead of a +``RECT`` instance, you can retrieve the fields in the function and return them +instead, the normal processing will no longer take place:: + + >>> def errcheck(result, func, args): + ... if not result: + ... raise WinError() + ... rc = args[1] + ... return rc.left, rc.top, rc.bottom, rc.right + >>> + >>> GetWindowRect.errcheck = errcheck + >>> + + +.. _ctypes-utility-functions: + +Utility functions +^^^^^^^^^^^^^^^^^ + + +.. function:: addressof(obj) + + Returns the address of the memory buffer as integer. ``obj`` must be an + instance of a ctypes type. + + +.. function:: alignment(obj_or_type) + + Returns the alignment requirements of a ctypes type. ``obj_or_type`` must be a + ctypes type or instance. + + +.. function:: byref(obj) + + Returns a light-weight pointer to ``obj``, which must be an instance of a ctypes + type. The returned object can only be used as a foreign function call parameter. + It behaves similar to ``pointer(obj)``, but the construction is a lot faster. + + +.. function:: cast(obj, type) + + This function is similar to the cast operator in C. It returns a new instance of + ``type`` which points to the same memory block as ``obj``. ``type`` must be a + pointer type, and ``obj`` must be an object that can be interpreted as a + pointer. + + +.. function:: create_string_buffer(init_or_size[, size]) + + This function creates a mutable character buffer. 
The returned object is a + ctypes array of :class:`c_char`. + + ``init_or_size`` must be an integer which specifies the size of the array, or a + string which will be used to initialize the array items. + + If a string is specified as first argument, the buffer is made one item larger + than the length of the string so that the last element in the array is a NUL + termination character. An integer can be passed as second argument which allows + to specify the size of the array if the length of the string should not be used. + + If the first parameter is a unicode string, it is converted into an 8-bit string + according to ctypes conversion rules. + + +.. function:: create_unicode_buffer(init_or_size[, size]) + + This function creates a mutable unicode character buffer. The returned object is + a ctypes array of :class:`c_wchar`. + + ``init_or_size`` must be an integer which specifies the size of the array, or a + unicode string which will be used to initialize the array items. + + If a unicode string is specified as first argument, the buffer is made one item + larger than the length of the string so that the last element in the array is a + NUL termination character. An integer can be passed as second argument which + allows to specify the size of the array if the length of the string should not + be used. + + If the first parameter is a 8-bit string, it is converted into an unicode string + according to ctypes conversion rules. + + +.. function:: DllCanUnloadNow() + + Windows only: This function is a hook which allows to implement inprocess COM + servers with ctypes. It is called from the DllCanUnloadNow function that the + _ctypes extension dll exports. + + +.. function:: DllGetClassObject() + + Windows only: This function is a hook which allows to implement inprocess COM + servers with ctypes. It is called from the DllGetClassObject function that the + ``_ctypes`` extension dll exports. + + +.. 
function:: FormatError([code]) + + Windows only: Returns a textual description of the error code. If no error code + is specified, the last error code is used by calling the Windows api function + ``GetLastError``. + + +.. function:: GetLastError() + + Windows only: Returns the last error code set by Windows in the calling thread. + + +.. function:: memmove(dst, src, count) + + Same as the standard C memmove library function: copies *count* bytes from + *src* to *dst*. *dst* and *src* must be integers or ctypes instances that + can be converted to pointers. + + +.. function:: memset(dst, c, count) + + Same as the standard C memset library function: fills the memory block at + address *dst* with *count* bytes of value *c*. *dst* must be an integer + specifying an address, or a ctypes instance. + + +.. function:: POINTER(type) + + This factory function creates and returns a new ctypes pointer type. Pointer + types are cached and reused internally, so calling this function repeatedly is + cheap. *type* must be a ctypes type. + + +.. function:: pointer(obj) + + This function creates a new pointer instance, pointing to ``obj``. The returned + object is of the type ``POINTER(type(obj))``. + + Note: If you just want to pass a pointer to an object to a foreign function + call, you should use ``byref(obj)`` which is much faster. + + +.. function:: resize(obj, size) + + This function resizes the internal memory buffer of *obj*, which must be an + instance of a ctypes type. It is not possible to make the buffer smaller than + the native size of the object's type, as given by ``sizeof(type(obj))``, but it is + possible to enlarge the buffer. + + +.. function:: set_conversion_mode(encoding, errors) + + This function sets the rules that ctypes objects use when converting between + 8-bit strings and unicode strings. *encoding* must be a string specifying an + encoding, like ``'utf-8'`` or ``'mbcs'``; *errors* must be a string specifying the + error handling on encoding/decoding errors.
Examples of possible values are + ``"strict"``, ``"replace"``, or ``"ignore"``. + + ``set_conversion_mode`` returns a 2-tuple containing the previous conversion + rules. On Windows, the initial conversion rules are ``('mbcs', 'ignore')``, on + other systems ``('ascii', 'strict')``. + + +.. function:: sizeof(obj_or_type) + + Returns the size in bytes of a ctypes type or instance memory buffer. Does the + same as the C ``sizeof()`` function. + + +.. function:: string_at(address[, size]) + + This function returns the string starting at memory address *address*. If *size* + is specified, it is used as the length of the string, otherwise the string is assumed to be + zero-terminated. + + +.. function:: WinError(code=None, descr=None) + + Windows only: this function is probably the worst-named thing in ctypes. It + creates an instance of ``WindowsError``. If *code* is not specified, + ``GetLastError`` is called to determine the error code. If *descr* is not + specified, :func:`FormatError` is called to get a textual description of the + error. + + +.. function:: wstring_at(address[, size]) + + This function returns the wide character string starting at memory address + *address* as a unicode string. If *size* is specified, it is used as the + number of characters of the string, otherwise the string is assumed to be + zero-terminated. + + +.. _ctypes-data-types: + +Data types +^^^^^^^^^^ + + +.. class:: _CData + + This non-public class is the common base class of all ctypes data types. Among + other things, all ctypes type instances contain a memory block that holds C + compatible data; the address of the memory block is returned by the + ``addressof()`` helper function. Another instance variable is exposed as + :attr:`_objects`; this contains other Python objects that need to be kept alive + in case the memory block contains pointers. + +Common methods of ctypes data types. These are all class methods (to be exact, +they are methods of the metaclass): + + +.. 
method:: _CData.from_address(address) + + This method returns a ctypes type instance using the memory specified by address + which must be an integer. + + +.. method:: _CData.from_param(obj) + + This method adapts obj to a ctypes type. It is called with the actual object + used in a foreign function call, when the type is present in the foreign + functions :attr:`argtypes` tuple; it must return an object that can be used as + function call parameter. + + All ctypes data types have a default implementation of this classmethod, + normally it returns ``obj`` if that is an instance of the type. Some types + accept other objects as well. + + +.. method:: _CData.in_dll(library, name) + + This method returns a ctypes type instance exported by a shared library. *name* + is the name of the symbol that exports the data, *library* is the loaded shared + library. + +Common instance variables of ctypes data types: + + +.. attribute:: _CData._b_base_ + + Sometimes ctypes data instances do not own the memory block they contain, + instead they share part of the memory block of a base object. The + :attr:`_b_base_` readonly member is the root ctypes object that owns the memory + block. + + +.. attribute:: _CData._b_needsfree_ + + This readonly variable is true when the ctypes data instance has allocated the + memory block itself, false otherwise. + + +.. attribute:: _CData._objects + + This member is either ``None`` or a dictionary containing Python objects that + need to be kept alive so that the memory block contents is kept valid. This + object is only exposed for debugging; never modify the contents of this + dictionary. + + +.. _ctypes-fundamental-data-types-2: + +Fundamental data types +^^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: _SimpleCData + + This non-public class is the base class of all fundamental ctypes data types. It + is mentioned here because it contains the common attributes of the fundamental + ctypes data types. 
``_SimpleCData`` is a subclass of ``_CData``, so it inherits + their methods and attributes. + +Instances have a single attribute: + + +.. attribute:: _SimpleCData.value + + This attribute contains the actual value of the instance. For integer and + pointer types, it is an integer, for character types, it is a single character + string, for character pointer types it is a Python string or unicode string. + + When the ``value`` attribute is retrieved from a ctypes instance, usually a new + object is returned each time. ``ctypes`` does *not* implement original object + return, always a new object is constructed. The same is true for all other + ctypes object instances. + +Fundamental data types, when returned as foreign function call results, or, for +example, by retrieving structure field members or array items, are transparently +converted to native Python types. In other words, if a foreign function has a +:attr:`restype` of :class:`c_char_p`, you will always receive a Python string, +*not* a :class:`c_char_p` instance. + +Subclasses of fundamental data types do *not* inherit this behaviour. So, if a +foreign functions :attr:`restype` is a subclass of :class:`c_void_p`, you will +receive an instance of this subclass from the function call. Of course, you can +get the value of the pointer by accessing the ``value`` attribute. + +These are the fundamental ctypes data types: + + +.. class:: c_byte + + Represents the C signed char datatype, and interprets the value as small + integer. The constructor accepts an optional integer initializer; no overflow + checking is done. + + +.. class:: c_char + + Represents the C char datatype, and interprets the value as a single character. + The constructor accepts an optional string initializer, the length of the string + must be exactly one character. + + +.. class:: c_char_p + + Represents the C char \* datatype, which must be a pointer to a zero-terminated + string. The constructor accepts an integer address, or a string. 
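The behaviour of the ``value`` attribute described above can be sketched as follows. This is a minimal illustration, not part of the original text; it assumes a current Python 3 interpreter, where :class:`c_char_p` is initialized from a ``bytes`` object rather than a text string.

```python
from ctypes import c_char_p, c_int, c_void_p

# Integer types expose their contents through the value attribute,
# which can be read and assigned.
i = c_int(42)
i.value = -7

# Assumption: on current Python 3, c_char_p takes bytes (or an integer
# address), and value reads the zero-terminated data back as bytes.
s = c_char_p(b"hello")
first = s.value
second = s.value

# Each access builds a Python object from the underlying memory, so
# compare by equality and do not rely on object identity.
same_contents = first == second

# A pointer type created without an initializer is NULL; its value
# reads back as None.
null = c_void_p()
```

Because ``value`` is reconstructed on access, code that needs to keep the original Python object should hold its own reference to it.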
+ + +.. class:: c_double + + Represents the C ``double`` datatype. The constructor accepts an optional float + initializer. + + +.. class:: c_float + + Represents the C ``float`` datatype. The constructor accepts an optional float + initializer. + + +.. class:: c_int + + Represents the C ``signed int`` datatype. The constructor accepts an optional + integer initializer; no overflow checking is done. On platforms where + ``sizeof(int) == sizeof(long)`` it is an alias for :class:`c_long`. + + +.. class:: c_int8 + + Represents the C 8-bit ``signed int`` datatype. Usually an alias for + :class:`c_byte`. + + +.. class:: c_int16 + + Represents the C 16-bit ``signed int`` datatype. Usually an alias for + :class:`c_short`. + + +.. class:: c_int32 + + Represents the C 32-bit ``signed int`` datatype. Usually an alias for + :class:`c_int`. + + +.. class:: c_int64 + + Represents the C 64-bit ``signed int`` datatype. Usually an alias for + :class:`c_longlong`. + + +.. class:: c_long + + Represents the C ``signed long`` datatype. The constructor accepts an optional + integer initializer; no overflow checking is done. + + +.. class:: c_longlong + + Represents the C ``signed long long`` datatype. The constructor accepts an + optional integer initializer; no overflow checking is done. + + +.. class:: c_short + + Represents the C ``signed short`` datatype. The constructor accepts an optional + integer initializer; no overflow checking is done. + + +.. class:: c_size_t + + Represents the C ``size_t`` datatype. + + +.. class:: c_ubyte + + Represents the C ``unsigned char`` datatype, and interprets the value as a small + integer. The constructor accepts an optional integer initializer; no overflow + checking is done. + + +.. class:: c_uint + + Represents the C ``unsigned int`` datatype. The constructor accepts an optional + integer initializer; no overflow checking is done. On platforms where + ``sizeof(int) == sizeof(long)`` it is an alias for :class:`c_ulong`. + + +.. 
class:: c_uint8 + + Represents the C 8-bit unsigned int datatype. Usually an alias for + :class:`c_ubyte`. + + +.. class:: c_uint16 + + Represents the C 16-bit unsigned int datatype. Usually an alias for + :class:`c_ushort`. + + +.. class:: c_uint32 + + Represents the C 32-bit unsigned int datatype. Usually an alias for + :class:`c_uint`. + + +.. class:: c_uint64 + + Represents the C 64-bit unsigned int datatype. Usually an alias for + :class:`c_ulonglong`. + + +.. class:: c_ulong + + Represents the C ``unsigned long`` datatype. The constructor accepts an optional + integer initializer; no overflow checking is done. + + +.. class:: c_ulonglong + + Represents the C ``unsigned long long`` datatype. The constructor accepts an + optional integer initializer; no overflow checking is done. + + +.. class:: c_ushort + + Represents the C ``unsigned short`` datatype. The constructor accepts an + optional integer initializer; no overflow checking is done. + + +.. class:: c_void_p + + Represents the C ``void *`` type. The value is represented as integer. The + constructor accepts an optional integer initializer. + + +.. class:: c_wchar + + Represents the C ``wchar_t`` datatype, and interprets the value as a single + character unicode string. The constructor accepts an optional string + initializer, the length of the string must be exactly one character. + + +.. class:: c_wchar_p + + Represents the C ``wchar_t *`` datatype, which must be a pointer to a + zero-terminated wide character string. The constructor accepts an integer + address, or a string. + + +.. class:: c_bool + + Represent the C ``bool`` datatype (more accurately, _Bool from C99). Its value + can be True or False, and the constructor accepts any object that has a truth + value. + + .. versionadded:: 2.6 + + +.. class:: HRESULT + + Windows only: Represents a :class:`HRESULT` value, which contains success or + error information for a function or method call. + + +.. 
class:: py_object + + Represents the C ``PyObject *`` datatype. Calling this without an argument + creates a ``NULL`` ``PyObject *`` pointer. + +The ``ctypes.wintypes`` module provides quite some other Windows specific data +types, for example ``HWND``, ``WPARAM``, or ``DWORD``. Some useful structures +like ``MSG`` or ``RECT`` are also defined. + + +.. _ctypes-structured-data-types: + +Structured data types +^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: Union(*args, **kw) + + Abstract base class for unions in native byte order. + + +.. class:: BigEndianStructure(*args, **kw) + + Abstract base class for structures in *big endian* byte order. + + +.. class:: LittleEndianStructure(*args, **kw) + + Abstract base class for structures in *little endian* byte order. + +Structures with non-native byte order cannot contain pointer type fields, or any +other data types containing pointer type fields. + + +.. class:: Structure(*args, **kw) + + Abstract base class for structures in *native* byte order. + +Concrete structure and union types must be created by subclassing one of these +types, and at least define a :attr:`_fields_` class variable. ``ctypes`` will +create descriptors which allow reading and writing the fields by direct +attribute accesses. These are the + + +.. attribute:: Structure._fields_ + + A sequence defining the structure fields. The items must be 2-tuples or + 3-tuples. The first item is the name of the field, the second item specifies + the type of the field; it can be any ctypes data type. + + For integer type fields like :class:`c_int`, a third optional item can be given. + It must be a small positive integer defining the bit width of the field. + + Field names must be unique within one structure or union. This is not checked, + only one field can be accessed when names are repeated. 
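The 2-tuple and 3-tuple forms of :attr:`_fields_` entries can be sketched like this. The ``Flags`` structure and its field names are hypothetical and exist only for illustration; the bit widths are arbitrary.

```python
from ctypes import Structure, c_int, c_uint

class Flags(Structure):
    # Hypothetical structure, not taken from any real API.
    _fields_ = [
        ("ready", c_uint, 1),  # 3-tuple: a bit field 1 bit wide
        ("mode", c_uint, 3),   # 3-tuple: a bit field 3 bits wide
        ("count", c_int),      # 2-tuple: an ordinary integer field
    ]

# Positional arguments fill fields in declaration order; keyword
# arguments address fields by name.
f = Flags(1, 5, count=10)

# Fields are read and written through ordinary attribute access.
f.mode = 7
```

A 3-bit unsigned field holds values from 0 to 7, so the assignment above stores the largest value that fits.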
+ + It is possible to define the :attr:`_fields_` class variable *after* the class + statement that defines the Structure subclass; this makes it possible to create data types + that directly or indirectly reference themselves:: + + class List(Structure): + pass + List._fields_ = [("pnext", POINTER(List)), + ... + ] + + The :attr:`_fields_` class variable must, however, be defined before the type is + first used (an instance is created, ``sizeof()`` is called on it, and so on). + Later assignments to the :attr:`_fields_` class variable will raise an + AttributeError. + + Structure and union subclass constructors accept both positional and named + arguments. Positional arguments are used to initialize the fields in the same + order as they appear in the :attr:`_fields_` definition, named arguments are + used to initialize the fields with the corresponding name. + + It is possible to define sub-subclasses of structure types; they inherit the + fields of the base class plus the :attr:`_fields_` defined in the sub-subclass, + if any. + + +.. attribute:: Structure._pack_ + + An optional small integer that allows overriding the alignment of structure + fields in the instance. :attr:`_pack_` must already be defined when + :attr:`_fields_` is assigned, otherwise it will have no effect. + + +.. attribute:: Structure._anonymous_ + + An optional sequence that lists the names of unnamed (anonymous) fields. + ``_anonymous_`` must be already defined when :attr:`_fields_` is assigned, + otherwise it will have no effect. + + The fields listed in this variable must be structure or union type fields. + ``ctypes`` will create descriptors in the structure type that allow accessing + the nested fields directly, without the need to create the structure or union + field.
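A portable sketch of the same mechanism, runnable on any platform, is shown here. The ``_Value`` and ``Tagged`` types and their field names are hypothetical and chosen only to illustrate how an anonymous union field promotes its members.

```python
from ctypes import Structure, Union, c_float, c_int

class _Value(Union):
    # Hypothetical union: one block of memory viewed as int or float.
    _fields_ = [("as_int", c_int), ("as_float", c_float)]

class Tagged(Structure):
    # "v" is listed as anonymous, so the members of _Value can also be
    # reached directly on Tagged instances.
    _anonymous_ = ("v",)
    _fields_ = [("tag", c_int), ("v", _Value)]

t = Tagged()
t.tag = 1
t.as_int = 123          # direct access through the anonymous field
via_union = t.v.as_int  # the same memory, reached through the union field
```

Both spellings refer to the same storage, so a value written through one is visible through the other.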
+ + Here is an example type (Windows):: + + class _U(Union): + _fields_ = [("lptdesc", POINTER(TYPEDESC)), + ("lpadesc", POINTER(ARRAYDESC)), + ("hreftype", HREFTYPE)] + + class TYPEDESC(Structure): + _fields_ = [("u", _U), + ("vt", VARTYPE)] + + _anonymous_ = ("u",) + + The ``TYPEDESC`` structure describes a COM data type; the ``vt`` field specifies + which one of the union fields is valid. Since the ``u`` field is defined as an + anonymous field, it is now possible to access the members directly off the + ``TYPEDESC`` instance. ``td.lptdesc`` and ``td.u.lptdesc`` are equivalent, but the + former is faster since it does not need to create a temporary union instance:: + + td = TYPEDESC() + td.vt = VT_PTR + td.lptdesc = POINTER(some_type) + td.u.lptdesc = POINTER(some_type) + +It is possible to define sub-subclasses of structures; they inherit the fields +of the base class. If the subclass definition has a separate :attr:`_fields_` +variable, the fields specified in this are appended to the fields of the base +class. + +Structure and union constructors accept both positional and keyword arguments. +Positional arguments are used to initialize member fields in the same order as +they appear in :attr:`_fields_`. Keyword arguments in the constructor are +interpreted as attribute assignments, so they will initialize :attr:`_fields_` +with the same name, or create new attributes for names not present in +:attr:`_fields_`. + + +.. _ctypes-arrays-pointers: + +Arrays and pointers +^^^^^^^^^^^^^^^^^^^ + +Not yet written - please see the sections :ref:`ctypes-pointers` and +:ref:`ctypes-arrays` in the tutorial. + diff --git a/Doc/library/curses.ascii.rst b/Doc/library/curses.ascii.rst new file mode 100644 index 0000000..0a45c2a --- /dev/null +++ b/Doc/library/curses.ascii.rst @@ -0,0 +1,228 @@ + +:mod:`curses.ascii` --- Utilities for ASCII characters +====================================================== + +.. 
module:: curses.ascii + :synopsis: Constants and set-membership functions for ASCII characters. +.. moduleauthor:: Eric S. Raymond <esr@thyrsus.com> +.. sectionauthor:: Eric S. Raymond <esr@thyrsus.com> + + +.. versionadded:: 1.6 + +The :mod:`curses.ascii` module supplies name constants for ASCII characters and +functions to test membership in various ASCII character classes. The constants +supplied are names for control characters as follows: + ++--------------+----------------------------------------------+ +| Name | Meaning | ++==============+==============================================+ +| :const:`NUL` | | ++--------------+----------------------------------------------+ +| :const:`SOH` | Start of heading, console interrupt | ++--------------+----------------------------------------------+ +| :const:`STX` | Start of text | ++--------------+----------------------------------------------+ +| :const:`ETX` | End of text | ++--------------+----------------------------------------------+ +| :const:`EOT` | End of transmission | ++--------------+----------------------------------------------+ +| :const:`ENQ` | Enquiry, goes with :const:`ACK` flow control | ++--------------+----------------------------------------------+ +| :const:`ACK` | Acknowledgement | ++--------------+----------------------------------------------+ +| :const:`BEL` | Bell | ++--------------+----------------------------------------------+ +| :const:`BS` | Backspace | ++--------------+----------------------------------------------+ +| :const:`TAB` | Tab | ++--------------+----------------------------------------------+ +| :const:`HT` | Alias for :const:`TAB`: "Horizontal tab" | ++--------------+----------------------------------------------+ +| :const:`LF` | Line feed | ++--------------+----------------------------------------------+ +| :const:`NL` | Alias for :const:`LF`: "New line" | ++--------------+----------------------------------------------+ +| :const:`VT` | Vertical tab | 
++--------------+----------------------------------------------+ +| :const:`FF` | Form feed | ++--------------+----------------------------------------------+ +| :const:`CR` | Carriage return | ++--------------+----------------------------------------------+ +| :const:`SO` | Shift-out, begin alternate character set | ++--------------+----------------------------------------------+ +| :const:`SI` | Shift-in, resume default character set | ++--------------+----------------------------------------------+ +| :const:`DLE` | Data-link escape | ++--------------+----------------------------------------------+ +| :const:`DC1` | XON, for flow control | ++--------------+----------------------------------------------+ +| :const:`DC2` | Device control 2, block-mode flow control | ++--------------+----------------------------------------------+ +| :const:`DC3` | XOFF, for flow control | ++--------------+----------------------------------------------+ +| :const:`DC4` | Device control 4 | ++--------------+----------------------------------------------+ +| :const:`NAK` | Negative acknowledgement | ++--------------+----------------------------------------------+ +| :const:`SYN` | Synchronous idle | ++--------------+----------------------------------------------+ +| :const:`ETB` | End transmission block | ++--------------+----------------------------------------------+ +| :const:`CAN` | Cancel | ++--------------+----------------------------------------------+ +| :const:`EM` | End of medium | ++--------------+----------------------------------------------+ +| :const:`SUB` | Substitute | ++--------------+----------------------------------------------+ +| :const:`ESC` | Escape | ++--------------+----------------------------------------------+ +| :const:`FS` | File separator | ++--------------+----------------------------------------------+ +| :const:`GS` | Group separator | ++--------------+----------------------------------------------+ +| :const:`RS` | Record separator, block-mode 
terminator | ++--------------+----------------------------------------------+ +| :const:`US` | Unit separator | ++--------------+----------------------------------------------+ +| :const:`SP` | Space | ++--------------+----------------------------------------------+ +| :const:`DEL` | Delete | ++--------------+----------------------------------------------+ + +Note that many of these have little practical significance in modern usage. The +mnemonics derive from teleprinter conventions that predate digital computers. + +The module supplies the following functions, patterned on those in the standard +C library: + + +.. function:: isalnum(c) + + Checks for an ASCII alphanumeric character; it is equivalent to ``isalpha(c) or + isdigit(c)``. + + +.. function:: isalpha(c) + + Checks for an ASCII alphabetic character; it is equivalent to ``isupper(c) or + islower(c)``. + + +.. function:: isascii(c) + + Checks for a character value that fits in the 7-bit ASCII set. + + +.. function:: isblank(c) + + Checks for an ASCII whitespace character. + + +.. function:: iscntrl(c) + + Checks for an ASCII control character (in the range 0x00 to 0x1f). + + +.. function:: isdigit(c) + + Checks for an ASCII decimal digit, ``'0'`` through ``'9'``. This is equivalent + to ``c in string.digits``. + + +.. function:: isgraph(c) + + Checks for ASCII any printable character except space. + + +.. function:: islower(c) + + Checks for an ASCII lower-case character. + + +.. function:: isprint(c) + + Checks for any ASCII printable character including space. + + +.. function:: ispunct(c) + + Checks for any printable ASCII character which is not a space or an alphanumeric + character. + + +.. function:: isspace(c) + + Checks for ASCII white-space characters; space, line feed, carriage return, form + feed, horizontal tab, vertical tab. + + +.. function:: isupper(c) + + Checks for an ASCII uppercase letter. + + +.. function:: isxdigit(c) + + Checks for an ASCII hexadecimal digit. 
This is equivalent to ``c in + string.hexdigits``. + + +.. function:: isctrl(c) + + Checks for an ASCII control character (ordinal values 0 to 31). + + +.. function:: ismeta(c) + + Checks for a non-ASCII character (ordinal values 0x80 and above). + +These functions accept either integers or strings; when the argument is a +string, it is first converted using the built-in function :func:`ord`. + +Note that all these functions check ordinal bit values derived from the first +character of the string you pass in; they do not actually know anything about +the host machine's character encoding. For functions that know about the +character encoding (and handle internationalization properly) see the +:mod:`string` module. + +The following two functions take either a single-character string or integer +byte value; they return a value of the same type. + + +.. function:: ascii(c) + + Return the ASCII value corresponding to the low 7 bits of *c*. + + +.. function:: ctrl(c) + + Return the control character corresponding to the given character (the character + bit value is bitwise-anded with 0x1f). + + +.. function:: alt(c) + + Return the 8-bit character corresponding to the given ASCII character (the + character bit value is bitwise-ored with 0x80). + +The following function takes either a single-character string or integer value; +it returns a string. + + +.. function:: unctrl(c) + + Return a string representation of the ASCII character *c*. If *c* is printable, + this string is the character itself. If the character is a control character + (0x00-0x1f) the string consists of a caret (``'^'``) followed by the + corresponding uppercase letter. If the character is an ASCII delete (0x7f) the + string is ``'^?'``. If the character has its meta bit (0x80) set, the meta bit + is stripped, the preceding rules applied, and ``'!'`` prepended to the result. + + +.. 
data:: controlnames + + A 33-element string array that contains the ASCII mnemonics for the thirty-two + ASCII control characters from 0 (NUL) to 0x1f (US), in order, plus the mnemonic + ``SP`` for the space character. + diff --git a/Doc/library/curses.panel.rst b/Doc/library/curses.panel.rst new file mode 100644 index 0000000..59e5b86 --- /dev/null +++ b/Doc/library/curses.panel.rst @@ -0,0 +1,119 @@ + +:mod:`curses.panel` --- A panel stack extension for curses. +=========================================================== + +.. module:: curses.panel + :synopsis: A panel stack extension that adds depth to curses windows. +.. sectionauthor:: A.M. Kuchling <amk@amk.ca> + + +Panels are windows with the added feature of depth, so they can be stacked on +top of each other, and only the visible portions of each window will be +displayed. Panels can be added, moved up or down in the stack, and removed. + + +.. _cursespanel-functions: + +Functions +--------- + +The module :mod:`curses.panel` defines the following functions: + + +.. function:: bottom_panel() + + Returns the bottom panel in the panel stack. + + +.. function:: new_panel(win) + + Returns a panel object, associating it with the given window *win*. Be aware + that you need to keep the returned panel object referenced explicitly. If you + don't, the panel object is garbage collected and removed from the panel stack. + + +.. function:: top_panel() + + Returns the top panel in the panel stack. + + +.. function:: update_panels() + + Updates the virtual screen after changes in the panel stack. This does not call + :func:`curses.doupdate`, so you'll have to do this yourself. + + +.. _curses-panel-objects: + +Panel Objects +------------- + +Panel objects, as returned by :func:`new_panel` above, are windows with a +stacking order. There's always a window associated with a panel which determines +the content, while the panel methods are responsible for the window's depth in +the panel stack. 
+ +Panel objects have the following methods: + + +.. method:: Panel.above() + + Returns the panel above the current panel. + + +.. method:: Panel.below() + + Returns the panel below the current panel. + + +.. method:: Panel.bottom() + + Push the panel to the bottom of the stack. + + +.. method:: Panel.hidden() + + Returns true if the panel is hidden (not visible), false otherwise. + + +.. method:: Panel.hide() + + Hide the panel. This does not delete the object, it just makes the window on + screen invisible. + + +.. method:: Panel.move(y, x) + + Move the panel to the screen coordinates ``(y, x)``. + + +.. method:: Panel.replace(win) + + Change the window associated with the panel to the window *win*. + + +.. method:: Panel.set_userptr(obj) + + Set the panel's user pointer to *obj*. This is used to associate an arbitrary + piece of data with the panel, and can be any Python object. + + +.. method:: Panel.show() + + Display the panel (which might have been hidden). + + +.. method:: Panel.top() + + Push panel to the top of the stack. + + +.. method:: Panel.userptr() + + Returns the user pointer for the panel. This might be any Python object. + + +.. method:: Panel.window() + + Returns the window object associated with the panel. + diff --git a/Doc/library/curses.rst b/Doc/library/curses.rst new file mode 100644 index 0000000..91af757 --- /dev/null +++ b/Doc/library/curses.rst @@ -0,0 +1,1679 @@ + +:mod:`curses` --- Terminal handling for character-cell displays +=============================================================== + +.. module:: curses + :synopsis: An interface to the curses library, providing portable terminal handling. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> +.. sectionauthor:: Eric Raymond <esr@thyrsus.com> + + +.. versionchanged:: 1.6 + Added support for the ``ncurses`` library and converted to a package. 
+ +The :mod:`curses` module provides an interface to the curses library, the +de-facto standard for portable advanced terminal handling. + +While curses is most widely used in the Unix environment, versions are available +for DOS, OS/2, and possibly other systems as well. This extension module is +designed to match the API of ncurses, an open-source curses library hosted on +Linux and the BSD variants of Unix. + + +.. seealso:: + + Module :mod:`curses.ascii` + Utilities for working with ASCII characters, regardless of your locale settings. + + Module :mod:`curses.panel` + A panel stack extension that adds depth to curses windows. + + Module :mod:`curses.textpad` + Editable text widget for curses supporting :program:`Emacs`\ -like bindings. + + Module :mod:`curses.wrapper` + Convenience function to ensure proper terminal setup and resetting on + application entry and exit. + + `Curses Programming with Python <http://www.python.org/doc/howto/curses/curses.html>`_ + Tutorial material on using curses with Python, by Andrew Kuchling and Eric + Raymond, is available on the Python Web site. + + The :file:`Demo/curses/` directory in the Python source distribution contains + some example programs using the curses bindings provided by this module. + + +.. _curses-functions: + +Functions +--------- + +The module :mod:`curses` defines the following exception: + + +.. exception:: error + + Exception raised when a curses library function returns an error. + +.. note:: + + Whenever *x* or *y* arguments to a function or a method are optional, they + default to the current cursor location. Whenever *attr* is optional, it defaults + to :const:`A_NORMAL`. + +The module :mod:`curses` defines the following functions: + + +.. function:: baudrate() + + Returns the output speed of the terminal in bits per second. On software + terminal emulators it will have a fixed high value. 
Included for historical + reasons; in former times, it was used to write output loops for time delays and + occasionally to change interfaces depending on the line speed. + + +.. function:: beep() + + Emit a short attention sound. + + +.. function:: can_change_color() + + Returns true or false, depending on whether the programmer can change the colors + displayed by the terminal. + + +.. function:: cbreak() + + Enter cbreak mode. In cbreak mode (sometimes called "rare" mode) normal tty + line buffering is turned off and characters are available to be read one by one. + However, unlike raw mode, special characters (interrupt, quit, suspend, and flow + control) retain their effects on the tty driver and calling program. Calling + first :func:`raw` then :func:`cbreak` leaves the terminal in cbreak mode. + + +.. function:: color_content(color_number) + + Returns the intensity of the red, green, and blue (RGB) components in the color + *color_number*, which must be between ``0`` and :const:`COLORS`. A 3-tuple is + returned, containing the R,G,B values for the given color, which will be between + ``0`` (no component) and ``1000`` (maximum amount of component). + + +.. function:: color_pair(color_number) + + Returns the attribute value for displaying text in the specified color. This + attribute value can be combined with :const:`A_STANDOUT`, :const:`A_REVERSE`, + and the other :const:`A_\*` attributes. :func:`pair_number` is the counterpart + to this function. + + +.. function:: curs_set(visibility) + + Sets the cursor state. *visibility* can be set to 0, 1, or 2, for invisible, + normal, or very visible. If the terminal supports the visibility requested, the + previous cursor state is returned; otherwise, an exception is raised. On many + terminals, the "visible" mode is an underline cursor and the "very visible" mode + is a block cursor. + + +.. 
function:: def_prog_mode() + + Saves the current terminal mode as the "program" mode, the mode when the running + program is using curses. (Its counterpart is the "shell" mode, for when the + program is not in curses.) Subsequent calls to :func:`reset_prog_mode` will + restore this mode. + + +.. function:: def_shell_mode() + + Saves the current terminal mode as the "shell" mode, the mode when the running + program is not using curses. (Its counterpart is the "program" mode, when the + program is using curses capabilities.) Subsequent calls to + :func:`reset_shell_mode` will restore this mode. + + +.. function:: delay_output(ms) + + Inserts an *ms* millisecond pause in output. + + +.. function:: doupdate() + + Update the physical screen. The curses library keeps two data structures, one + representing the current physical screen contents and a virtual screen + representing the desired next state. The :func:`doupdate` function updates the + physical screen to match the virtual screen. + + The virtual screen may be updated by a :meth:`noutrefresh` call after write + operations such as :meth:`addstr` have been performed on a window. The normal + :meth:`refresh` call is simply :meth:`noutrefresh` followed by :func:`doupdate`; + if you have to update multiple windows, you can improve performance and perhaps + reduce screen flicker by issuing :meth:`noutrefresh` calls on all windows, + followed by a single :func:`doupdate`. + + +.. function:: echo() + + Enter echo mode. In echo mode, each character input is echoed to the screen as + it is entered. + + +.. function:: endwin() + + De-initialize the library, and return terminal to normal status. + + +.. function:: erasechar() + + Returns the user's current erase character. Under Unix operating systems this + is a property of the controlling tty of the curses program, and is not set by + the curses library itself. + + +.. 
function:: filter() + + The :func:`filter` routine, if used, must be called before :func:`initscr` is + called. The effect is that, during those calls, LINES is set to 1; the + capabilities clear, cup, cud, cud1, cuu1, cuu, vpa are disabled; and the home + string is set to the value of cr. As a result, the cursor is confined to + the current line, and so are screen updates. This may be used for enabling + character-at-a-time line editing without touching the rest of the screen. + + +.. function:: flash() + + Flash the screen. That is, change it to reverse-video and then change it back + in a short interval. Some people prefer such a 'visible bell' to the audible + attention signal produced by :func:`beep`. + + +.. function:: flushinp() + + Flush all input buffers. This throws away any typeahead that has been typed + by the user and has not yet been processed by the program. + + +.. function:: getmouse() + + After :meth:`getch` returns :const:`KEY_MOUSE` to signal a mouse event, this + function should be called to retrieve the queued mouse event, represented as a + 5-tuple ``(id, x, y, z, bstate)``. *id* is an ID value used to distinguish + multiple devices, and *x*, *y*, *z* are the event's coordinates. (*z* is + currently unused.) *bstate* is an integer value whose bits will be set to + indicate the type of event, and will be the bitwise OR of one or more of the + following constants, where *n* is the button number from 1 to 4: + :const:`BUTTONn_PRESSED`, :const:`BUTTONn_RELEASED`, :const:`BUTTONn_CLICKED`, + :const:`BUTTONn_DOUBLE_CLICKED`, :const:`BUTTONn_TRIPLE_CLICKED`, + :const:`BUTTON_SHIFT`, :const:`BUTTON_CTRL`, :const:`BUTTON_ALT`. + + +.. function:: getsyx() + + Returns the current coordinates of the virtual screen cursor in y and x. If + leaveok is currently true, then -1,-1 is returned. + + +.. function:: getwin(file) + + Reads window-related data stored in the file by an earlier :func:`putwin` call.
+ The routine then creates and initializes a new window using that data, returning + the new window object. + + +.. function:: has_colors() + + Returns true if the terminal can display colors; otherwise, it returns false. + + +.. function:: has_ic() + + Returns true if the terminal has insert- and delete- character capabilities. + This function is included for historical reasons only, as all modern software + terminal emulators have such capabilities. + + +.. function:: has_il() + + Returns true if the terminal has insert- and delete-line capabilities, or can + simulate them using scrolling regions. This function is included for + historical reasons only, as all modern software terminal emulators have such + capabilities. + + +.. function:: has_key(ch) + + Takes a key value *ch*, and returns true if the current terminal type recognizes + a key with that value. + + +.. function:: halfdelay(tenths) + + Used for half-delay mode, which is similar to cbreak mode in that characters + typed by the user are immediately available to the program. However, after + blocking for *tenths* tenths of seconds, an exception is raised if nothing has + been typed. The value of *tenths* must be a number between 1 and 255. Use + :func:`nocbreak` to leave half-delay mode. + + +.. function:: init_color(color_number, r, g, b) + + Changes the definition of a color, taking the number of the color to be changed + followed by three RGB values (for the amounts of red, green, and blue + components). The value of *color_number* must be between ``0`` and + :const:`COLORS`. Each of *r*, *g*, *b*, must be a value between ``0`` and + ``1000``. When :func:`init_color` is used, all occurrences of that color on the + screen immediately change to the new definition. This function is a no-op on + most terminals; it is active only if :func:`can_change_color` returns ``1``. + + +.. function:: init_pair(pair_number, fg, bg) + + Changes the definition of a color-pair. 
It takes three arguments: the number of + the color-pair to be changed, the foreground color number, and the background + color number. The value of *pair_number* must be between ``1`` and + ``COLOR_PAIRS - 1`` (the ``0`` color pair is wired to white on black and cannot + be changed). The values of the *fg* and *bg* arguments must be between ``0`` and + :const:`COLORS`. If the color-pair was previously initialized, the screen is + refreshed and all occurrences of that color-pair are changed to the new + definition. + + +.. function:: initscr() + + Initialize the library. Returns a :class:`WindowObject` which represents the + whole screen. + + .. note:: + + If there is an error opening the terminal, the underlying curses library may + cause the interpreter to exit. + + +.. function:: isendwin() + + Returns true if :func:`endwin` has been called (that is, the curses library has + been deinitialized). + + +.. function:: keyname(k) + + Return the name of the key numbered *k*. The name of a key generating a printable + ASCII character is the key's character. The name of a control-key combination + is a two-character string consisting of a caret followed by the corresponding + printable ASCII character. The name of an alt-key combination (128-255) is a + string consisting of the prefix 'M-' followed by the name of the corresponding + ASCII character. + + +.. function:: killchar() + + Returns the user's current line kill character. Under Unix operating systems + this is a property of the controlling tty of the curses program, and is not set + by the curses library itself. + + +.. function:: longname() + + Returns a string containing the terminfo long name field describing the current + terminal. The maximum length of a verbose description is 128 characters. It is + defined only after the call to :func:`initscr`. + + +.. function:: meta(yes) + + If *yes* is 1, allow 8-bit characters to be input. If *yes* is 0, allow only + 7-bit chars. + + +.. 
function:: mouseinterval(interval) + + Sets the maximum time in milliseconds that can elapse between press and release + events in order for them to be recognized as a click, and returns the previous + interval value. The default value is 200 msec, or one fifth of a second. + + +.. function:: mousemask(mousemask) + + Sets the mouse events to be reported, and returns a tuple ``(availmask, + oldmask)``. *availmask* indicates which of the specified mouse events can be + reported; on complete failure it returns 0. *oldmask* is the previous value of + the given window's mouse event mask. If this function is never called, no mouse + events are ever reported. + + +.. function:: napms(ms) + + Sleep for *ms* milliseconds. + + +.. function:: newpad(nlines, ncols) + + Creates and returns a pointer to a new pad data structure with the given number + of lines and columns. A pad is returned as a window object. + + A pad is like a window, except that it is not restricted by the screen size, and + is not necessarily associated with a particular part of the screen. Pads can be + used when a large window is needed, and only a part of the window will be on the + screen at one time. Automatic refreshes of pads (such as from scrolling or + echoing of input) do not occur. The :meth:`refresh` and :meth:`noutrefresh` + methods of a pad require 6 arguments to specify the part of the pad to be + displayed and the location on the screen to be used for the display. The + arguments are pminrow, pmincol, sminrow, smincol, smaxrow, smaxcol; the p + arguments refer to the upper left corner of the pad region to be displayed and + the s arguments define a clipping box on the screen within which the pad region + is to be displayed. + + +.. function:: newwin([nlines, ncols,] begin_y, begin_x) + + Return a new window, whose left-upper corner is at ``(begin_y, begin_x)``, and + whose height/width is *nlines*/*ncols*. 
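A common use of the four-argument form of :func:`newwin` is to place a window of known size in the middle of the screen. The sketch below shows the arithmetic; both helper names are hypothetical, and ``new_centered_window`` assumes it is called after :func:`initscr` with the screen window as *stdscr*.

```python
import curses


def centered_origin(screen_lines, screen_cols, nlines, ncols):
    """Return (begin_y, begin_x) that centers an nlines x ncols window."""
    # Clamp at 0 so an oversized window still starts on the screen.
    begin_y = max((screen_lines - nlines) // 2, 0)
    begin_x = max((screen_cols - ncols) // 2, 0)
    return begin_y, begin_x


def new_centered_window(stdscr, nlines, ncols):
    """Create a centered window (requires an initialized curses screen)."""
    screen_lines, screen_cols = stdscr.getmaxyx()
    begin_y, begin_x = centered_origin(screen_lines, screen_cols, nlines, ncols)
    return curses.newwin(nlines, ncols, begin_y, begin_x)
```

On an 80x24 terminal, a 10-line by 40-column window would start at ``(7, 20)``.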
+ + By default, the window will extend from the specified position to the lower + right corner of the screen. + + +.. function:: nl() + + Enter newline mode. This mode translates the return key into newline on input, + and translates newline into return and line-feed on output. Newline mode is + initially on. + + +.. function:: nocbreak() + + Leave cbreak mode. Return to normal "cooked" mode with line buffering. + + +.. function:: noecho() + + Leave echo mode. Echoing of input characters is turned off. + + +.. function:: nonl() + + Leave newline mode. Disable translation of return into newline on input, and + disable low-level translation of newline into newline/return on output (but this + does not change the behavior of ``addch('\n')``, which always does the + equivalent of return and line feed on the virtual screen). With translation + off, curses can sometimes speed up vertical motion a little; also, it will be + able to detect the return key on input. + + +.. function:: noqiflush() + + When the noqiflush routine is used, normal flush of input and output queues + associated with the INTR, QUIT and SUSP characters will not be done. You may + want to call :func:`noqiflush` in a signal handler if you want output to + continue as though the interrupt had not occurred, after the handler exits. + + +.. function:: noraw() + + Leave raw mode. Return to normal "cooked" mode with line buffering. + + +.. function:: pair_content(pair_number) + + Returns a tuple ``(fg, bg)`` containing the colors for the requested color pair. + The value of *pair_number* must be between ``1`` and ``COLOR_PAIRS - 1``. + + +.. function:: pair_number(attr) + + Returns the number of the color-pair set by the attribute value *attr*. + :func:`color_pair` is the counterpart to this function. + + +.. function:: putp(string) + + Equivalent to ``tputs(str, 1, putchar)``; emits the value of a specified + terminfo capability for the current terminal. 
Note that the output of putp + always goes to standard output. + + +.. function:: qiflush( [flag] ) + + If *flag* is false, the effect is the same as calling :func:`noqiflush`. If + *flag* is true, or no argument is provided, the queues will be flushed when + these control characters are read. + + +.. function:: raw() + + Enter raw mode. In raw mode, normal line buffering and processing of + interrupt, quit, suspend, and flow control keys are turned off; characters are + presented to curses input functions one by one. + + +.. function:: reset_prog_mode() + + Restores the terminal to "program" mode, as previously saved by + :func:`def_prog_mode`. + + +.. function:: reset_shell_mode() + + Restores the terminal to "shell" mode, as previously saved by + :func:`def_shell_mode`. + + +.. function:: setsyx(y, x) + + Sets the virtual screen cursor to *y*, *x*. If *y* and *x* are both -1, then + leaveok is set. + + +.. function:: setupterm([termstr, fd]) + + Initializes the terminal. *termstr* is a string giving the terminal name; if + omitted, the value of the TERM environment variable will be used. *fd* is the + file descriptor to which any initialization sequences will be sent; if not + supplied, the file descriptor for ``sys.stdout`` will be used. + + +.. function:: start_color() + + Must be called if the programmer wants to use colors, and before any other color + manipulation routine is called. It is good practice to call this routine right + after :func:`initscr`. + + :func:`start_color` initializes eight basic colors (black, red, green, yellow, + blue, magenta, cyan, and white), and two global variables in the :mod:`curses` + module, :const:`COLORS` and :const:`COLOR_PAIRS`, containing the maximum number + of colors and color-pairs the terminal can support. It also restores the colors + on the terminal to the values they had when the terminal was just turned on. + + +.. function:: termattrs() + + Returns a logical OR of all video attributes supported by the terminal. 
This + information is useful when a curses program needs complete control over the + appearance of the screen. + + +.. function:: termname() + + Returns the value of the environment variable TERM, truncated to 14 characters. + + +.. function:: tigetflag(capname) + + Returns the value of the Boolean capability corresponding to the terminfo + capability name *capname*. The value ``-1`` is returned if *capname* is not a + Boolean capability, or ``0`` if it is canceled or absent from the terminal + description. + + +.. function:: tigetnum(capname) + + Returns the value of the numeric capability corresponding to the terminfo + capability name *capname*. The value ``-2`` is returned if *capname* is not a + numeric capability, or ``-1`` if it is canceled or absent from the terminal + description. + + +.. function:: tigetstr(capname) + + Returns the value of the string capability corresponding to the terminfo + capability name *capname*. ``None`` is returned if *capname* is not a string + capability, or is canceled or absent from the terminal description. + + +.. function:: tparm(str[,...]) + + Instantiates the string *str* with the supplied parameters, where *str* should + be a parameterized string obtained from the terminfo database. E.g. + ``tparm(tigetstr("cup"), 5, 3)`` could result in ``'\033[6;4H'``, the exact + result depending on terminal type. + + +.. function:: typeahead(fd) + + Specifies that the file descriptor *fd* be used for typeahead checking. If *fd* + is ``-1``, then no typeahead checking is done. + + The curses library does "line-breakout optimization" by looking for typeahead + periodically while updating the screen. If input is found, and it is coming + from a tty, the current update is postponed until refresh or doupdate is called + again, allowing faster response to commands typed in advance. This function + allows specifying a different file descriptor for typeahead checking. + + +.. 
function:: unctrl(ch) + + Returns a string which is a printable representation of the character *ch*. + Control characters are displayed as a caret followed by the character, for + example as ``^C``. Printing characters are left as they are. + + +.. function:: ungetch(ch) + + Push *ch* so the next :meth:`getch` will return it. + + .. note:: + + Only one *ch* can be pushed before :meth:`getch` is called. + + +.. function:: ungetmouse(id, x, y, z, bstate) + + Push a :const:`KEY_MOUSE` event onto the input queue, associating the given + state data with it. + + +.. function:: use_env(flag) + + If used, this function should be called before :func:`initscr` or newterm are + called. When *flag* is false, the values of lines and columns specified in the + terminfo database will be used, even if environment variables :envvar:`LINES` + and :envvar:`COLUMNS` (used by default) are set, or if curses is running in a + window (in which case default behavior would be to use the window size if + :envvar:`LINES` and :envvar:`COLUMNS` are not set). + + +.. function:: use_default_colors() + + Allow use of default values for colors on terminals supporting this feature. Use + this to support transparency in your application. The default color is assigned + to the color number -1. After calling this function, ``init_pair(x, + curses.COLOR_RED, -1)`` initializes, for instance, color pair *x* to a red + foreground color on the default background. + + +.. _curses-window-objects: + +Window Objects +-------------- + +Window objects, as returned by :func:`initscr` and :func:`newwin` above, have +the following methods: + + +.. method:: window.addch([y, x,] ch[, attr]) + + .. note:: + + A *character* means a C character (an ASCII code), rather than a Python + character (a string of length 1). (This note is true whenever the documentation + mentions a character.) The builtin :func:`ord` is handy for converting strings to + codes. 
+ + Paint character *ch* at ``(y, x)`` with attributes *attr*, overwriting any + character previously painted at that location. By default, the character + position and attributes are the current settings for the window object. + + +.. method:: window.addnstr([y, x,] str, n[, attr]) + + Paint at most *n* characters of the string *str* at ``(y, x)`` with attributes + *attr*, overwriting anything previously on the display. + + +.. method:: window.addstr([y, x,] str[, attr]) + + Paint the string *str* at ``(y, x)`` with attributes *attr*, overwriting + anything previously on the display. + + +.. method:: window.attroff(attr) + + Remove attribute *attr* from the "background" set applied to all writes to the + current window. + + +.. method:: window.attron(attr) + + Add attribute *attr* to the "background" set applied to all writes to the + current window. + + +.. method:: window.attrset(attr) + + Set the "background" set of attributes to *attr*. This set is initially 0 (no + attributes). + + +.. method:: window.bkgd(ch[, attr]) + + Sets the background property of the window to the character *ch*, with + attributes *attr*. The change is then applied to every character position in + that window: + + * The attribute of every character in the window is changed to the new + background attribute. + + * Wherever the former background character appears, it is changed to the new + background character. + + +.. method:: window.bkgdset(ch[, attr]) + + Sets the window's background. A window's background consists of a character and + any combination of attributes. The attribute part of the background is combined + (OR'ed) with all non-blank characters that are written into the window. Both + the character and attribute parts of the background are combined with the blank + characters. The background becomes a property of the character and moves with + the character through any scrolling and insert/delete line/character operations. + + +.. 
method:: window.border([ls[, rs[, ts[, bs[, tl[, tr[, bl[, br]]]]]]]]) + + Draw a border around the edges of the window. Each parameter specifies the + character to use for a specific part of the border; see the table below for more + details. The characters can be specified as integers or as one-character + strings. + + .. note:: + + A ``0`` value for any parameter will cause the default character to be used for + that parameter. Keyword parameters can *not* be used. The defaults are listed + in this table: + + +-----------+---------------------+-----------------------+ + | Parameter | Description | Default value | + +===========+=====================+=======================+ + | *ls* | Left side | :const:`ACS_VLINE` | + +-----------+---------------------+-----------------------+ + | *rs* | Right side | :const:`ACS_VLINE` | + +-----------+---------------------+-----------------------+ + | *ts* | Top | :const:`ACS_HLINE` | + +-----------+---------------------+-----------------------+ + | *bs* | Bottom | :const:`ACS_HLINE` | + +-----------+---------------------+-----------------------+ + | *tl* | Upper-left corner | :const:`ACS_ULCORNER` | + +-----------+---------------------+-----------------------+ + | *tr* | Upper-right corner | :const:`ACS_URCORNER` | + +-----------+---------------------+-----------------------+ + | *bl* | Bottom-left corner | :const:`ACS_LLCORNER` | + +-----------+---------------------+-----------------------+ + | *br* | Bottom-right corner | :const:`ACS_LRCORNER` | + +-----------+---------------------+-----------------------+ + + +.. method:: window.box([vertch, horch]) + + Similar to :meth:`border`, but both *ls* and *rs* are *vertch* and both *ts* and + *bs* are *horch*. The default corner characters are always used by this function. + + +.. method:: window.chgat([y, x, ] [num,] attr) + + Sets the attributes of *num* characters at the current cursor position, or at + position ``(y, x)`` if supplied.
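To illustrate the note above, the sketch below converts a one-character Python string to the integer code with :func:`ord` before handing it to :meth:`addch`. The helper names ``char_code`` and ``put_bold_char`` are hypothetical, and *win* is assumed to be any window object obtained from :func:`initscr` or :func:`newwin`.

```python
import curses


def char_code(char):
    """Convert a one-character string to the integer code curses expects."""
    if len(char) != 1:
        raise ValueError("expected a string of length 1")
    return ord(char)


def put_bold_char(win, y, x, char):
    """Paint char at (y, x) in bold on win (hypothetical helper)."""
    # A_BOLD is one of the A_* attribute constants; several attributes
    # can be combined with the bitwise OR operator.
    win.addch(y, x, char_code(char), curses.A_BOLD)
```

The validation in ``char_code`` is a design choice of this sketch: it turns an accidental multi-character string into an immediate, readable error rather than a failure inside the curses call.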
If no value of *num* is given or *num* = -1, + the attribute will be set on all the characters to the end of the line. This + function does not move the cursor. The changed line will be touched using the + :meth:`touchline` method so that the contents will be redisplayed by the next + window refresh. + + +.. method:: window.clear() + + Like :meth:`erase`, but also causes the whole window to be repainted upon next + call to :meth:`refresh`. + + +.. method:: window.clearok(yes) + + If *yes* is 1, the next call to :meth:`refresh` will clear the window + completely. + + +.. method:: window.clrtobot() + + Erase from cursor to the end of the window: all lines below the cursor are + deleted, and then the equivalent of :meth:`clrtoeol` is performed. + + +.. method:: window.clrtoeol() + + Erase from cursor to the end of the line. + + +.. method:: window.cursyncup() + + Updates the current cursor position of all the ancestors of the window to + reflect the current cursor position of the window. + + +.. method:: window.delch([y, x]) + + Delete any character at ``(y, x)``. + + +.. method:: window.deleteln() + + Delete the line under the cursor. All following lines are moved up by 1 line. + + +.. method:: window.derwin([nlines, ncols,] begin_y, begin_x) + + An abbreviation for "derive window", :meth:`derwin` is the same as calling + :meth:`subwin`, except that *begin_y* and *begin_x* are relative to the origin + of the window, rather than relative to the entire screen. Returns a window + object for the derived window. + + +.. method:: window.echochar(ch[, attr]) + + Add character *ch* with attribute *attr*, and immediately call :meth:`refresh` + on the window. + + +.. method:: window.enclose(y, x) + + Tests whether the given pair of screen-relative character-cell coordinates are + enclosed by the given window, returning true or false. It is useful for + determining what subset of the screen windows enclose the location of a mouse + event. + + +.. 
method:: window.erase() + + Clear the window. + + +.. method:: window.getbegyx() + + Return a tuple ``(y, x)`` of co-ordinates of upper-left corner. + + +.. method:: window.getch([y, x]) + + Get a character. Note that the integer returned does *not* have to be in ASCII + range: function keys, keypad keys and so on return numbers higher than 256. In + no-delay mode, -1 is returned if there is no input. + + +.. method:: window.getkey([y, x]) + + Get a character, returning a string instead of an integer, as :meth:`getch` + does. Function keys, keypad keys and so on return a multibyte string containing + the key name. In no-delay mode, an exception is raised if there is no input. + + +.. method:: window.getmaxyx() + + Return a tuple ``(y, x)`` of the height and width of the window. + + +.. method:: window.getparyx() + + Returns the beginning coordinates of this window relative to its parent window + into two integer variables y and x. Returns ``-1,-1`` if this window has no + parent. + + +.. method:: window.getstr([y, x]) + + Read a string from the user, with primitive line editing capacity. + + +.. method:: window.getyx() + + Return a tuple ``(y, x)`` of current cursor position relative to the window's + upper-left corner. + + +.. method:: window.hline([y, x,] ch, n) + + Display a horizontal line starting at ``(y, x)`` with length *n* consisting of + the character *ch*. + + +.. method:: window.idcok(flag) + + If *flag* is false, curses no longer considers using the hardware insert/delete + character feature of the terminal; if *flag* is true, use of character insertion + and deletion is enabled. When curses is first initialized, use of character + insert/delete is enabled by default. + + +.. method:: window.idlok(yes) + + If called with *yes* equal to 1, :mod:`curses` will try and use hardware line + editing facilities. Otherwise, line insertion/deletion are disabled. + + +.. 
method:: window.immedok(flag) + + If *flag* is true, any change in the window image automatically causes the + window to be refreshed; you no longer have to call :meth:`refresh` yourself. + However, it may degrade performance considerably, due to repeated calls to + wrefresh. This option is disabled by default. + + +.. method:: window.inch([y, x]) + + Return the character at the given position in the window. The bottom 8 bits are + the character proper, and upper bits are the attributes. + + +.. method:: window.insch([y, x,] ch[, attr]) + + Paint character *ch* at ``(y, x)`` with attributes *attr*, moving the line from + position *x* right by one character. + + +.. method:: window.insdelln(nlines) + + Inserts *nlines* lines into the specified window above the current line. The + *nlines* bottom lines are lost. For negative *nlines*, delete *nlines* lines + starting with the one under the cursor, and move the remaining lines up. The + bottom *nlines* lines are cleared. The current cursor position remains the + same. + + +.. method:: window.insertln() + + Insert a blank line under the cursor. All following lines are moved down by 1 + line. + + +.. method:: window.insnstr([y, x,] str, n [, attr]) + + Insert a character string (as many characters as will fit on the line) before + the character under the cursor, up to *n* characters. If *n* is zero or + negative, the entire string is inserted. All characters to the right of the + cursor are shifted right, with the rightmost characters on the line being lost. + The cursor position does not change (after moving to *y*, *x*, if specified). + + +.. method:: window.insstr([y, x, ] str [, attr]) + + Insert a character string (as many characters as will fit on the line) before + the character under the cursor. All characters to the right of the cursor are + shifted right, with the rightmost characters on the line being lost. The cursor + position does not change (after moving to *y*, *x*, if specified). + + +.. 
method:: window.instr([y, x] [, n]) + + Returns a string of characters, extracted from the window starting at the + current cursor position, or at *y*, *x* if specified. Attributes are stripped + from the characters. If *n* is specified, :meth:`instr` returns a string + at most *n* characters long (exclusive of the trailing NUL). + + +.. method:: window.is_linetouched(line) + + Returns true if the specified line was modified since the last call to + :meth:`refresh`; otherwise returns false. Raises a :exc:`curses.error` + exception if *line* is not valid for the given window. + + +.. method:: window.is_wintouched() + + Returns true if the specified window was modified since the last call to + :meth:`refresh`; otherwise returns false. + + +.. method:: window.keypad(yes) + + If *yes* is 1, escape sequences generated by some keys (keypad, function keys) + will be interpreted by :mod:`curses`. If *yes* is 0, escape sequences will be + left as is in the input stream. + + +.. method:: window.leaveok(yes) + + If *yes* is 1, cursor is left where it is on update, instead of being at "cursor + position." This reduces cursor movement where possible. If possible the cursor + will be made invisible. + + If *yes* is 0, cursor will always be at "cursor position" after an update. + + +.. method:: window.move(new_y, new_x) + + Move cursor to ``(new_y, new_x)``. + + +.. method:: window.mvderwin(y, x) + + Moves the window inside its parent window. The screen-relative parameters of + the window are not changed. This routine is used to display different parts of + the parent window at the same physical position on the screen. + + +.. method:: window.mvwin(new_y, new_x) + + Move the window so its upper-left corner is at ``(new_y, new_x)``. + + +.. method:: window.nodelay(yes) + + If *yes* is ``1``, :meth:`getch` will be non-blocking. + + +.. method:: window.notimeout(yes) + + If *yes* is ``1``, escape sequences will not be timed out.
+ + If *yes* is ``0``, after a few milliseconds, an escape sequence will not be + interpreted, and will be left in the input stream as is. + + +.. method:: window.noutrefresh() + + Mark for refresh but wait. This function updates the data structure + representing the desired state of the window, but does not force an update of + the physical screen. To accomplish that, call :func:`doupdate`. + + +.. method:: window.overlay(destwin[, sminrow, smincol, dminrow, dmincol, dmaxrow, dmaxcol]) + + Overlay the window on top of *destwin*. The windows need not be the same size, + in which case only the overlapping region is copied. This copy is non-destructive, + which means that the current background character does not overwrite the old + contents of *destwin*. + + To get fine-grained control over the copied region, the second form of + :meth:`overlay` can be used. *sminrow* and *smincol* are the upper-left + coordinates of the source window, and the other variables mark a rectangle in + the destination window. + + +.. method:: window.overwrite(destwin[, sminrow, smincol, dminrow, dmincol, dmaxrow, dmaxcol]) + + Overwrite the window on top of *destwin*. The windows need not be the same size, + in which case only the overlapping region is copied. This copy is destructive, + which means that the current background character overwrites the old contents of + *destwin*. + + To get fine-grained control over the copied region, the second form of + :meth:`overwrite` can be used. *sminrow* and *smincol* are the upper-left + coordinates of the source window, and the other variables mark a rectangle in the + destination window. + + +.. method:: window.putwin(file) + + Writes all data associated with the window into the provided file object. This + information can be later retrieved using the :func:`getwin` function. + + +..
method:: window.redrawln(beg, num) + + Indicates that the *num* screen lines, starting at line *beg*, are corrupted and + should be completely redrawn on the next :meth:`refresh` call. + + +.. method:: window.redrawwin() + + Touches the entire window, causing it to be completely redrawn on the next + :meth:`refresh` call. + + +.. method:: window.refresh([pminrow, pmincol, sminrow, smincol, smaxrow, smaxcol]) + + Update the display immediately (sync actual screen with previous + drawing/deleting methods). + + The 6 optional arguments can only be specified when the window is a pad created + with :func:`newpad`. The additional parameters are needed to indicate what part + of the pad and screen are involved. *pminrow* and *pmincol* specify the upper + left-hand corner of the rectangle to be displayed in the pad. *sminrow*, + *smincol*, *smaxrow*, and *smaxcol* specify the edges of the rectangle to be + displayed on the screen. The lower right-hand corner of the rectangle to be + displayed in the pad is calculated from the screen coordinates, since the + rectangles must be the same size. Both rectangles must be entirely contained + within their respective structures. Negative values of *pminrow*, *pmincol*, + *sminrow*, or *smincol* are treated as if they were zero. + + +.. method:: window.scroll([lines=1]) + + Scroll the screen or scrolling region upward by *lines* lines. + + +.. method:: window.scrollok(flag) + + Controls what happens when the cursor of a window is moved off the edge of the + window or scrolling region, either as a result of a newline action on the bottom + line, or typing the last character of the last line. If *flag* is false, the + cursor is left on the bottom line. If *flag* is true, the window is scrolled up + one line. Note that in order to get the physical scrolling effect on the + terminal, it is also necessary to call :meth:`idlok`. + + +.. method:: window.setscrreg(top, bottom) + + Set the scrolling region from line *top* to line *bottom*. 
All scrolling actions + will take place in this region. + + +.. method:: window.standend() + + Turn off the standout attribute. On some terminals this has the side effect of + turning off all attributes. + + +.. method:: window.standout() + + Turn on attribute *A_STANDOUT*. + + +.. method:: window.subpad([nlines, ncols,] begin_y, begin_x) + + Return a sub-window, whose upper-left corner is at ``(begin_y, begin_x)``, and + whose width/height is *ncols*/*nlines*. + + +.. method:: window.subwin([nlines, ncols,] begin_y, begin_x) + + Return a sub-window, whose upper-left corner is at ``(begin_y, begin_x)``, and + whose width/height is *ncols*/*nlines*. + + By default, the sub-window will extend from the specified position to the lower + right corner of the window. + + +.. method:: window.syncdown() + + Touches each location in the window that has been touched in any of its ancestor + windows. This routine is called by :meth:`refresh`, so it should almost never + be necessary to call it manually. + + +.. method:: window.syncok(flag) + + If called with *flag* set to true, then :meth:`syncup` is called automatically + whenever there is a change in the window. + + +.. method:: window.syncup() + + Touches all locations in ancestors of the window that have been changed in the + window. + + +.. method:: window.timeout(delay) + + Sets blocking or non-blocking read behavior for the window. If *delay* is + negative, blocking read is used (which will wait indefinitely for input). If + *delay* is zero, then non-blocking read is used, and -1 will be returned by + :meth:`getch` if no input is waiting. If *delay* is positive, then + :meth:`getch` will block for *delay* milliseconds, and return -1 if there is + still no input at the end of that time. + + +.. method:: window.touchline(start, count[, changed]) + + Pretend *count* lines have been changed, starting with line *start*. 
If + *changed* is supplied, it specifies whether the affected lines are marked as + having been changed (*changed*\ =1) or unchanged (*changed*\ =0). + + +.. method:: window.touchwin() + + Pretend the whole window has been changed, for purposes of drawing + optimizations. + + +.. method:: window.untouchwin() + + Marks all lines in the window as unchanged since the last call to + :meth:`refresh`. + + +.. method:: window.vline([y, x,] ch, n) + + Display a vertical line starting at ``(y, x)`` with length *n* consisting of the + character *ch*. + + +Constants +--------- + +The :mod:`curses` module defines the following data members: + + +.. data:: ERR + + Some curses routines that return an integer, such as :func:`getch`, return + :const:`ERR` upon failure. + + +.. data:: OK + + Some curses routines that return an integer, such as :func:`napms`, return + :const:`OK` upon success. + + +.. data:: version + + A string representing the current version of the module. Also available as + :const:`__version__`. + +Several constants are available to specify character cell attributes: + ++------------------+-------------------------------+ +| Attribute | Meaning | ++==================+===============================+ +| ``A_ALTCHARSET`` | Alternate character set mode. | ++------------------+-------------------------------+ +| ``A_BLINK`` | Blink mode. | ++------------------+-------------------------------+ +| ``A_BOLD`` | Bold mode. | ++------------------+-------------------------------+ +| ``A_DIM`` | Dim mode. | ++------------------+-------------------------------+ +| ``A_NORMAL`` | Normal attribute. | ++------------------+-------------------------------+ +| ``A_STANDOUT`` | Standout mode. | ++------------------+-------------------------------+ +| ``A_UNDERLINE`` | Underline mode. | ++------------------+-------------------------------+ + +Keys are referred to by integer constants with names starting with ``KEY_``. +The exact keycaps available are system dependent. + +.. 
% XXX this table is far too large! +.. % XXX should this table be alphabetized? + ++-------------------+--------------------------------------------+ +| Key constant | Key | ++===================+============================================+ +| ``KEY_MIN`` | Minimum key value | ++-------------------+--------------------------------------------+ +| ``KEY_BREAK`` | Break key (unreliable) | ++-------------------+--------------------------------------------+ +| ``KEY_DOWN`` | Down-arrow | ++-------------------+--------------------------------------------+ +| ``KEY_UP`` | Up-arrow | ++-------------------+--------------------------------------------+ +| ``KEY_LEFT`` | Left-arrow | ++-------------------+--------------------------------------------+ +| ``KEY_RIGHT`` | Right-arrow | ++-------------------+--------------------------------------------+ +| ``KEY_HOME`` | Home key (upward+left arrow) | ++-------------------+--------------------------------------------+ +| ``KEY_BACKSPACE`` | Backspace (unreliable) | ++-------------------+--------------------------------------------+ +| ``KEY_F0`` | Function keys. Up to 64 function keys are | +| | supported. 
| ++-------------------+--------------------------------------------+ +| ``KEY_Fn`` | Value of function key *n* | ++-------------------+--------------------------------------------+ +| ``KEY_DL`` | Delete line | ++-------------------+--------------------------------------------+ +| ``KEY_IL`` | Insert line | ++-------------------+--------------------------------------------+ +| ``KEY_DC`` | Delete character | ++-------------------+--------------------------------------------+ +| ``KEY_IC`` | Insert char or enter insert mode | ++-------------------+--------------------------------------------+ +| ``KEY_EIC`` | Exit insert char mode | ++-------------------+--------------------------------------------+ +| ``KEY_CLEAR`` | Clear screen | ++-------------------+--------------------------------------------+ +| ``KEY_EOS`` | Clear to end of screen | ++-------------------+--------------------------------------------+ +| ``KEY_EOL`` | Clear to end of line | ++-------------------+--------------------------------------------+ +| ``KEY_SF`` | Scroll 1 line forward | ++-------------------+--------------------------------------------+ +| ``KEY_SR`` | Scroll 1 line backward (reverse) | ++-------------------+--------------------------------------------+ +| ``KEY_NPAGE`` | Next page | ++-------------------+--------------------------------------------+ +| ``KEY_PPAGE`` | Previous page | ++-------------------+--------------------------------------------+ +| ``KEY_STAB`` | Set tab | ++-------------------+--------------------------------------------+ +| ``KEY_CTAB`` | Clear tab | ++-------------------+--------------------------------------------+ +| ``KEY_CATAB`` | Clear all tabs | ++-------------------+--------------------------------------------+ +| ``KEY_ENTER`` | Enter or send (unreliable) | ++-------------------+--------------------------------------------+ +| ``KEY_SRESET`` | Soft (partial) reset (unreliable) | ++-------------------+--------------------------------------------+ +| 
``KEY_RESET`` | Reset or hard reset (unreliable) | ++-------------------+--------------------------------------------+ +| ``KEY_PRINT`` | Print | ++-------------------+--------------------------------------------+ +| ``KEY_LL`` | Home down or bottom (lower left) | ++-------------------+--------------------------------------------+ +| ``KEY_A1`` | Upper left of keypad | ++-------------------+--------------------------------------------+ +| ``KEY_A3`` | Upper right of keypad | ++-------------------+--------------------------------------------+ +| ``KEY_B2`` | Center of keypad | ++-------------------+--------------------------------------------+ +| ``KEY_C1`` | Lower left of keypad | ++-------------------+--------------------------------------------+ +| ``KEY_C3`` | Lower right of keypad | ++-------------------+--------------------------------------------+ +| ``KEY_BTAB`` | Back tab | ++-------------------+--------------------------------------------+ +| ``KEY_BEG`` | Beg (beginning) | ++-------------------+--------------------------------------------+ +| ``KEY_CANCEL`` | Cancel | ++-------------------+--------------------------------------------+ +| ``KEY_CLOSE`` | Close | ++-------------------+--------------------------------------------+ +| ``KEY_COMMAND`` | Cmd (command) | ++-------------------+--------------------------------------------+ +| ``KEY_COPY`` | Copy | ++-------------------+--------------------------------------------+ +| ``KEY_CREATE`` | Create | ++-------------------+--------------------------------------------+ +| ``KEY_END`` | End | ++-------------------+--------------------------------------------+ +| ``KEY_EXIT`` | Exit | ++-------------------+--------------------------------------------+ +| ``KEY_FIND`` | Find | ++-------------------+--------------------------------------------+ +| ``KEY_HELP`` | Help | ++-------------------+--------------------------------------------+ +| ``KEY_MARK`` | Mark | 
++-------------------+--------------------------------------------+ +| ``KEY_MESSAGE`` | Message | ++-------------------+--------------------------------------------+ +| ``KEY_MOVE`` | Move | ++-------------------+--------------------------------------------+ +| ``KEY_NEXT`` | Next | ++-------------------+--------------------------------------------+ +| ``KEY_OPEN`` | Open | ++-------------------+--------------------------------------------+ +| ``KEY_OPTIONS`` | Options | ++-------------------+--------------------------------------------+ +| ``KEY_PREVIOUS`` | Prev (previous) | ++-------------------+--------------------------------------------+ +| ``KEY_REDO`` | Redo | ++-------------------+--------------------------------------------+ +| ``KEY_REFERENCE`` | Ref (reference) | ++-------------------+--------------------------------------------+ +| ``KEY_REFRESH`` | Refresh | ++-------------------+--------------------------------------------+ +| ``KEY_REPLACE`` | Replace | ++-------------------+--------------------------------------------+ +| ``KEY_RESTART`` | Restart | ++-------------------+--------------------------------------------+ +| ``KEY_RESUME`` | Resume | ++-------------------+--------------------------------------------+ +| ``KEY_SAVE`` | Save | ++-------------------+--------------------------------------------+ +| ``KEY_SBEG`` | Shifted Beg (beginning) | ++-------------------+--------------------------------------------+ +| ``KEY_SCANCEL`` | Shifted Cancel | ++-------------------+--------------------------------------------+ +| ``KEY_SCOMMAND`` | Shifted Command | ++-------------------+--------------------------------------------+ +| ``KEY_SCOPY`` | Shifted Copy | ++-------------------+--------------------------------------------+ +| ``KEY_SCREATE`` | Shifted Create | ++-------------------+--------------------------------------------+ +| ``KEY_SDC`` | Shifted Delete char | ++-------------------+--------------------------------------------+ +| ``KEY_SDL`` | 
Shifted Delete line | ++-------------------+--------------------------------------------+ +| ``KEY_SELECT`` | Select | ++-------------------+--------------------------------------------+ +| ``KEY_SEND`` | Shifted End | ++-------------------+--------------------------------------------+ +| ``KEY_SEOL`` | Shifted Clear line | ++-------------------+--------------------------------------------+ +| ``KEY_SEXIT`` | Shifted Exit | ++-------------------+--------------------------------------------+ +| ``KEY_SFIND`` | Shifted Find | ++-------------------+--------------------------------------------+ +| ``KEY_SHELP`` | Shifted Help | ++-------------------+--------------------------------------------+ +| ``KEY_SHOME`` | Shifted Home | ++-------------------+--------------------------------------------+ +| ``KEY_SIC`` | Shifted Input | ++-------------------+--------------------------------------------+ +| ``KEY_SLEFT`` | Shifted Left arrow | ++-------------------+--------------------------------------------+ +| ``KEY_SMESSAGE`` | Shifted Message | ++-------------------+--------------------------------------------+ +| ``KEY_SMOVE`` | Shifted Move | ++-------------------+--------------------------------------------+ +| ``KEY_SNEXT`` | Shifted Next | ++-------------------+--------------------------------------------+ +| ``KEY_SOPTIONS`` | Shifted Options | ++-------------------+--------------------------------------------+ +| ``KEY_SPREVIOUS`` | Shifted Prev | ++-------------------+--------------------------------------------+ +| ``KEY_SPRINT`` | Shifted Print | ++-------------------+--------------------------------------------+ +| ``KEY_SREDO`` | Shifted Redo | ++-------------------+--------------------------------------------+ +| ``KEY_SREPLACE`` | Shifted Replace | ++-------------------+--------------------------------------------+ +| ``KEY_SRIGHT`` | Shifted Right arrow | ++-------------------+--------------------------------------------+ +| ``KEY_SRSUME`` | Shifted Resume |
++-------------------+--------------------------------------------+ +| ``KEY_SSAVE`` | Shifted Save | ++-------------------+--------------------------------------------+ +| ``KEY_SSUSPEND`` | Shifted Suspend | ++-------------------+--------------------------------------------+ +| ``KEY_SUNDO`` | Shifted Undo | ++-------------------+--------------------------------------------+ +| ``KEY_SUSPEND`` | Suspend | ++-------------------+--------------------------------------------+ +| ``KEY_UNDO`` | Undo | ++-------------------+--------------------------------------------+ +| ``KEY_MOUSE`` | Mouse event has occurred | ++-------------------+--------------------------------------------+ +| ``KEY_RESIZE`` | Terminal resize event | ++-------------------+--------------------------------------------+ +| ``KEY_MAX`` | Maximum key value | ++-------------------+--------------------------------------------+ + +On VT100s and their software emulations, such as X terminal emulators, there are +normally at least four function keys (:const:`KEY_F1`, :const:`KEY_F2`, +:const:`KEY_F3`, :const:`KEY_F4`) available, and the arrow keys mapped to +:const:`KEY_UP`, :const:`KEY_DOWN`, :const:`KEY_LEFT` and :const:`KEY_RIGHT` in +the obvious way. If your machine has a PC keyboard, it is safe to expect arrow +keys and twelve function keys (older PC keyboards may have only ten function +keys); also, the following keypad mappings are standard: + ++------------------+-----------+ +| Keycap | Constant | ++==================+===========+ +| :kbd:`Insert` | KEY_IC | ++------------------+-----------+ +| :kbd:`Delete` | KEY_DC | ++------------------+-----------+ +| :kbd:`Home` | KEY_HOME | ++------------------+-----------+ +| :kbd:`End` | KEY_END | ++------------------+-----------+ +| :kbd:`Page Up` | KEY_PPAGE | ++------------------+-----------+ +| :kbd:`Page Down` | KEY_NPAGE | ++------------------+-----------+ + +The following table lists characters from the alternate character set.
These are +inherited from the VT100 terminal, and will generally be available on software +emulations such as X terminals. When there is no graphic available, curses +falls back on a crude printable ASCII approximation. + +.. note:: + + These are available only after :func:`initscr` has been called. + ++------------------+------------------------------------------+ +| ACS code | Meaning | ++==================+==========================================+ +| ``ACS_BBSS`` | alternate name for upper right corner | ++------------------+------------------------------------------+ +| ``ACS_BLOCK`` | solid square block | ++------------------+------------------------------------------+ +| ``ACS_BOARD`` | board of squares | ++------------------+------------------------------------------+ +| ``ACS_BSBS`` | alternate name for horizontal line | ++------------------+------------------------------------------+ +| ``ACS_BSSB`` | alternate name for upper left corner | ++------------------+------------------------------------------+ +| ``ACS_BSSS`` | alternate name for top tee | ++------------------+------------------------------------------+ +| ``ACS_BTEE`` | bottom tee | ++------------------+------------------------------------------+ +| ``ACS_BULLET`` | bullet | ++------------------+------------------------------------------+ +| ``ACS_CKBOARD`` | checker board (stipple) | ++------------------+------------------------------------------+ +| ``ACS_DARROW`` | arrow pointing down | ++------------------+------------------------------------------+ +| ``ACS_DEGREE`` | degree symbol | ++------------------+------------------------------------------+ +| ``ACS_DIAMOND`` | diamond | ++------------------+------------------------------------------+ +| ``ACS_GEQUAL`` | greater-than-or-equal-to | ++------------------+------------------------------------------+ +| ``ACS_HLINE`` | horizontal line | ++------------------+------------------------------------------+ +| ``ACS_LANTERN`` | lantern symbol | 
++------------------+------------------------------------------+ +| ``ACS_LARROW`` | left arrow | ++------------------+------------------------------------------+ +| ``ACS_LEQUAL`` | less-than-or-equal-to | ++------------------+------------------------------------------+ +| ``ACS_LLCORNER`` | lower left-hand corner | ++------------------+------------------------------------------+ +| ``ACS_LRCORNER`` | lower right-hand corner | ++------------------+------------------------------------------+ +| ``ACS_LTEE`` | left tee | ++------------------+------------------------------------------+ +| ``ACS_NEQUAL`` | not-equal sign | ++------------------+------------------------------------------+ +| ``ACS_PI`` | letter pi | ++------------------+------------------------------------------+ +| ``ACS_PLMINUS`` | plus-or-minus sign | ++------------------+------------------------------------------+ +| ``ACS_PLUS`` | big plus sign | ++------------------+------------------------------------------+ +| ``ACS_RARROW`` | right arrow | ++------------------+------------------------------------------+ +| ``ACS_RTEE`` | right tee | ++------------------+------------------------------------------+ +| ``ACS_S1`` | scan line 1 | ++------------------+------------------------------------------+ +| ``ACS_S3`` | scan line 3 | ++------------------+------------------------------------------+ +| ``ACS_S7`` | scan line 7 | ++------------------+------------------------------------------+ +| ``ACS_S9`` | scan line 9 | ++------------------+------------------------------------------+ +| ``ACS_SBBS`` | alternate name for lower right corner | ++------------------+------------------------------------------+ +| ``ACS_SBSB`` | alternate name for vertical line | ++------------------+------------------------------------------+ +| ``ACS_SBSS`` | alternate name for right tee | ++------------------+------------------------------------------+ +| ``ACS_SSBB`` | alternate name for lower left corner | 
++------------------+------------------------------------------+ +| ``ACS_SSBS`` | alternate name for bottom tee | ++------------------+------------------------------------------+ +| ``ACS_SSSB`` | alternate name for left tee | ++------------------+------------------------------------------+ +| ``ACS_SSSS`` | alternate name for crossover or big plus | ++------------------+------------------------------------------+ +| ``ACS_STERLING`` | pound sterling | ++------------------+------------------------------------------+ +| ``ACS_TTEE`` | top tee | ++------------------+------------------------------------------+ +| ``ACS_UARROW`` | up arrow | ++------------------+------------------------------------------+ +| ``ACS_ULCORNER`` | upper left corner | ++------------------+------------------------------------------+ +| ``ACS_URCORNER`` | upper right corner | ++------------------+------------------------------------------+ +| ``ACS_VLINE`` | vertical line | ++------------------+------------------------------------------+ + +The following table lists the predefined colors: + ++-------------------+----------------------------+ +| Constant | Color | ++===================+============================+ +| ``COLOR_BLACK`` | Black | ++-------------------+----------------------------+ +| ``COLOR_BLUE`` | Blue | ++-------------------+----------------------------+ +| ``COLOR_CYAN`` | Cyan (light greenish blue) | ++-------------------+----------------------------+ +| ``COLOR_GREEN`` | Green | ++-------------------+----------------------------+ +| ``COLOR_MAGENTA`` | Magenta (purplish red) | ++-------------------+----------------------------+ +| ``COLOR_RED`` | Red | ++-------------------+----------------------------+ +| ``COLOR_WHITE`` | White | ++-------------------+----------------------------+ +| ``COLOR_YELLOW`` | Yellow | ++-------------------+----------------------------+ + + +:mod:`curses.textpad` --- Text input widget for curses programs 
+=============================================================== + +.. module:: curses.textpad + :synopsis: Emacs-like input editing in a curses window. +.. moduleauthor:: Eric Raymond <esr@thyrsus.com> +.. sectionauthor:: Eric Raymond <esr@thyrsus.com> + + +.. versionadded:: 1.6 + +The :mod:`curses.textpad` module provides a :class:`Textbox` class that handles +elementary text editing in a curses window, supporting a set of keybindings +resembling those of Emacs (thus, also of Netscape Navigator, BBedit 6.x, +FrameMaker, and many other programs). The module also provides a +rectangle-drawing function useful for framing text boxes or for other purposes. + +The module :mod:`curses.textpad` defines the following function: + + +.. function:: rectangle(win, uly, ulx, lry, lrx) + + Draw a rectangle. The first argument must be a window object; the remaining + arguments are coordinates relative to that window. The second and third + arguments are the y and x coordinates of the upper left hand corner of the + rectangle to be drawn; the fourth and fifth arguments are the y and x + coordinates of the lower right hand corner. The rectangle will be drawn using + VT100/IBM PC forms characters on terminals that make this possible (including + xterm and most other software terminal emulators). Otherwise it will be drawn + with ASCII dashes, vertical bars, and plus signs. + + +.. _curses-textpad-objects: + +Textbox objects +--------------- + +You can instantiate a :class:`Textbox` object as follows: + + +.. class:: Textbox(win) + + Return a textbox widget object. The *win* argument should be a curses + :class:`WindowObject` in which the textbox is to be contained. The edit cursor + of the textbox is initially located at the upper left hand corner of the + containing window, with coordinates ``(0, 0)``. The instance's + :attr:`stripspaces` flag is initially on. + +:class:`Textbox` objects have the following methods: + + +.. 
method:: Textbox.edit([validator]) + + This is the entry point you will normally use. It accepts editing keystrokes + until one of the termination keystrokes is entered. If *validator* is supplied, + it must be a function. It will be called for each keystroke entered with the + keystroke as a parameter; command dispatch is done on the result. This method + returns the window contents as a string; whether blanks in the window are + included is affected by the :attr:`stripspaces` member. + + +.. method:: Textbox.do_command(ch) + + Process a single command keystroke. Here are the supported special keystrokes: + + +------------------+-------------------------------------------+ + | Keystroke | Action | + +==================+===========================================+ + | :kbd:`Control-A` | Go to left edge of window. | + +------------------+-------------------------------------------+ + | :kbd:`Control-B` | Cursor left, wrapping to previous line if | + | | appropriate. | + +------------------+-------------------------------------------+ + | :kbd:`Control-D` | Delete character under cursor. | + +------------------+-------------------------------------------+ + | :kbd:`Control-E` | Go to right edge (stripspaces off) or end | + | | of line (stripspaces on). | + +------------------+-------------------------------------------+ + | :kbd:`Control-F` | Cursor right, wrapping to next line when | + | | appropriate. | + +------------------+-------------------------------------------+ + | :kbd:`Control-G` | Terminate, returning the window contents. | + +------------------+-------------------------------------------+ + | :kbd:`Control-H` | Delete character backward. | + +------------------+-------------------------------------------+ + | :kbd:`Control-J` | Terminate if the window is 1 line, | + | | otherwise insert newline. | + +------------------+-------------------------------------------+ + | :kbd:`Control-K` | If line is blank, delete it, otherwise | + | | clear to end of line. 
| + +------------------+-------------------------------------------+ + | :kbd:`Control-L` | Refresh screen. | + +------------------+-------------------------------------------+ + | :kbd:`Control-N` | Cursor down; move down one line. | + +------------------+-------------------------------------------+ + | :kbd:`Control-O` | Insert a blank line at cursor location. | + +------------------+-------------------------------------------+ + | :kbd:`Control-P` | Cursor up; move up one line. | + +------------------+-------------------------------------------+ + + Move operations do nothing if the cursor is at an edge where the movement is not + possible. The following synonyms are supported where possible: + + +------------------------+------------------+ + | Constant | Keystroke | + +========================+==================+ + | :const:`KEY_LEFT` | :kbd:`Control-B` | + +------------------------+------------------+ + | :const:`KEY_RIGHT` | :kbd:`Control-F` | + +------------------------+------------------+ + | :const:`KEY_UP` | :kbd:`Control-P` | + +------------------------+------------------+ + | :const:`KEY_DOWN` | :kbd:`Control-N` | + +------------------------+------------------+ + | :const:`KEY_BACKSPACE` | :kbd:`Control-H` | + +------------------------+------------------+ + + All other keystrokes are treated as a command to insert the given character and + move right (with line wrapping). + + +.. method:: Textbox.gather() + + This method returns the window contents as a string; whether blanks in the + window are included is affected by the :attr:`stripspaces` member. + + +.. attribute:: Textbox.stripspaces + + This data member is a flag which controls the interpretation of blanks in the + window. When it is on, trailing blanks on each line are ignored; any cursor + motion that would land the cursor on a trailing blank goes to the end of that + line instead, and trailing blanks are stripped when the window contents are + gathered.
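The :class:`Textbox` pieces described above might be combined roughly as follows. This is only a sketch under stated assumptions: the names ``enter_terminates`` and ``prompt``, the window size, and the prompt text are illustrative and not part of the module. The validator maps Enter to :kbd:`Control-G`, so that a multi-line box terminates on Enter instead of inserting a newline.

```python
import curses
import curses.ascii
from curses.textpad import Textbox, rectangle


def enter_terminates(ch):
    """Validator for Textbox.edit(): treat Enter like Control-G (terminate).

    Hypothetical helper, not provided by curses.textpad.
    """
    if ch in (curses.ascii.NL, curses.ascii.CR):
        return curses.ascii.BEL  # BEL is the code sent by Control-G
    return ch


def prompt(stdscr):
    """Frame a 5x30 edit window, collect text, and return it as a string."""
    stdscr.addstr(0, 0, "Enter a message (Enter or Ctrl-G to finish):")

    # The edit window sits just inside the frame drawn on the main window.
    editwin = curses.newwin(5, 30, 2, 1)
    rectangle(stdscr, 1, 0, 1 + 5 + 1, 1 + 30 + 1)
    stdscr.refresh()

    box = Textbox(editwin)
    # edit() calls the validator for every keystroke before dispatching it.
    return box.edit(enter_terminates)
```

On a real terminal this would typically be run as ``message = curses.wrapper(prompt)``, which takes care of initializing and restoring the terminal around the call.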
+ + +:mod:`curses.wrapper` --- Terminal handler for curses programs +============================================================== + +.. module:: curses.wrapper + :synopsis: Terminal configuration wrapper for curses programs. +.. moduleauthor:: Eric Raymond <esr@thyrsus.com> +.. sectionauthor:: Eric Raymond <esr@thyrsus.com> + + +.. versionadded:: 1.6 + +This module supplies one function, :func:`wrapper`, which runs another function +which should be the rest of your curses-using application. If the application +raises an exception, :func:`wrapper` will restore the terminal to a sane state +before re-raising the exception and generating a traceback. + + +.. function:: wrapper(func, ...) + + Wrapper function that initializes curses and calls another function, *func*, + restoring normal keyboard/screen behavior on error. The callable object *func* + is then passed the main window 'stdscr' as its first argument, followed by any + other arguments passed to :func:`wrapper`. + +Before calling the hook function, :func:`wrapper` turns on cbreak mode, turns +off echo, enables the terminal keypad, and initializes colors if the terminal +has color support. On exit (whether normally or by exception) it restores +cooked mode, turns on echo, and disables the terminal keypad. + diff --git a/Doc/library/custominterp.rst b/Doc/library/custominterp.rst new file mode 100644 index 0000000..2a9f0a4 --- /dev/null +++ b/Doc/library/custominterp.rst @@ -0,0 +1,20 @@ + +.. _custominterp: + +************************** +Custom Python Interpreters +************************** + +The modules described in this chapter allow writing interfaces similar to +Python's interactive interpreter. If you want a Python interpreter that +supports some special feature in addition to the Python language, you should +look at the :mod:`code` module. (The :mod:`codeop` module is lower-level, used +to support compiling a possibly-incomplete chunk of Python code.) 
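To illustrate what "compiling a possibly-incomplete chunk" means in practice, here is a small sketch built on :func:`codeop.compile_command`. The helper name ``classify`` and its three labels are assumptions made for this example; only ``compile_command`` itself comes from the module.

```python
import codeop


def classify(source):
    """Label a chunk of source as 'complete', 'incomplete', or 'invalid'.

    Hypothetical helper for illustration only.
    """
    try:
        code_obj = codeop.compile_command(source)
    except (SyntaxError, ValueError, OverflowError):
        # The text can never become valid, no matter what is typed next.
        return "invalid"
    # compile_command() returns None when more input is still needed.
    return "incomplete" if code_obj is None else "complete"
```

An interactive front end could use such a check to decide whether to show a continuation prompt or to execute what has been typed so far.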
+ +The full list of modules described in this chapter is: + + +.. toctree:: + + code.rst + codeop.rst diff --git a/Doc/library/datatypes.rst b/Doc/library/datatypes.rst new file mode 100644 index 0000000..4cd042d --- /dev/null +++ b/Doc/library/datatypes.rst @@ -0,0 +1,37 @@ + +.. _datatypes: + +********** +Data Types +********** + +The modules described in this chapter provide a variety of specialized data +types such as dates and times, fixed-type arrays, heap queues, synchronized +queues, and sets. + +Python also provides some built-in data types, in particular, +:class:`dict`, :class:`list`, :class:`set` and :class:`frozenset`, and +:class:`tuple`. The :class:`str` class can be used to handle binary data +and 8-bit text, and the :class:`unicode` class to handle Unicode text. + +The following modules are documented in this chapter: + + +.. toctree:: + + datetime.rst + calendar.rst + collections.rst + heapq.rst + bisect.rst + array.rst + sched.rst + mutex.rst + queue.rst + weakref.rst + userdict.rst + types.rst + new.rst + copy.rst + pprint.rst + repr.rst diff --git a/Doc/library/datetime.rst b/Doc/library/datetime.rst new file mode 100644 index 0000000..24d4f69 --- /dev/null +++ b/Doc/library/datetime.rst @@ -0,0 +1,1348 @@ +.. % XXX what order should the types be discussed in? + + +:mod:`datetime` --- Basic date and time types +============================================= + +.. module:: datetime + :synopsis: Basic date and time types. +.. moduleauthor:: Tim Peters <tim@zope.com> +.. sectionauthor:: Tim Peters <tim@zope.com> +.. sectionauthor:: A.M. Kuchling <amk@amk.ca> + + +.. versionadded:: 2.3 + +The :mod:`datetime` module supplies classes for manipulating dates and times in +both simple and complex ways. While date and time arithmetic is supported, the +focus of the implementation is on efficient member extraction for output +formatting and manipulation. For related +functionality, see also the :mod:`time` and :mod:`calendar` modules. 
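A minimal sketch of the two uses mentioned above, arithmetic and member extraction. The specific dates are arbitrary values chosen for illustration.

```python
from datetime import date, datetime, timedelta

start = datetime(2002, 12, 25, 20, 30)

# Arithmetic: adding a duration yields a new datetime object.
later = start + timedelta(days=7, hours=4)

# Member extraction: individual fields are plain attributes.
print(later.year, later.month, later.day, later.hour, later.minute)

# Subtracting two dates yields a timedelta.
gap = date(2003, 1, 1) - date(2002, 12, 25)
print(gap.days)
```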
+ +There are two kinds of date and time objects: "naive" and "aware". This +distinction refers to whether the object has any notion of time zone, daylight +saving time, or other kind of algorithmic or political time adjustment. Whether +a naive :class:`datetime` object represents Coordinated Universal Time (UTC), +local time, or time in some other timezone is purely up to the program, just +like it's up to the program whether a particular number represents metres, +miles, or mass. Naive :class:`datetime` objects are easy to understand and to +work with, at the cost of ignoring some aspects of reality. + +For applications requiring more, :class:`datetime` and :class:`time` objects +have an optional time zone information member, :attr:`tzinfo`, that can contain +an instance of a subclass of the abstract :class:`tzinfo` class. These +:class:`tzinfo` objects capture information about the offset from UTC time, the +time zone name, and whether Daylight Saving Time is in effect. Note that no +concrete :class:`tzinfo` classes are supplied by the :mod:`datetime` module. +Supporting timezones at whatever level of detail is required is up to the +application. The rules for time adjustment across the world are more political +than rational, and there is no standard suitable for every application. + +The :mod:`datetime` module exports the following constants: + + +.. data:: MINYEAR + + The smallest year number allowed in a :class:`date` or :class:`datetime` object. + :const:`MINYEAR` is ``1``. + + +.. data:: MAXYEAR + + The largest year number allowed in a :class:`date` or :class:`datetime` object. + :const:`MAXYEAR` is ``9999``. + + +.. seealso:: + + Module :mod:`calendar` + General calendar related functions. + + Module :mod:`time` + Time access and conversions. + + +Available Types +--------------- + + +.. class:: date + + An idealized naive date, assuming the current Gregorian calendar always was, and + always will be, in effect. 
Attributes: :attr:`year`, :attr:`month`, and + :attr:`day`. + + +.. class:: time + + An idealized time, independent of any particular day, assuming that every day + has exactly 24\*60\*60 seconds (there is no notion of "leap seconds" here). + Attributes: :attr:`hour`, :attr:`minute`, :attr:`second`, :attr:`microsecond`, + and :attr:`tzinfo`. + + +.. class:: datetime + + A combination of a date and a time. Attributes: :attr:`year`, :attr:`month`, + :attr:`day`, :attr:`hour`, :attr:`minute`, :attr:`second`, :attr:`microsecond`, + and :attr:`tzinfo`. + + +.. class:: timedelta + + A duration expressing the difference between two :class:`date`, :class:`time`, + or :class:`datetime` instances to microsecond resolution. + + +.. class:: tzinfo + + An abstract base class for time zone information objects. These are used by the + :class:`datetime` and :class:`time` classes to provide a customizable notion of + time adjustment (for example, to account for time zone and/or daylight saving + time). + +Objects of these types are immutable. + +Objects of the :class:`date` type are always naive. + +An object *d* of type :class:`time` or :class:`datetime` may be naive or aware. +*d* is aware if ``d.tzinfo`` is not ``None`` and ``d.tzinfo.utcoffset(d)`` does +not return ``None``. If ``d.tzinfo`` is ``None``, or if ``d.tzinfo`` is not +``None`` but ``d.tzinfo.utcoffset(d)`` returns ``None``, *d* is naive. + +The distinction between naive and aware doesn't apply to :class:`timedelta` +objects. + +Subclass relationships:: + + object + timedelta + tzinfo + time + date + datetime + + +.. _datetime-timedelta: + +:class:`timedelta` Objects +-------------------------- + +A :class:`timedelta` object represents a duration, the difference between two +dates or times. + + +.. class:: timedelta([days[, seconds[, microseconds[, milliseconds[, minutes[, hours[, weeks]]]]]]]) + + All arguments are optional and default to ``0``. 
Arguments may be ints, longs, + or floats, and may be positive or negative. + + Only *days*, *seconds* and *microseconds* are stored internally. Arguments are + converted to those units: + + * A millisecond is converted to 1000 microseconds. + * A minute is converted to 60 seconds. + * An hour is converted to 3600 seconds. + * A week is converted to 7 days. + + and days, seconds and microseconds are then normalized so that the + representation is unique, with + + * ``0 <= microseconds < 1000000`` + * ``0 <= seconds < 3600*24`` (the number of seconds in one day) + * ``-999999999 <= days <= 999999999`` + + If any argument is a float and there are fractional microseconds, the fractional + microseconds left over from all arguments are combined and their sum is rounded + to the nearest microsecond. If no argument is a float, the conversion and + normalization processes are exact (no information is lost). + + If the normalized value of days lies outside the indicated range, + :exc:`OverflowError` is raised. + + Note that normalization of negative values may be surprising at first. For + example, :: + + >>> d = timedelta(microseconds=-1) + >>> (d.days, d.seconds, d.microseconds) + (-1, 86399, 999999) + +Class attributes are: + + +.. attribute:: timedelta.min + + The most negative :class:`timedelta` object, ``timedelta(-999999999)``. + + +.. attribute:: timedelta.max + + The most positive :class:`timedelta` object, ``timedelta(days=999999999, + hours=23, minutes=59, seconds=59, microseconds=999999)``. + + +.. attribute:: timedelta.resolution + + The smallest possible difference between non-equal :class:`timedelta` objects, + ``timedelta(microseconds=1)``. + +Note that, because of normalization, ``timedelta.max`` > ``-timedelta.min``. +``-timedelta.max`` is not representable as a :class:`timedelta` object. 
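The normalization rules above can be checked with a short sketch. The argument values are arbitrary. The last step assumes that negating ``timedelta.max`` raises :exc:`OverflowError`, which is how CPython reports a value that is not representable.

```python
from datetime import timedelta

# 25 hours and 1500 milliseconds are folded into days, seconds
# and microseconds, the only three fields stored internally.
d = timedelta(hours=25, milliseconds=1500)
fields = (d.days, d.seconds, d.microseconds)
print(fields)

# A small negative duration is stored as one negative day plus a
# positive number of seconds, so that seconds stays in range.
neg = timedelta(seconds=-1)
neg_fields = (neg.days, neg.seconds, neg.microseconds)
print(neg_fields)

# The range is asymmetric: max has no negative counterpart.
try:
    -timedelta.max
    negation_overflowed = False
except OverflowError:
    negation_overflowed = True
print(negation_overflowed)
```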
+ +Instance attributes (read-only): + ++------------------+--------------------------------------------+ +| Attribute | Value | ++==================+============================================+ +| ``days`` | Between -999999999 and 999999999 inclusive | ++------------------+--------------------------------------------+ +| ``seconds`` | Between 0 and 86399 inclusive | ++------------------+--------------------------------------------+ +| ``microseconds`` | Between 0 and 999999 inclusive | ++------------------+--------------------------------------------+ + +Supported operations: + +.. % XXX this table is too wide! + ++--------------------------------+-----------------------------------------------+ +| Operation | Result | ++================================+===============================================+ +| ``t1 = t2 + t3`` | Sum of *t2* and *t3*. Afterwards *t1*-*t2* == | +| | *t3* and *t1*-*t3* == *t2* are true. (1) | ++--------------------------------+-----------------------------------------------+ +| ``t1 = t2 - t3`` | Difference of *t2* and *t3*. Afterwards *t1* | +| | == *t2* - *t3* and *t2* == *t1* + *t3* are | +| | true. (1) | ++--------------------------------+-----------------------------------------------+ +| ``t1 = t2 * i or t1 = i * t2`` | Delta multiplied by an integer or long. | +| | Afterwards *t1* // i == *t2* is true, | +| | provided ``i != 0``. | ++--------------------------------+-----------------------------------------------+ +| | In general, *t1* \* i == *t1* \* (i-1) + *t1* | +| | is true. (1) | ++--------------------------------+-----------------------------------------------+ +| ``t1 = t2 // i`` | The floor is computed and the remainder (if | +| | any) is thrown away. (3) | ++--------------------------------+-----------------------------------------------+ +| ``+t1`` | Returns a :class:`timedelta` object with the | +| | same value. 
(2) | ++--------------------------------+-----------------------------------------------+ +| ``-t1`` | equivalent to :class:`timedelta`\ | +| | (-*t1.days*, -*t1.seconds*, | +| | -*t1.microseconds*), and to *t1*\* -1. (1)(4) | ++--------------------------------+-----------------------------------------------+ +| ``abs(t)`` | equivalent to +*t* when ``t.days >= 0``, and | +| | to -*t* when ``t.days < 0``. (2) | ++--------------------------------+-----------------------------------------------+ + +Notes: + +(1) + This is exact, but may overflow. + +(2) + This is exact, and cannot overflow. + +(3) + Division by 0 raises :exc:`ZeroDivisionError`. + +(4) + -*timedelta.max* is not representable as a :class:`timedelta` object. + +In addition to the operations listed above :class:`timedelta` objects support +certain additions and subtractions with :class:`date` and :class:`datetime` +objects (see below). + +Comparisons of :class:`timedelta` objects are supported with the +:class:`timedelta` object representing the smaller duration considered to be the +smaller timedelta. In order to stop mixed-type comparisons from falling back to +the default comparison by object address, when a :class:`timedelta` object is +compared to an object of a different type, :exc:`TypeError` is raised unless the +comparison is ``==`` or ``!=``. The latter cases return :const:`False` or +:const:`True`, respectively. + +:class:`timedelta` objects are hashable (usable as dictionary keys), support +efficient pickling, and in Boolean contexts, a :class:`timedelta` object is +considered to be true if and only if it isn't equal to ``timedelta(0)``. + + +.. _datetime-date: + +:class:`date` Objects +--------------------- + +A :class:`date` object represents a date (year, month and day) in an idealized +calendar, the current Gregorian calendar indefinitely extended in both +directions. January 1 of year 1 is called day number 1, January 2 of year 1 is +called day number 2, and so on. 
This matches the definition of the "proleptic +Gregorian" calendar in Dershowitz and Reingold's book Calendrical Calculations, +where it's the base calendar for all computations. See the book for algorithms +for converting between proleptic Gregorian ordinals and many other calendar +systems. + + +.. class:: date(year, month, day) + + All arguments are required. Arguments may be ints or longs, in the following + ranges: + + * ``MINYEAR <= year <= MAXYEAR`` + * ``1 <= month <= 12`` + * ``1 <= day <= number of days in the given month and year`` + + If an argument outside those ranges is given, :exc:`ValueError` is raised. + +Other constructors, all class methods: + + +.. method:: date.today() + + Return the current local date. This is equivalent to + ``date.fromtimestamp(time.time())``. + + +.. method:: date.fromtimestamp(timestamp) + + Return the local date corresponding to the POSIX timestamp, such as is returned + by :func:`time.time`. This may raise :exc:`ValueError`, if the timestamp is out + of the range of values supported by the platform C :cfunc:`localtime` function. + It's common for this to be restricted to years from 1970 through 2038. Note + that on non-POSIX systems that include leap seconds in their notion of a + timestamp, leap seconds are ignored by :meth:`fromtimestamp`. + + +.. method:: date.fromordinal(ordinal) + + Return the date corresponding to the proleptic Gregorian ordinal, where January + 1 of year 1 has ordinal 1. :exc:`ValueError` is raised unless ``1 <= ordinal <= + date.max.toordinal()``. For any date *d*, ``date.fromordinal(d.toordinal()) == + d``. + +Class attributes: + + +.. attribute:: date.min + + The earliest representable date, ``date(MINYEAR, 1, 1)``. + + +.. attribute:: date.max + + The latest representable date, ``date(MAXYEAR, 12, 31)``. + + +.. attribute:: date.resolution + + The smallest possible difference between non-equal date objects, + ``timedelta(days=1)``. + +Instance attributes (read-only): + + +.. 
attribute:: date.year + + Between :const:`MINYEAR` and :const:`MAXYEAR` inclusive. + + +.. attribute:: date.month + + Between 1 and 12 inclusive. + + +.. attribute:: date.day + + Between 1 and the number of days in the given month of the given year. + +Supported operations: + ++-------------------------------+----------------------------------------------+ +| Operation | Result | ++===============================+==============================================+ +| ``date2 = date1 + timedelta`` | *date2* is ``timedelta.days`` days removed | +| | from *date1*. (1) | ++-------------------------------+----------------------------------------------+ +| ``date2 = date1 - timedelta`` | Computes *date2* such that ``date2 + | +| | timedelta == date1``. (2) | ++-------------------------------+----------------------------------------------+ +| ``timedelta = date1 - date2`` | \(3) | ++-------------------------------+----------------------------------------------+ +| ``date1 < date2`` | *date1* is considered less than *date2* when | +| | *date1* precedes *date2* in time. (4) | ++-------------------------------+----------------------------------------------+ + +Notes: + +(1) + *date2* is moved forward in time if ``timedelta.days > 0``, or backward if + ``timedelta.days < 0``. Afterward ``date2 - date1 == timedelta.days``. + ``timedelta.seconds`` and ``timedelta.microseconds`` are ignored. + :exc:`OverflowError` is raised if ``date2.year`` would be smaller than + :const:`MINYEAR` or larger than :const:`MAXYEAR`. + +(2) + This isn't quite equivalent to date1 + (-timedelta), because -timedelta in + isolation can overflow in cases where date1 - timedelta does not. + ``timedelta.seconds`` and ``timedelta.microseconds`` are ignored. + +(3) + This is exact, and cannot overflow. timedelta.seconds and + timedelta.microseconds are 0, and date2 + timedelta == date1 after. + +(4) + In other words, ``date1 < date2`` if and only if ``date1.toordinal() < + date2.toordinal()``. 
In order to stop comparison from falling back to the + default scheme of comparing object addresses, date comparison normally raises + :exc:`TypeError` if the other comparand isn't also a :class:`date` object. + However, ``NotImplemented`` is returned instead if the other comparand has a + :meth:`timetuple` attribute. This hook gives other kinds of date objects a + chance at implementing mixed-type comparison. If not, when a :class:`date` + object is compared to an object of a different type, :exc:`TypeError` is raised + unless the comparison is ``==`` or ``!=``. The latter cases return + :const:`False` or :const:`True`, respectively. + +Dates can be used as dictionary keys. In Boolean contexts, all :class:`date` +objects are considered to be true. + +Instance methods: + + +.. method:: date.replace(year, month, day) + + Return a date with the same value, except for those members given new values by + whichever keyword arguments are specified. For example, if ``d == date(2002, + 12, 31)``, then ``d.replace(day=26) == date(2002, 12, 26)``. + + +.. method:: date.timetuple() + + Return a :class:`time.struct_time` such as returned by :func:`time.localtime`. + The hours, minutes and seconds are 0, and the DST flag is -1. ``d.timetuple()`` + is equivalent to ``time.struct_time((d.year, d.month, d.day, 0, 0, 0, + d.weekday(), d.toordinal() - date(d.year, 1, 1).toordinal() + 1, -1))`` + + +.. method:: date.toordinal() + + Return the proleptic Gregorian ordinal of the date, where January 1 of year 1 + has ordinal 1. For any :class:`date` object *d*, + ``date.fromordinal(d.toordinal()) == d``. + + +.. method:: date.weekday() + + Return the day of the week as an integer, where Monday is 0 and Sunday is 6. + For example, ``date(2002, 12, 4).weekday() == 2``, a Wednesday. See also + :meth:`isoweekday`. + + +.. method:: date.isoweekday() + + Return the day of the week as an integer, where Monday is 1 and Sunday is 7. 
+ For example, ``date(2002, 12, 4).isoweekday() == 3``, a Wednesday. See also + :meth:`weekday`, :meth:`isocalendar`. + + +.. method:: date.isocalendar() + + Return a 3-tuple, (ISO year, ISO week number, ISO weekday). + + The ISO calendar is a widely used variant of the Gregorian calendar. See + http://www.phys.uu.nl/~vgent/calendar/isocalendar.htm for a good explanation. + + The ISO year consists of 52 or 53 full weeks, where a week starts on a + Monday and ends on a Sunday. The first week of an ISO year is the first + (Gregorian) calendar week of a year containing a Thursday. This is called week + number 1, and the ISO year of that Thursday is the same as its Gregorian year. + + For example, 2004 begins on a Thursday, so the first week of ISO year 2004 + begins on Monday, 29 Dec 2003 and ends on Sunday, 4 Jan 2004, so that + ``date(2003, 12, 29).isocalendar() == (2004, 1, 1)`` and ``date(2004, 1, + 4).isocalendar() == (2004, 1, 7)``. + + +.. method:: date.isoformat() + + Return a string representing the date in ISO 8601 format, 'YYYY-MM-DD'. For + example, ``date(2002, 12, 4).isoformat() == '2002-12-04'``. + + +.. method:: date.__str__() + + For a date *d*, ``str(d)`` is equivalent to ``d.isoformat()``. + + +.. method:: date.ctime() + + Return a string representing the date, for example ``date(2002, 12, + 4).ctime() == 'Wed Dec  4 00:00:00 2002'``. ``d.ctime()`` is equivalent to + ``time.ctime(time.mktime(d.timetuple()))`` on platforms where the native C + :cfunc:`ctime` function (which :func:`time.ctime` invokes, but which + :meth:`date.ctime` does not invoke) conforms to the C standard. + + +.. method:: date.strftime(format) + + Return a string representing the date, controlled by an explicit format string. + Format codes referring to hours, minutes or seconds will see 0 values. See + section :ref:`strftime-behavior`. + + +.. 
_datetime-datetime: + +:class:`datetime` Objects +------------------------- + +A :class:`datetime` object is a single object containing all the information +from a :class:`date` object and a :class:`time` object. Like a :class:`date` +object, :class:`datetime` assumes the current Gregorian calendar extended in +both directions; like a time object, :class:`datetime` assumes there are exactly +3600\*24 seconds in every day. + +Constructor: + + +.. class:: datetime(year, month, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]]) + + The year, month and day arguments are required. *tzinfo* may be ``None``, or an + instance of a :class:`tzinfo` subclass. The remaining arguments may be ints or + longs, in the following ranges: + + * ``MINYEAR <= year <= MAXYEAR`` + * ``1 <= month <= 12`` + * ``1 <= day <= number of days in the given month and year`` + * ``0 <= hour < 24`` + * ``0 <= minute < 60`` + * ``0 <= second < 60`` + * ``0 <= microsecond < 1000000`` + + If an argument outside those ranges is given, :exc:`ValueError` is raised. + +Other constructors, all class methods: + + +.. method:: datetime.today() + + Return the current local datetime, with :attr:`tzinfo` ``None``. This is + equivalent to ``datetime.fromtimestamp(time.time())``. See also :meth:`now`, + :meth:`fromtimestamp`. + + +.. method:: datetime.now([tz]) + + Return the current local date and time. If optional argument *tz* is ``None`` + or not specified, this is like :meth:`today`, but, if possible, supplies more + precision than can be gotten from going through a :func:`time.time` timestamp + (for example, this may be possible on platforms supplying the C + :cfunc:`gettimeofday` function). + + Else *tz* must be an instance of a class :class:`tzinfo` subclass, and the + current date and time are converted to *tz*'s time zone. In this case the + result is equivalent to ``tz.fromutc(datetime.utcnow().replace(tzinfo=tz))``. + See also :meth:`today`, :meth:`utcnow`. + + +.. 
method:: datetime.utcnow() + + Return the current UTC date and time, with :attr:`tzinfo` ``None``. This is like + :meth:`now`, but returns the current UTC date and time, as a naive + :class:`datetime` object. See also :meth:`now`. + + +.. method:: datetime.fromtimestamp(timestamp[, tz]) + + Return the local date and time corresponding to the POSIX timestamp, such as is + returned by :func:`time.time`. If optional argument *tz* is ``None`` or not + specified, the timestamp is converted to the platform's local date and time, and + the returned :class:`datetime` object is naive. + + Else *tz* must be an instance of a class :class:`tzinfo` subclass, and the + timestamp is converted to *tz*'s time zone. In this case the result is + equivalent to + ``tz.fromutc(datetime.utcfromtimestamp(timestamp).replace(tzinfo=tz))``. + + :meth:`fromtimestamp` may raise :exc:`ValueError`, if the timestamp is out of + the range of values supported by the platform C :cfunc:`localtime` or + :cfunc:`gmtime` functions. It's common for this to be restricted to years in + 1970 through 2038. Note that on non-POSIX systems that include leap seconds in + their notion of a timestamp, leap seconds are ignored by :meth:`fromtimestamp`, + and then it's possible to have two timestamps differing by a second that yield + identical :class:`datetime` objects. See also :meth:`utcfromtimestamp`. + + +.. method:: datetime.utcfromtimestamp(timestamp) + + Return the UTC :class:`datetime` corresponding to the POSIX timestamp, with + :attr:`tzinfo` ``None``. This may raise :exc:`ValueError`, if the timestamp is + out of the range of values supported by the platform C :cfunc:`gmtime` function. + It's common for this to be restricted to years in 1970 through 2038. See also + :meth:`fromtimestamp`. + + +.. method:: datetime.fromordinal(ordinal) + + Return the :class:`datetime` corresponding to the proleptic Gregorian ordinal, + where January 1 of year 1 has ordinal 1. 
:exc:`ValueError` is raised unless ``1 + <= ordinal <= datetime.max.toordinal()``. The hour, minute, second and + microsecond of the result are all 0, and :attr:`tzinfo` is ``None``. + + +.. method:: datetime.combine(date, time) + + Return a new :class:`datetime` object whose date members are equal to the given + :class:`date` object's, and whose time and :attr:`tzinfo` members are equal to + the given :class:`time` object's. For any :class:`datetime` object *d*, ``d == + datetime.combine(d.date(), d.timetz())``. If date is a :class:`datetime` + object, its time and :attr:`tzinfo` members are ignored. + + +.. method:: datetime.strptime(date_string, format) + + Return a :class:`datetime` corresponding to *date_string*, parsed according to + *format*. This is equivalent to ``datetime(*(time.strptime(date_string, + format)[0:6]))``. :exc:`ValueError` is raised if the date_string and format + can't be parsed by :func:`time.strptime` or if it returns a value which isn't a + time tuple. + + .. versionadded:: 2.5 + +Class attributes: + + +.. attribute:: datetime.min + + The earliest representable :class:`datetime`, ``datetime(MINYEAR, 1, 1, + tzinfo=None)``. + + +.. attribute:: datetime.max + + The latest representable :class:`datetime`, ``datetime(MAXYEAR, 12, 31, 23, 59, + 59, 999999, tzinfo=None)``. + + +.. attribute:: datetime.resolution + + The smallest possible difference between non-equal :class:`datetime` objects, + ``timedelta(microseconds=1)``. + +Instance attributes (read-only): + + +.. attribute:: datetime.year + + Between :const:`MINYEAR` and :const:`MAXYEAR` inclusive. + + +.. attribute:: datetime.month + + Between 1 and 12 inclusive. + + +.. attribute:: datetime.day + + Between 1 and the number of days in the given month of the given year. + + +.. attribute:: datetime.hour + + In ``range(24)``. + + +.. attribute:: datetime.minute + + In ``range(60)``. + + +.. attribute:: datetime.second + + In ``range(60)``. + + +.. 
attribute:: datetime.microsecond + + In ``range(1000000)``. + + +.. attribute:: datetime.tzinfo + + The object passed as the *tzinfo* argument to the :class:`datetime` constructor, + or ``None`` if none was passed. + +Supported operations: + ++---------------------------------------+-------------------------------+ +| Operation | Result | ++=======================================+===============================+ +| ``datetime2 = datetime1 + timedelta`` | \(1) | ++---------------------------------------+-------------------------------+ +| ``datetime2 = datetime1 - timedelta`` | \(2) | ++---------------------------------------+-------------------------------+ +| ``timedelta = datetime1 - datetime2`` | \(3) | ++---------------------------------------+-------------------------------+ +| ``datetime1 < datetime2`` | Compares :class:`datetime` to | +| | :class:`datetime`. (4) | ++---------------------------------------+-------------------------------+ + +(1) + datetime2 is a duration of timedelta removed from datetime1, moving forward in + time if ``timedelta.days`` > 0, or backward if ``timedelta.days`` < 0. The + result has the same :attr:`tzinfo` member as the input datetime, and datetime2 - + datetime1 == timedelta after. :exc:`OverflowError` is raised if datetime2.year + would be smaller than :const:`MINYEAR` or larger than :const:`MAXYEAR`. Note + that no time zone adjustments are done even if the input is an aware object. + +(2) + Computes the datetime2 such that datetime2 + timedelta == datetime1. As for + addition, the result has the same :attr:`tzinfo` member as the input datetime, + and no time zone adjustments are done even if the input is aware. This isn't + quite equivalent to datetime1 + (-timedelta), because -timedelta in isolation + can overflow in cases where datetime1 - timedelta does not. + +(3) + Subtraction of a :class:`datetime` from a :class:`datetime` is defined only if + both operands are naive, or if both are aware. 
If one is aware and the other is + naive, :exc:`TypeError` is raised. + + If both are naive, or both are aware and have the same :attr:`tzinfo` member, + the :attr:`tzinfo` members are ignored, and the result is a :class:`timedelta` + object *t* such that ``datetime2 + t == datetime1``. No time zone adjustments + are done in this case. + + If both are aware and have different :attr:`tzinfo` members, ``a-b`` acts as if + *a* and *b* were first converted to naive UTC datetimes. The result is + ``(a.replace(tzinfo=None) - a.utcoffset()) - (b.replace(tzinfo=None) - + b.utcoffset())`` except that the implementation never overflows. + +(4) + *datetime1* is considered less than *datetime2* when *datetime1* precedes + *datetime2* in time. + + If one comparand is naive and the other is aware, :exc:`TypeError` is raised. + If both comparands are aware, and have the same :attr:`tzinfo` member, the + common :attr:`tzinfo` member is ignored and the base datetimes are compared. If + both comparands are aware and have different :attr:`tzinfo` members, the + comparands are first adjusted by subtracting their UTC offsets (obtained from + ``self.utcoffset()``). + + .. note:: + + In order to stop comparison from falling back to the default scheme of comparing + object addresses, datetime comparison normally raises :exc:`TypeError` if the + other comparand isn't also a :class:`datetime` object. However, + ``NotImplemented`` is returned instead if the other comparand has a + :meth:`timetuple` attribute. This hook gives other kinds of date objects a + chance at implementing mixed-type comparison. Otherwise, when a :class:`datetime` + object is compared to an object of a different type, :exc:`TypeError` is raised + unless the comparison is ``==`` or ``!=``. The latter cases return + :const:`False` or :const:`True`, respectively. + +:class:`datetime` objects can be used as dictionary keys. In Boolean contexts, +all :class:`datetime` objects are considered to be true. 
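The subtraction and comparison rules in notes (3) and (4) can be sketched as follows. ``FixedOffset`` is a hypothetical :class:`tzinfo` subclass written for this example, since the module itself supplies no concrete one here. The zone names and offsets are illustrative assumptions.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone, defined only for this sketch."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return self._name


utc = FixedOffset(0, "UTC")
est = FixedOffset(-300, "EST")

a = datetime(2002, 12, 25, 12, 0, tzinfo=utc)
b = datetime(2002, 12, 25, 7, 0, tzinfo=est)
naive = datetime(2002, 12, 25, 12, 0)

# Both aware with different tzinfo members: operands are
# adjusted by their UTC offsets before being subtracted.
difference = a - b
print(difference)

# Mixing naive and aware operands is an error.
try:
    naive - a
    mixed_subtraction_failed = False
except TypeError:
    mixed_subtraction_failed = True

try:
    naive < a
    mixed_ordering_failed = False
except TypeError:
    mixed_ordering_failed = True

print(mixed_subtraction_failed, mixed_ordering_failed)
```

Here *a* and *b* carry different wall-clock times but name the same instant, so their difference is zero.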
+ +Instance methods: + + +.. method:: datetime.date() + + Return :class:`date` object with same year, month and day. + + +.. method:: datetime.time() + + Return :class:`time` object with same hour, minute, second and microsecond. + :attr:`tzinfo` is ``None``. See also method :meth:`timetz`. + + +.. method:: datetime.timetz() + + Return :class:`time` object with same hour, minute, second, microsecond, and + tzinfo members. See also method :meth:`time`. + + +.. method:: datetime.replace([year[, month[, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]]]]]) + + Return a datetime with the same members, except for those members given new + values by whichever keyword arguments are specified. Note that ``tzinfo=None`` + can be specified to create a naive datetime from an aware datetime with no + conversion of date and time members. + + +.. method:: datetime.astimezone(tz) + + Return a :class:`datetime` object with new :attr:`tzinfo` member *tz*, adjusting + the date and time members so the result is the same UTC time as *self*, but in + *tz*'s local time. + + *tz* must be an instance of a :class:`tzinfo` subclass, and its + :meth:`utcoffset` and :meth:`dst` methods must not return ``None``. *self* must + be aware (``self.tzinfo`` must not be ``None``, and ``self.utcoffset()`` must + not return ``None``). + + If ``self.tzinfo`` is *tz*, ``self.astimezone(tz)`` is equal to *self*: no + adjustment of date or time members is performed. Else the result is local time + in time zone *tz*, representing the same UTC time as *self*: after ``astz = + dt.astimezone(tz)``, ``astz - astz.utcoffset()`` will usually have the same date + and time members as ``dt - dt.utcoffset()``. The discussion of class + :class:`tzinfo` explains the cases at Daylight Saving Time transition boundaries + where this cannot be achieved (an issue only if *tz* models both standard and + daylight time). 
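A small sketch of the conversion just described, contrasted with simply swapping the :attr:`tzinfo` member. ``FixedOffset`` is again a hypothetical fixed-offset :class:`tzinfo` subclass invented for the example. It has no daylight time, so the transition-boundary cases do not arise.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone, defined only for this sketch."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # astimezone() requires dst() not to return None.
        return timedelta(0)

    def tzname(self, dt):
        return self._name


utc = FixedOffset(0, "UTC")
est = FixedOffset(-300, "EST")

dt = datetime(2002, 12, 25, 12, 0, tzinfo=utc)

# Conversion: date and time members change, the UTC instant does not.
astz = dt.astimezone(est)
print(astz.hour, astz.tzname())

same_instant = (astz.replace(tzinfo=None) - astz.utcoffset()
                == dt.replace(tzinfo=None) - dt.utcoffset())
print(same_instant)

# Relabelling: members stay put, so the UTC instant shifts.
relabelled = dt.replace(tzinfo=est)
print(relabelled.hour, relabelled - dt)
```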
+ + If you merely want to attach a time zone object *tz* to a datetime *dt* without + adjustment of date and time members, use ``dt.replace(tzinfo=tz)``. If you + merely want to remove the time zone object from an aware datetime *dt* without + conversion of date and time members, use ``dt.replace(tzinfo=None)``. + + Note that the default :meth:`tzinfo.fromutc` method can be overridden in a + :class:`tzinfo` subclass to affect the result returned by :meth:`astimezone`. + Ignoring error cases, :meth:`astimezone` acts like:: + + def astimezone(self, tz): + if self.tzinfo is tz: + return self + # Convert self to UTC, and attach the new time zone object. + utc = (self - self.utcoffset()).replace(tzinfo=tz) + # Convert from UTC to tz's local time. + return tz.fromutc(utc) + + +.. method:: datetime.utcoffset() + + If :attr:`tzinfo` is ``None``, returns ``None``, else returns + ``self.tzinfo.utcoffset(self)``, and raises an exception if the latter doesn't + return ``None``, or a :class:`timedelta` object representing a whole number of + minutes with magnitude less than one day. + + +.. method:: datetime.dst() + + If :attr:`tzinfo` is ``None``, returns ``None``, else returns + ``self.tzinfo.dst(self)``, and raises an exception if the latter doesn't return + ``None``, or a :class:`timedelta` object representing a whole number of minutes + with magnitude less than one day. + + +.. method:: datetime.tzname() + + If :attr:`tzinfo` is ``None``, returns ``None``, else returns + ``self.tzinfo.tzname(self)``, and raises an exception if the latter doesn't return + ``None`` or a string object. + + +.. method:: datetime.timetuple() + + Return a :class:`time.struct_time` such as returned by :func:`time.localtime`. 
+ ``d.timetuple()`` is equivalent to ``time.struct_time((d.year, d.month, d.day, + d.hour, d.minute, d.second, d.weekday(), d.toordinal() - date(d.year, 1, + 1).toordinal() + 1, dst))``. The :attr:`tm_isdst` flag of the result is set + according to the :meth:`dst` method: if :attr:`tzinfo` is ``None`` or :meth:`dst` + returns ``None``, :attr:`tm_isdst` is set to ``-1``; else if :meth:`dst` + returns a non-zero value, :attr:`tm_isdst` is set to ``1``; else ``tm_isdst`` is + set to ``0``. + + +.. method:: datetime.utctimetuple() + + If :class:`datetime` instance *d* is naive, this is the same as + ``d.timetuple()`` except that :attr:`tm_isdst` is forced to 0 regardless of what + ``d.dst()`` returns. DST is never in effect for a UTC time. + + If *d* is aware, *d* is normalized to UTC time, by subtracting + ``d.utcoffset()``, and a :class:`time.struct_time` for the normalized time is + returned. :attr:`tm_isdst` is forced to 0. Note that the result's + :attr:`tm_year` member may be :const:`MINYEAR`\ -1 or :const:`MAXYEAR`\ +1, if + ``d.year`` was ``MINYEAR`` or ``MAXYEAR`` and UTC adjustment spills over a year + boundary. + + +.. method:: datetime.toordinal() + + Return the proleptic Gregorian ordinal of the date. The same as + ``self.date().toordinal()``. + + +.. method:: datetime.weekday() + + Return the day of the week as an integer, where Monday is 0 and Sunday is 6. + The same as ``self.date().weekday()``. See also :meth:`isoweekday`. + + +.. method:: datetime.isoweekday() + + Return the day of the week as an integer, where Monday is 1 and Sunday is 7. + The same as ``self.date().isoweekday()``. See also :meth:`weekday`, + :meth:`isocalendar`. + + +.. method:: datetime.isocalendar() + + Return a 3-tuple, (ISO year, ISO week number, ISO weekday). The same as + ``self.date().isocalendar()``. + + +.. 
method:: datetime.isoformat([sep]) + + Return a string representing the date and time in ISO 8601 format, + YYYY-MM-DDTHH:MM:SS.mmmmmm or, if :attr:`microsecond` is 0, + YYYY-MM-DDTHH:MM:SS + + If :meth:`utcoffset` does not return ``None``, a 6-character string is + appended, giving the UTC offset in (signed) hours and minutes: + YYYY-MM-DDTHH:MM:SS.mmmmmm+HH:MM or, if :attr:`microsecond` is 0 + YYYY-MM-DDTHH:MM:SS+HH:MM + + The optional argument *sep* (default ``'T'``) is a one-character separator, + placed between the date and time portions of the result. For example, :: + + >>> from datetime import tzinfo, timedelta, datetime + >>> class TZ(tzinfo): + ... def utcoffset(self, dt): return timedelta(minutes=-399) + ... + >>> datetime(2002, 12, 25, tzinfo=TZ()).isoformat(' ') + '2002-12-25 00:00:00-06:39' + + +.. method:: datetime.__str__() + + For a :class:`datetime` instance *d*, ``str(d)`` is equivalent to + ``d.isoformat(' ')``. + + +.. method:: datetime.ctime() + + Return a string representing the date and time, for example ``datetime(2002, 12, + 4, 20, 30, 40).ctime() == 'Wed Dec 4 20:30:40 2002'``. ``d.ctime()`` is + equivalent to ``time.ctime(time.mktime(d.timetuple()))`` on platforms where the + native C :cfunc:`ctime` function (which :func:`time.ctime` invokes, but which + :meth:`datetime.ctime` does not invoke) conforms to the C standard. + + +.. method:: datetime.strftime(format) + + Return a string representing the date and time, controlled by an explicit format + string. See section :ref:`strftime-behavior`. + + +.. _datetime-time: + +:class:`time` Objects +--------------------- + +A time object represents a (local) time of day, independent of any particular +day, and subject to adjustment via a :class:`tzinfo` object. + + +.. class:: time(hour[, minute[, second[, microsecond[, tzinfo]]]]) + + All arguments are optional. *tzinfo* may be ``None``, or an instance of a + :class:`tzinfo` subclass. 
The remaining arguments may be ints or longs, in the + following ranges: + + * ``0 <= hour < 24`` + * ``0 <= minute < 60`` + * ``0 <= second < 60`` + * ``0 <= microsecond < 1000000``. + + If an argument outside those ranges is given, :exc:`ValueError` is raised. All + default to ``0`` except *tzinfo*, which defaults to :const:`None`. + +Class attributes: + + +.. attribute:: time.min + + The earliest representable :class:`time`, ``time(0, 0, 0, 0)``. + + +.. attribute:: time.max + + The latest representable :class:`time`, ``time(23, 59, 59, 999999)``. + + +.. attribute:: time.resolution + + The smallest possible difference between non-equal :class:`time` objects, + ``timedelta(microseconds=1)``, although note that arithmetic on :class:`time` + objects is not supported. + +Instance attributes (read-only): + + +.. attribute:: time.hour + + In ``range(24)``. + + +.. attribute:: time.minute + + In ``range(60)``. + + +.. attribute:: time.second + + In ``range(60)``. + + +.. attribute:: time.microsecond + + In ``range(1000000)``. + + +.. attribute:: time.tzinfo + + The object passed as the tzinfo argument to the :class:`time` constructor, or + ``None`` if none was passed. + +Supported operations: + +* comparison of :class:`time` to :class:`time`, where *a* is considered less + than *b* when *a* precedes *b* in time. If one comparand is naive and the other + is aware, :exc:`TypeError` is raised. If both comparands are aware, and have + the same :attr:`tzinfo` member, the common :attr:`tzinfo` member is ignored and + the base times are compared. If both comparands are aware and have different + :attr:`tzinfo` members, the comparands are first adjusted by subtracting their + UTC offsets (obtained from ``self.utcoffset()``). In order to stop mixed-type + comparisons from falling back to the default comparison by object address, when + a :class:`time` object is compared to an object of a different type, + :exc:`TypeError` is raised unless the comparison is ``==`` or ``!=``. 
The + latter cases return :const:`False` or :const:`True`, respectively. + +* hash, use as dict key + +* efficient pickling + +* in Boolean contexts, a :class:`time` object is considered to be true if and + only if, after converting it to minutes and subtracting :meth:`utcoffset` (or + ``0`` if that's ``None``), the result is non-zero. + +Instance methods: + + +.. method:: time.replace([hour[, minute[, second[, microsecond[, tzinfo]]]]]) + + Return a :class:`time` with the same value, except for those members given new + values by whichever keyword arguments are specified. Note that ``tzinfo=None`` + can be specified to create a naive :class:`time` from an aware :class:`time`, + without conversion of the time members. + + +.. method:: time.isoformat() + + Return a string representing the time in ISO 8601 format, HH:MM:SS.mmmmmm or, if + ``self.microsecond`` is 0, HH:MM:SS. If :meth:`utcoffset` does not return ``None``, a + 6-character string is appended, giving the UTC offset in (signed) hours and + minutes: HH:MM:SS.mmmmmm+HH:MM or, if ``self.microsecond`` is 0, HH:MM:SS+HH:MM. + + +.. method:: time.__str__() + + For a time *t*, ``str(t)`` is equivalent to ``t.isoformat()``. + + +.. method:: time.strftime(format) + + Return a string representing the time, controlled by an explicit format string. + See section :ref:`strftime-behavior`. + + +.. method:: time.utcoffset() + + If :attr:`tzinfo` is ``None``, returns ``None``, else returns + ``self.tzinfo.utcoffset(None)``, and raises an exception if the latter doesn't + return ``None`` or a :class:`timedelta` object representing a whole number of + minutes with magnitude less than one day. + + +.. method:: time.dst() + + If :attr:`tzinfo` is ``None``, returns ``None``, else returns + ``self.tzinfo.dst(None)``, and raises an exception if the latter doesn't return + ``None`` or a :class:`timedelta` object representing a whole number of minutes + with magnitude less than one day. + + +.. 
method:: time.tzname() + + If :attr:`tzinfo` is ``None``, returns ``None``, else returns + ``self.tzinfo.tzname(None)``, or raises an exception if the latter doesn't + return ``None`` or a string object. + + +.. _datetime-tzinfo: + +:class:`tzinfo` Objects +----------------------- + +:class:`tzinfo` is an abstract base class, meaning that this class should not be +instantiated directly. You need to derive a concrete subclass, and (at least) +supply implementations of the standard :class:`tzinfo` methods needed by the +:class:`datetime` methods you use. The :mod:`datetime` module does not supply +any concrete subclasses of :class:`tzinfo`. + +An instance of (a concrete subclass of) :class:`tzinfo` can be passed to the +constructors for :class:`datetime` and :class:`time` objects. The latter objects +view their members as being in local time, and the :class:`tzinfo` object +supports methods revealing offset of local time from UTC, the name of the time +zone, and DST offset, all relative to a date or time object passed to them. + +Special requirement for pickling: A :class:`tzinfo` subclass must have an +:meth:`__init__` method that can be called with no arguments, else it can be +pickled but possibly not unpickled again. This is a technical requirement that +may be relaxed in the future. + +A concrete subclass of :class:`tzinfo` may need to implement the following +methods. Exactly which methods are needed depends on the uses made of aware +:mod:`datetime` objects. If in doubt, simply implement all of them. + + +.. method:: tzinfo.utcoffset(self, dt) + + Return offset of local time from UTC, in minutes east of UTC. If local time is + west of UTC, this should be negative. Note that this is intended to be the + total offset from UTC; for example, if a :class:`tzinfo` object represents both + time zone and DST adjustments, :meth:`utcoffset` should return their sum. If + the UTC offset isn't known, return ``None``. 
Else the value returned must be a + :class:`timedelta` object specifying a whole number of minutes in the range + -1439 to 1439 inclusive (1440 = 24\*60; the magnitude of the offset must be less + than one day). Most implementations of :meth:`utcoffset` will probably look + like one of these two:: + + return CONSTANT # fixed-offset class + return CONSTANT + self.dst(dt) # daylight-aware class + + If :meth:`utcoffset` does not return ``None``, :meth:`dst` should not return + ``None`` either. + + The default implementation of :meth:`utcoffset` raises + :exc:`NotImplementedError`. + + +.. method:: tzinfo.dst(self, dt) + + Return the daylight saving time (DST) adjustment, in minutes east of UTC, or + ``None`` if DST information isn't known. Return ``timedelta(0)`` if DST is not + in effect. If DST is in effect, return the offset as a :class:`timedelta` object + (see :meth:`utcoffset` for details). Note that DST offset, if applicable, has + already been added to the UTC offset returned by :meth:`utcoffset`, so there's + no need to consult :meth:`dst` unless you're interested in obtaining DST info + separately. For example, :meth:`datetime.timetuple` calls its :attr:`tzinfo` + member's :meth:`dst` method to determine how the :attr:`tm_isdst` flag should be + set, and :meth:`tzinfo.fromutc` calls :meth:`dst` to account for DST changes + when crossing time zones. + + An instance *tz* of a :class:`tzinfo` subclass that models both standard and + daylight times must be consistent in this sense: + + ``tz.utcoffset(dt) - tz.dst(dt)`` + + must return the same result for every :class:`datetime` *dt* with ``dt.tzinfo == + tz``. For sane :class:`tzinfo` subclasses, this expression yields the time + zone's "standard offset", which should not depend on the date or the time, but + only on geographic location. The implementation of :meth:`datetime.astimezone` + relies on this, but cannot detect violations; it's the programmer's + responsibility to ensure it. 
If a :class:`tzinfo` subclass cannot guarantee + this, it may be able to override the default implementation of + :meth:`tzinfo.fromutc` to work correctly with :meth:`astimezone` regardless. + + Most implementations of :meth:`dst` will probably look like one of these two:: + + def dst(self, dt): + # a fixed-offset class: doesn't account for DST + return timedelta(0) + + or :: + + def dst(self, dt): + # Code to set dston and dstoff to the time zone's DST + # transition times based on the input dt.year, and expressed + # in standard local time. Then + + if dston <= dt.replace(tzinfo=None) < dstoff: + return timedelta(hours=1) + else: + return timedelta(0) + + The default implementation of :meth:`dst` raises :exc:`NotImplementedError`. + + +.. method:: tzinfo.tzname(self, dt) + + Return the time zone name corresponding to the :class:`datetime` object *dt*, as + a string. Nothing about string names is defined by the :mod:`datetime` module, + and there's no requirement that it mean anything in particular. For example, + "GMT", "UTC", "-500", "-5:00", "EDT", "US/Eastern", "America/New York" are all + valid replies. Return ``None`` if a string name isn't known. Note that this is + a method rather than a fixed string primarily because some :class:`tzinfo` + subclasses will wish to return different names depending on the specific value + of *dt* passed, especially if the :class:`tzinfo` class is accounting for + daylight time. + + The default implementation of :meth:`tzname` raises :exc:`NotImplementedError`. + +These methods are called by a :class:`datetime` or :class:`time` object, in +response to their methods of the same names. A :class:`datetime` object passes +itself as the argument, and a :class:`time` object passes ``None`` as the +argument. A :class:`tzinfo` subclass's methods should therefore be prepared to +accept a *dt* argument of ``None``, or of class :class:`datetime`. + +When ``None`` is passed, it's up to the class designer to decide the best +response. 
For example, returning ``None`` is appropriate if the class wishes to +say that time objects don't participate in the :class:`tzinfo` protocols. It +may be more useful for ``utcoffset(None)`` to return the standard UTC offset, as +there is no other convention for discovering the standard offset. + +When a :class:`datetime` object is passed in response to a :class:`datetime` +method, ``dt.tzinfo`` is the same object as *self*. :class:`tzinfo` methods can +rely on this, unless user code calls :class:`tzinfo` methods directly. The +intent is that the :class:`tzinfo` methods interpret *dt* as being in local +time, and need not worry about objects in other timezones. + +There is one more :class:`tzinfo` method that a subclass may wish to override: + + +.. method:: tzinfo.fromutc(self, dt) + + This is called from the default :meth:`datetime.astimezone` implementation. + When called from that, ``dt.tzinfo`` is *self*, and *dt*'s date and time members + are to be viewed as expressing a UTC time. The purpose of :meth:`fromutc` is to + adjust the date and time members, returning an equivalent datetime in *self*'s + local time. + + Most :class:`tzinfo` subclasses should be able to inherit the default + :meth:`fromutc` implementation without problems. It's strong enough to handle + fixed-offset time zones, and time zones accounting for both standard and + daylight time, and the latter even if the DST transition times differ in + different years. An example of a time zone the default :meth:`fromutc` + implementation may not handle correctly in all cases is one where the standard + offset (from UTC) depends on the specific date and time passed, which can happen + for political reasons. The default implementations of :meth:`astimezone` and + :meth:`fromutc` may not produce the result you want if the result is one of the + hours straddling the moment the standard offset changes. 
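To make the inherited :meth:`fromutc` behaviour concrete, here is a small sketch of a daylight-aware zone that relies entirely on the default implementation. ``ToyEastern`` and its transition dates (April 1 and October 31, expressed in standard local time) are invented for this example and do not model any real time zone.

```python
from datetime import datetime, timedelta, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: standard offset -5 hours, plus one hour of DST
    between two made-up switch times given in standard local time."""

    def utcoffset(self, dt):
        # Total offset is the standard offset plus the DST adjustment.
        return timedelta(hours=-5) + self.dst(dt)

    def dst(self, dt):
        if dt is None:
            # time objects pass None; report "no DST" in that case.
            return timedelta(0)
        dston = datetime(dt.year, 4, 1, 2)     # assumed start of DST
        dstoff = datetime(dt.year, 10, 31, 1)  # assumed end of DST
        if dston <= dt.replace(tzinfo=None) < dstoff:
            return timedelta(hours=1)
        return timedelta(0)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


tz = ToyEastern()

# The argument carries tz as its tzinfo, but its members are read as UTC.
winter = tz.fromutc(datetime(2002, 1, 15, 12, 0, tzinfo=tz))
summer = tz.fromutc(datetime(2002, 7, 15, 12, 0, tzinfo=tz))
```

With these assumed rules, 12:00 UTC becomes 07:00 local time in January and 08:00 in July. Note that ``utcoffset(dt) - dst(dt)`` is the same for every *dt*, which is the consistency the default implementation depends on.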
+ + Skipping code for error cases, the default :meth:`fromutc` implementation acts + like:: + + def fromutc(self, dt): + # raise ValueError if dt.tzinfo is not self + dtoff = dt.utcoffset() + dtdst = dt.dst() + # raise ValueError if dtoff is None or dtdst is None + delta = dtoff - dtdst # this is self's standard offset + if delta: + dt += delta # convert to standard local time + dtdst = dt.dst() + # raise ValueError if dtdst is None + if dtdst: + return dt + dtdst + else: + return dt + +Example :class:`tzinfo` classes: + +.. literalinclude:: ../includes/tzinfo-examples.py + + +Note that there are unavoidable subtleties twice per year in a :class:`tzinfo` +subclass accounting for both standard and daylight time, at the DST transition +points. For concreteness, consider US Eastern (UTC -0500), where EDT begins the +minute after 1:59 (EST) on the first Sunday in April, and ends the minute after +1:59 (EDT) on the last Sunday in October:: + + UTC 3:MM 4:MM 5:MM 6:MM 7:MM 8:MM + EST 22:MM 23:MM 0:MM 1:MM 2:MM 3:MM + EDT 23:MM 0:MM 1:MM 2:MM 3:MM 4:MM + + start 22:MM 23:MM 0:MM 1:MM 3:MM 4:MM + + end 23:MM 0:MM 1:MM 1:MM 2:MM 3:MM + +When DST starts (the "start" line), the local wall clock leaps from 1:59 to +3:00. A wall time of the form 2:MM doesn't really make sense on that day, so +``astimezone(Eastern)`` won't deliver a result with ``hour == 2`` on the day DST +begins. In order for :meth:`astimezone` to make this guarantee, the +:meth:`tzinfo.dst` method must consider times in the "missing hour" (2:MM for +Eastern) to be in daylight time. + +When DST ends (the "end" line), there's a potentially worse problem: there's an +hour that can't be spelled unambiguously in local wall time: the last hour of +daylight time. In Eastern, that's times of the form 5:MM UTC on the day +daylight time ends. The local wall clock leaps from 1:59 (daylight time) back +to 1:00 (standard time) again. Local times of the form 1:MM are ambiguous. 
+:meth:`astimezone` mimics the local clock's behavior by mapping two adjacent UTC +hours into the same local hour then. In the Eastern example, UTC times of the +form 5:MM and 6:MM both map to 1:MM when converted to Eastern. In order for +:meth:`astimezone` to make this guarantee, the :meth:`tzinfo.dst` method must +consider times in the "repeated hour" to be in standard time. This is easily +arranged, as in the example, by expressing DST switch times in the time zone's +standard local time. + +Applications that can't bear such ambiguities should avoid using hybrid +:class:`tzinfo` subclasses; there are no ambiguities when using UTC, or any +other fixed-offset :class:`tzinfo` subclass (such as a class representing only +EST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)). + + +.. _strftime-behavior: + +:meth:`strftime` Behavior +------------------------- + +:class:`date`, :class:`datetime`, and :class:`time` objects all support a +``strftime(format)`` method, to create a string representing the time under the +control of an explicit format string. Broadly speaking, ``d.strftime(fmt)`` +acts like the :mod:`time` module's ``time.strftime(fmt, d.timetuple())`` +although not all objects support a :meth:`timetuple` method. + +For :class:`time` objects, the format codes for year, month, and day should not +be used, as time objects have no such values. If they're used anyway, ``1900`` +is substituted for the year, and ``0`` for the month and day. + +For :class:`date` objects, the format codes for hours, minutes, and seconds +should not be used, as :class:`date` objects have no such values. If they're +used anyway, ``0`` is substituted for them. + +For a naive object, the ``%z`` and ``%Z`` format codes are replaced by empty +strings. 
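A short sketch of how ``%z`` and ``%Z`` behave for naive and aware objects. ``HalfHourWest`` is a hypothetical fixed-offset class written for this example; its offset matches the ``timedelta(hours=-3, minutes=-30)`` case used in this section, and the name ``"HHW"`` is made up.

```python
from datetime import datetime, timedelta, tzinfo


class HalfHourWest(tzinfo):
    """Hypothetical zone fixed at UTC-03:30, for illustration only."""

    def utcoffset(self, dt):
        return timedelta(hours=-3, minutes=-30)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "HHW"


naive = datetime(2002, 12, 25, 8, 15)
aware = naive.replace(tzinfo=HalfHourWest())

# Brackets make the empty substitutions visible.
naive_text = naive.strftime("%H:%M [%z] [%Z]")  # '08:15 [] []'
aware_text = aware.strftime("%H:%M [%z] [%Z]")  # '08:15 [-0330] [HHW]'
```

Only ``%H`` and ``%M`` are used besides the two time zone codes, since those are among the codes required by the C standard.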
+ +For an aware object: + +``%z`` + :meth:`utcoffset` is transformed into a 5-character string of the form +HHMM or + -HHMM, where HH is a 2-digit string giving the number of UTC offset hours, and + MM is a 2-digit string giving the number of UTC offset minutes. For example, if + :meth:`utcoffset` returns ``timedelta(hours=-3, minutes=-30)``, ``%z`` is + replaced with the string ``'-0330'``. + +``%Z`` + If :meth:`tzname` returns ``None``, ``%Z`` is replaced by an empty string. + Otherwise ``%Z`` is replaced by the returned value, which must be a string. + +The full set of format codes supported varies across platforms, because Python +calls the platform C library's :func:`strftime` function, and platform +variations are common. The documentation for Python's :mod:`time` module lists +the format codes that the C standard (1989 version) requires, and those work on +all platforms with a standard C implementation. Note that the 1999 version of +the C standard added additional format codes. + +The exact range of years for which :meth:`strftime` works also varies across +platforms. Regardless of platform, years before 1900 cannot be used. + +.. % %% This example is obsolete, since strptime is now supported by datetime. +.. % +.. % \subsection{Examples} +.. % +.. % \subsubsection{Creating Datetime Objects from Formatted Strings} +.. % +.. % The \class{datetime} class does not directly support parsing formatted time +.. % strings. You can use \function{time.strptime} to do the parsing and create +.. % a \class{datetime} object from the tuple it returns: +.. % +.. % \begin{verbatim} +.. % >>> s = "2005-12-06T12:13:14" +.. % >>> from datetime import datetime +.. % >>> from time import strptime +.. % >>> datetime(*strptime(s, "%Y-%m-%dT%H:%M:%S")[0:6]) +.. % datetime.datetime(2005, 12, 6, 12, 13, 14) +.. % \end{verbatim} +.. 
% + diff --git a/Doc/library/dbhash.rst b/Doc/library/dbhash.rst new file mode 100644 index 0000000..b5c9590 --- /dev/null +++ b/Doc/library/dbhash.rst @@ -0,0 +1,114 @@ + +:mod:`dbhash` --- DBM-style interface to the BSD database library +================================================================= + +.. module:: dbhash + :synopsis: DBM-style interface to the BSD database library. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. index:: module: bsddb + +The :mod:`dbhash` module provides a function to open databases using the BSD +``db`` library. This module mirrors the interface of the other Python database +modules that provide access to DBM-style databases. The :mod:`bsddb` module is +required to use :mod:`dbhash`. + +This module provides an exception and a function: + + +.. exception:: error + + Exception raised on database errors other than :exc:`KeyError`. It is a synonym + for :exc:`bsddb.error`. + + +.. function:: open(path[, flag[, mode]]) + + Open a ``db`` database and return the database object. The *path* argument is + the name of the database file. + + The *flag* argument can be: + + +---------+-------------------------------------------+ + | Value | Meaning | + +=========+===========================================+ + | ``'r'`` | Open existing database for reading only | + | | (default) | + +---------+-------------------------------------------+ + | ``'w'`` | Open existing database for reading and | + | | writing | + +---------+-------------------------------------------+ + | ``'c'`` | Open database for reading and writing, | + | | creating it if it doesn't exist | + +---------+-------------------------------------------+ + | ``'n'`` | Always create a new, empty database, open | + | | for reading and writing | + +---------+-------------------------------------------+ + + For platforms on which the BSD ``db`` library supports locking, an ``'l'`` + can be appended to indicate that locking should be used. 
+ + The optional *mode* parameter is used to indicate the Unix permission bits that + should be set if a new database must be created; this will be masked by the + current umask value for the process. + + +.. seealso:: + + Module :mod:`anydbm` + Generic interface to ``dbm``\ -style databases. + + Module :mod:`bsddb` + Lower-level interface to the BSD ``db`` library. + + Module :mod:`whichdb` + Utility module used to determine the type of an existing database. + + +.. _dbhash-objects: + +Database Objects +---------------- + +The database objects returned by :func:`open` provide the methods common to all +the DBM-style databases and mapping objects. The following methods are +available in addition to the standard methods. + + +.. method:: dbhash.first() + + It's possible to loop over every key/value pair in the database using this + method and the :meth:`next` method. The traversal is ordered by the database's + internal hash values, and won't be sorted by the key values. This method + returns the starting key. + + +.. method:: dbhash.last() + + Return the last key/value pair in a database traversal. This may be used to + begin a reverse-order traversal; see :meth:`previous`. + + +.. method:: dbhash.next() + + Returns the next key/value pair in a database traversal. The following code + prints every key in the database ``db``, without having to create a list in + memory that contains them all:: + + print db.first() + for i in range(1, len(db)): + print db.next() + + +.. method:: dbhash.previous() + + Returns the previous key/value pair in a forward-traversal of the database. In + conjunction with :meth:`last`, this may be used to implement a reverse-order + traversal. + + +.. method:: dbhash.sync() + + This method forces any unwritten data to be written to the disk. 
+ diff --git a/Doc/library/dbm.rst b/Doc/library/dbm.rst new file mode 100644 index 0000000..52923e8 --- /dev/null +++ b/Doc/library/dbm.rst @@ -0,0 +1,74 @@ + +:mod:`dbm` --- Simple "database" interface +========================================== + +.. module:: dbm + :platform: Unix + :synopsis: The standard "database" interface, based on ndbm. + + +The :mod:`dbm` module provides an interface to the Unix "(n)dbm" library. Dbm +objects behave like mappings (dictionaries), except that keys and values are +always strings. Printing a dbm object doesn't print the keys and values, and the +:meth:`items` and :meth:`values` methods are not supported. + +This module can be used with the "classic" ndbm interface, the BSD DB +compatibility interface, or the GNU GDBM compatibility interface. On Unix, the +:program:`configure` script will attempt to locate the appropriate header file +to simplify building this module. + +The module defines the following: + + +.. exception:: error + + Raised on dbm-specific errors, such as I/O errors. :exc:`KeyError` is raised for + general mapping errors like specifying an incorrect key. + + +.. data:: library + + Name of the ``ndbm`` implementation library used. + + +.. function:: open(filename[, flag[, mode]]) + + Open a dbm database and return a dbm object. The *filename* argument is the + name of the database file (without the :file:`.dir` or :file:`.pag` extensions; + note that the BSD DB implementation of the interface will append the extension + :file:`.db` and only create one file). 
+ + The optional *flag* argument must be one of these values: + + +---------+-------------------------------------------+ + | Value | Meaning | + +=========+===========================================+ + | ``'r'`` | Open existing database for reading only | + | | (default) | + +---------+-------------------------------------------+ + | ``'w'`` | Open existing database for reading and | + | | writing | + +---------+-------------------------------------------+ + | ``'c'`` | Open database for reading and writing, | + | | creating it if it doesn't exist | + +---------+-------------------------------------------+ + | ``'n'`` | Always create a new, empty database, open | + | | for reading and writing | + +---------+-------------------------------------------+ + + The optional *mode* argument is the Unix mode of the file, used only when the + database has to be created. It defaults to octal ``0666`` (and will be + modified by the prevailing umask). + + +.. seealso:: + + Module :mod:`anydbm` + Generic interface to ``dbm``\ -style databases. + + Module :mod:`gdbm` + Similar interface to the GNU GDBM library. + + Module :mod:`whichdb` + Utility module used to determine the type of an existing database. + diff --git a/Doc/library/decimal.rst b/Doc/library/decimal.rst new file mode 100644 index 0000000..1d17109 --- /dev/null +++ b/Doc/library/decimal.rst @@ -0,0 +1,1289 @@ + +:mod:`decimal` --- Decimal floating point arithmetic +==================================================== + +.. module:: decimal + :synopsis: Implementation of the General Decimal Arithmetic Specification. + + +.. moduleauthor:: Eric Price <eprice at tjhsst.edu> +.. moduleauthor:: Facundo Batista <facundo at taniquetil.com.ar> +.. moduleauthor:: Raymond Hettinger <python at rcn.com> +.. moduleauthor:: Aahz <aahz at pobox.com> +.. moduleauthor:: Tim Peters <tim.one at comcast.net> + + +.. sectionauthor:: Raymond D. Hettinger <python at rcn.com> + + +.. 
versionadded:: 2.4 + +The :mod:`decimal` module provides support for decimal floating point +arithmetic. It offers several advantages over the :class:`float` datatype: + +* Decimal numbers can be represented exactly. In contrast, numbers like + :const:`1.1` do not have an exact representation in binary floating point. End + users typically would not expect :const:`1.1` to display as + :const:`1.1000000000000001` as it does with binary floating point. + +* The exactness carries over into arithmetic. In decimal floating point, ``0.1 + + 0.1 + 0.1 - 0.3`` is exactly equal to zero. In binary floating point, the result + is :const:`5.5511151231257827e-017`. While near to zero, the differences + prevent reliable equality testing and differences can accumulate. For this + reason, decimal would be preferred in accounting applications which have strict + equality invariants. + +* The decimal module incorporates a notion of significant places so that ``1.30 + + 1.20`` is :const:`2.50`. The trailing zero is kept to indicate significance. + This is the customary presentation for monetary applications. For + multiplication, the "schoolbook" approach uses all the figures in the + multiplicands. For instance, ``1.3 * 1.2`` gives :const:`1.56` while ``1.30 * + 1.20`` gives :const:`1.5600`. + +* Unlike hardware based binary floating point, the decimal module has a user + settable precision (defaulting to 28 places) which can be as large as needed for + a given problem:: + + >>> getcontext().prec = 6 + >>> Decimal(1) / Decimal(7) + Decimal("0.142857") + >>> getcontext().prec = 28 + >>> Decimal(1) / Decimal(7) + Decimal("0.1428571428571428571428571429") + +* Both binary and decimal floating point are implemented in terms of published + standards. While the built-in float type exposes only a modest portion of its + capabilities, the decimal module exposes all required parts of the standard. + When needed, the programmer has full control over rounding and signal handling. 
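The first three points in the list above can be checked directly. The following is a minimal sketch; it compares string forms rather than ``repr()`` output, because the exact ``repr()`` quoting of :class:`Decimal` differs between Python versions, and the variable names are illustrative only.

```python
from decimal import Decimal

# Exactness: the decimal sum cancels to zero, the binary float sum does not.
decimal_sum = Decimal("0.1") + Decimal("0.1") + Decimal("0.1") - Decimal("0.3")
float_sum = 0.1 + 0.1 + 0.1 - 0.3

# Significance: trailing zeros are preserved by addition and
# accumulated by multiplication.
total = Decimal("1.30") + Decimal("1.20")
short_product = Decimal("1.3") * Decimal("1.2")
long_product = Decimal("1.30") * Decimal("1.20")

print(decimal_sum == 0, float_sum == 0)
print(total, short_product, long_product)
```

The operands are built from strings so that no binary rounding happens before the decimal arithmetic starts.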
+ +The module design is centered around three concepts: the decimal number, the +context for arithmetic, and signals. + +A decimal number is immutable. It has a sign, coefficient digits, and an +exponent. To preserve significance, the coefficient digits do not truncate +trailing zeroes. Decimals also include special values such as +:const:`Infinity`, :const:`-Infinity`, and :const:`NaN`. The standard also +differentiates :const:`-0` from :const:`+0`. + +The context for arithmetic is an environment specifying precision, rounding +rules, limits on exponents, flags indicating the results of operations, and trap +enablers which determine whether signals are treated as exceptions. Rounding +options include :const:`ROUND_CEILING`, :const:`ROUND_DOWN`, +:const:`ROUND_FLOOR`, :const:`ROUND_HALF_DOWN`, :const:`ROUND_HALF_EVEN`, +:const:`ROUND_HALF_UP`, and :const:`ROUND_UP`. + +Signals are groups of exceptional conditions arising during the course of +computation. Depending on the needs of the application, signals may be ignored, +considered as informational, or treated as exceptions. The signals in the +decimal module are: :const:`Clamped`, :const:`InvalidOperation`, +:const:`DivisionByZero`, :const:`Inexact`, :const:`Rounded`, :const:`Subnormal`, +:const:`Overflow`, and :const:`Underflow`. + +For each signal there is a flag and a trap enabler. When a signal is +encountered, its flag is incremented from zero and, then, if the trap enabler is +set to one, an exception is raised. Flags are sticky, so the user needs to +reset them before monitoring a calculation. + + +.. seealso:: + + IBM's General Decimal Arithmetic Specification, `The General Decimal Arithmetic + Specification <http://www2.hursley.ibm.com/decimal/decarith.html>`_. + + IEEE standard 854-1987, `Unofficial IEEE 854 Text + <http://www.cs.berkeley.edu/~ejr/projects/754/private/drafts/854-1987/dir.html>`_. + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. 
_decimal-tutorial: + +Quick-start Tutorial +-------------------- + +The usual start to using decimals is importing the module, viewing the current +context with :func:`getcontext` and, if necessary, setting new values for +precision, rounding, or enabled traps:: + + >>> from decimal import * + >>> getcontext() + Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, + capitals=1, flags=[], traps=[Overflow, InvalidOperation, + DivisionByZero]) + + >>> getcontext().prec = 7 # Set a new precision + +Decimal instances can be constructed from integers, strings, or tuples. To +create a Decimal from a :class:`float`, first convert it to a string. This +serves as an explicit reminder of the details of the conversion (including +representation error). Decimal numbers include special values such as +:const:`NaN` which stands for "Not a number", positive and negative +:const:`Infinity`, and :const:`-0`. :: + + >>> Decimal(10) + Decimal("10") + >>> Decimal("3.14") + Decimal("3.14") + >>> Decimal((0, (3, 1, 4), -2)) + Decimal("3.14") + >>> Decimal(str(2.0 ** 0.5)) + Decimal("1.41421356237") + >>> Decimal("NaN") + Decimal("NaN") + >>> Decimal("-Infinity") + Decimal("-Infinity") + +The significance of a new Decimal is determined solely by the number of digits +input. Context precision and rounding only come into play during arithmetic +operations. :: + + >>> getcontext().prec = 6 + >>> Decimal('3.0') + Decimal("3.0") + >>> Decimal('3.1415926535') + Decimal("3.1415926535") + >>> Decimal('3.1415926535') + Decimal('2.7182818285') + Decimal("5.85987") + >>> getcontext().rounding = ROUND_UP + >>> Decimal('3.1415926535') + Decimal('2.7182818285') + Decimal("5.85988") + +Decimals interact well with much of the rest of Python. 
Here is a small decimal +floating point flying circus:: + + >>> data = list(map(Decimal, '1.34 1.87 3.45 2.35 1.00 0.03 9.25'.split())) + >>> max(data) + Decimal("9.25") + >>> min(data) + Decimal("0.03") + >>> sorted(data) + [Decimal("0.03"), Decimal("1.00"), Decimal("1.34"), Decimal("1.87"), + Decimal("2.35"), Decimal("3.45"), Decimal("9.25")] + >>> sum(data) + Decimal("19.29") + >>> a,b,c = data[:3] + >>> str(a) + '1.34' + >>> float(a) + 1.3400000000000001 + >>> round(a, 1) # round() first converts to binary floating point + 1.3 + >>> int(a) + 1 + >>> a * 5 + Decimal("6.70") + >>> a * b + Decimal("2.5058") + >>> c % a + Decimal("0.77") + +The :meth:`quantize` method rounds a number to a fixed exponent. This method is +useful for monetary applications that often round results to a fixed number of +places:: + + >>> Decimal('7.325').quantize(Decimal('.01'), rounding=ROUND_DOWN) + Decimal("7.32") + >>> Decimal('7.325').quantize(Decimal('1.'), rounding=ROUND_UP) + Decimal("8") + +As shown above, the :func:`getcontext` function accesses the current context and +allows the settings to be changed. This approach meets the needs of most +applications. + +For more advanced work, it may be useful to create alternate contexts using the +:class:`Context` constructor. To make an alternate context active, use the :func:`setcontext` +function. + +In accordance with the standard, the :mod:`decimal` module provides two ready-to-use +standard contexts, :const:`BasicContext` and :const:`ExtendedContext`.
The +former is especially useful for debugging because many of the traps are +enabled:: + + >>> myothercontext = Context(prec=60, rounding=ROUND_HALF_DOWN) + >>> setcontext(myothercontext) + >>> Decimal(1) / Decimal(7) + Decimal("0.142857142857142857142857142857142857142857142857142857142857") + + >>> ExtendedContext + Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, + capitals=1, flags=[], traps=[]) + >>> setcontext(ExtendedContext) + >>> Decimal(1) / Decimal(7) + Decimal("0.142857143") + >>> Decimal(42) / Decimal(0) + Decimal("Infinity") + + >>> setcontext(BasicContext) + >>> Decimal(42) / Decimal(0) + Traceback (most recent call last): + File "<pyshell#143>", line 1, in -toplevel- + Decimal(42) / Decimal(0) + DivisionByZero: x / 0 + +Contexts also have signal flags for monitoring exceptional conditions +encountered during computations. The flags remain set until explicitly cleared, +so it is best to clear the flags before each set of monitored computations by +using the :meth:`clear_flags` method. :: + + >>> setcontext(ExtendedContext) + >>> getcontext().clear_flags() + >>> Decimal(355) / Decimal(113) + Decimal("3.14159292") + >>> getcontext() + Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, + capitals=1, flags=[Inexact, Rounded], traps=[]) + +The *flags* entry shows that the rational approximation to :const:`Pi` was +rounded (digits beyond the context precision were thrown away) and that the +result is inexact (some of the discarded digits were non-zero). + +Individual traps are set using the dictionary in the :attr:`traps` field of a +context:: + + >>> Decimal(1) / Decimal(0) + Decimal("Infinity") + >>> getcontext().traps[DivisionByZero] = 1 + >>> Decimal(1) / Decimal(0) + Traceback (most recent call last): + File "<pyshell#112>", line 1, in -toplevel- + Decimal(1) / Decimal(0) + DivisionByZero: x / 0 + +Most programs adjust the current context only once, at the beginning of the +program. 
And, in many applications, data is converted to :class:`Decimal` with +a single cast inside a loop. With context set and decimals created, the bulk of +the program manipulates the data no differently than with other Python numeric +types. + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. _decimal-decimal: + +Decimal objects +--------------- + + +.. class:: Decimal([value [, context]]) + + Constructs a new :class:`Decimal` object based on *value*. + + *value* can be an integer, string, tuple, or another :class:`Decimal` object. If + no *value* is given, returns ``Decimal("0")``. If *value* is a string, it + should conform to the decimal numeric string syntax:: + + sign ::= '+' | '-' + digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' + indicator ::= 'e' | 'E' + digits ::= digit [digit]... + decimal-part ::= digits '.' [digits] | ['.'] digits + exponent-part ::= indicator [sign] digits + infinity ::= 'Infinity' | 'Inf' + nan ::= 'NaN' [digits] | 'sNaN' [digits] + numeric-value ::= decimal-part [exponent-part] | infinity + numeric-string ::= [sign] numeric-value | [sign] nan + + If *value* is a :class:`tuple`, it should have three components, a sign + (:const:`0` for positive or :const:`1` for negative), a :class:`tuple` of + digits, and an integer exponent. For example, ``Decimal((0, (1, 4, 1, 4), -3))`` + returns ``Decimal("1.414")``. + + The *context* precision does not affect how many digits are stored. That is + determined exclusively by the number of digits in *value*. For example, + ``Decimal("3.00000")`` records all five zeroes even if the context precision is + only three. + + The purpose of the *context* argument is determining what to do if *value* is a + malformed string. If the context traps :const:`InvalidOperation`, an exception + is raised; otherwise, the constructor returns a new Decimal with the value of + :const:`NaN`. + + Once constructed, :class:`Decimal` objects are immutable.
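The constructor rules described above can be tried out as follows. This is a hedged sketch, not part of the module: the names ``lenient`` and ``strict`` are illustrative, and the malformed input string is an arbitrary example.

```python
from decimal import Context, Decimal, InvalidOperation, localcontext

# Tuple form: (sign, digits, exponent) gives 1414 * 10**-3.
from_tuple = Decimal((0, (1, 4, 1, 4), -3))

with localcontext() as ctx:
    ctx.prec = 3
    # Construction keeps every digit, whatever the context precision is.
    stored = Decimal("3.00000")
    # Arithmetic (here unary plus) is what applies the context precision.
    rounded = +stored

# With InvalidOperation untrapped, a malformed string yields NaN.
lenient = Context(traps=[])
quiet = Decimal("not a number", lenient)

# With InvalidOperation trapped, the same input raises an exception.
strict = Context(traps=[InvalidOperation])
try:
    Decimal("not a number", strict)
    raised = False
except InvalidOperation:
    raised = True

print(from_tuple, stored, rounded, quiet, raised)
```

Passing a context explicitly, as done here, avoids depending on whichever traps happen to be enabled in the current thread's context.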
+ +Decimal floating point objects share many properties with the other builtin +numeric types such as :class:`float` and :class:`int`. All of the usual math +operations and special methods apply. Likewise, decimal objects can be copied, +pickled, printed, used as dictionary keys, used as set elements, compared, +sorted, and coerced to another type (such as :class:`float` or :class:`long`). + +In addition to the standard numeric properties, decimal floating point objects +also have a number of specialized methods: + + +.. method:: Decimal.adjusted() + + Return the adjusted exponent after shifting out the coefficient's rightmost + digits until only the lead digit remains: ``Decimal("321e+5").adjusted()`` + returns seven. Used for determining the position of the most significant digit + with respect to the decimal point. + + +.. method:: Decimal.as_tuple() + + Returns a tuple representation of the number: ``(sign, digittuple, exponent)``. + + +.. method:: Decimal.compare(other[, context]) + + Compares like :meth:`__cmp__` but returns a decimal instance:: + + a or b is a NaN ==> Decimal("NaN") + a < b ==> Decimal("-1") + a == b ==> Decimal("0") + a > b ==> Decimal("1") + + +.. method:: Decimal.max(other[, context]) + + Like ``max(self, other)`` except that the context rounding rule is applied + before returning and that :const:`NaN` values are either signalled or ignored + (depending on the context and whether they are signaling or quiet). + + +.. method:: Decimal.min(other[, context]) + + Like ``min(self, other)`` except that the context rounding rule is applied + before returning and that :const:`NaN` values are either signalled or ignored + (depending on the context and whether they are signaling or quiet). + + +.. method:: Decimal.normalize([context]) + + Normalize the number by stripping the rightmost trailing zeroes and converting + any result equal to :const:`Decimal("0")` to :const:`Decimal("0e0")`. 
Used for + producing canonical values for members of an equivalence class. For example, + ``Decimal("32.100")`` and ``Decimal("0.321000e+2")`` both normalize to the + equivalent value ``Decimal("32.1")``. + + +.. method:: Decimal.quantize(exp [, rounding[, context[, watchexp]]]) + + Quantize makes the exponent the same as *exp*. Searches for a rounding method + in *rounding*, then in *context*, and then in the current context. + + If *watchexp* is set (default), then an error is returned whenever the resulting + exponent is greater than :attr:`Emax` or less than :attr:`Etiny`. + + +.. method:: Decimal.remainder_near(other[, context]) + + Computes the modulo as either a positive or negative value depending on which is + closest to zero. For instance, ``Decimal(10).remainder_near(6)`` returns + ``Decimal("-2")`` which is closer to zero than ``Decimal("4")``. + + If both are equally close, the one chosen will have the same sign as *self*. + + +.. method:: Decimal.same_quantum(other[, context]) + + Test whether self and other have the same exponent or whether both are + :const:`NaN`. + + +.. method:: Decimal.sqrt([context]) + + Return the square root to full precision. + + +.. method:: Decimal.to_eng_string([context]) + + Convert to an engineering-type string. + + Engineering notation has an exponent which is a multiple of 3, so there are up + to 3 digits left of the decimal place. For example, converts + ``Decimal('123E+1')`` to ``Decimal("1.23E+3")`` + + +.. method:: Decimal.to_integral([rounding[, context]]) + + Rounds to the nearest integer without signaling :const:`Inexact` or + :const:`Rounded`. If given, applies *rounding*; otherwise, uses the rounding + method in either the supplied *context* or the current context. + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. _decimal-context: + +Context objects +--------------- + +Contexts are environments for arithmetic operations. 
They govern precision, set +rules for rounding, determine which signals are treated as exceptions, and limit +the range for exponents. + +Each thread has its own current context which is accessed or changed using the +:func:`getcontext` and :func:`setcontext` functions: + + +.. function:: getcontext() + + Return the current context for the active thread. + + +.. function:: setcontext(c) + + Set the current context for the active thread to *c*. + +Beginning with Python 2.5, you can also use the :keyword:`with` statement and +the :func:`localcontext` function to temporarily change the active context. + + +.. function:: localcontext([c]) + + Return a context manager that will set the current context for the active thread + to a copy of *c* on entry to the with-statement and restore the previous context + when exiting the with-statement. If no context is specified, a copy of the + current context is used. + + .. versionadded:: 2.5 + + For example, the following code sets the current decimal precision to 42 places, + performs a calculation, and then automatically restores the previous context:: + + from __future__ import with_statement + from decimal import localcontext + + with localcontext() as ctx: + ctx.prec = 42 # Perform a high precision calculation + s = calculate_something() + s = +s # Round the final result back to the default precision + +New contexts can also be created using the :class:`Context` constructor +described below. In addition, the module provides three pre-made contexts: + + +.. class:: BasicContext + + This is a standard context defined by the General Decimal Arithmetic + Specification. Precision is set to nine. Rounding is set to + :const:`ROUND_HALF_UP`. All flags are cleared. All traps are enabled (treated + as exceptions) except :const:`Inexact`, :const:`Rounded`, and + :const:`Subnormal`. + + Because many of the traps are enabled, this context is useful for debugging. + + +.. 
class:: ExtendedContext + + This is a standard context defined by the General Decimal Arithmetic + Specification. Precision is set to nine. Rounding is set to + :const:`ROUND_HALF_EVEN`. All flags are cleared. No traps are enabled (so that + exceptions are not raised during computations). + + Because the traps are disabled, this context is useful for applications that + prefer to have a result value of :const:`NaN` or :const:`Infinity` instead of + raising exceptions. This allows an application to complete a run in the + presence of conditions that would otherwise halt the program. + + +.. class:: DefaultContext + + This context is used by the :class:`Context` constructor as a prototype for new + contexts. Changing a field (such as precision) has the effect of changing the + default for new contexts created by the :class:`Context` constructor. + + This context is most useful in multi-threaded environments. Changing one of the + fields before threads are started has the effect of setting system-wide + defaults. Changing the fields after threads have started is not recommended as + it would require thread synchronization to prevent race conditions. + + In single-threaded environments, it is preferable to not use this context at + all. Instead, simply create contexts explicitly as described below. + + The default values are precision=28, rounding=ROUND_HALF_EVEN, and enabled traps + for Overflow, InvalidOperation, and DivisionByZero. + +In addition to the three supplied contexts, new contexts can be created with the +:class:`Context` constructor. + + +.. class:: Context(prec=None, rounding=None, traps=None, flags=None, Emin=None, Emax=None, capitals=1) + + Creates a new context. If a field is not specified or is :const:`None`, the + default values are copied from the :const:`DefaultContext`. If the *flags* + field is not specified or is :const:`None`, all flags are cleared.
+ + The *prec* field is a positive integer that sets the precision for arithmetic + operations in the context. + + The *rounding* option is one of: + + * :const:`ROUND_CEILING` (towards :const:`Infinity`), + * :const:`ROUND_DOWN` (towards zero), + * :const:`ROUND_FLOOR` (towards :const:`-Infinity`), + * :const:`ROUND_HALF_DOWN` (to nearest with ties going towards zero), + * :const:`ROUND_HALF_EVEN` (to nearest with ties going to nearest even integer), + * :const:`ROUND_HALF_UP` (to nearest with ties going away from zero), or + * :const:`ROUND_UP` (away from zero). + + The *traps* and *flags* fields list any signals to be set. Generally, new + contexts should only set traps and leave the flags clear. + + The *Emin* and *Emax* fields are integers specifying the outer limits allowable + for exponents. + + The *capitals* field is either :const:`0` or :const:`1` (the default). If set to + :const:`1`, exponents are printed with a capital :const:`E`; otherwise, a + lowercase :const:`e` is used: :const:`Decimal('6.02e+23')`. + +The :class:`Context` class defines several general purpose methods as well as a +large number of methods for doing arithmetic directly in a given context. + + +.. method:: Context.clear_flags() + + Resets all of the flags to :const:`0`. + + +.. method:: Context.copy() + + Return a duplicate of the context. + + +.. method:: Context.create_decimal(num) + + Creates a new Decimal instance from *num* but using *self* as context. Unlike + the :class:`Decimal` constructor, the context precision, rounding method, flags, + and traps are applied to the conversion. + + This is useful because constants are often given to a greater precision than is + needed by the application. Another benefit is that rounding immediately + eliminates unintended effects from digits beyond the current precision. 
In the + following example, using unrounded inputs means that adding zero to a sum can + change the result:: + + >>> getcontext().prec = 3 + >>> Decimal("3.4445") + Decimal("1.0023") + Decimal("4.45") + >>> Decimal("3.4445") + Decimal(0) + Decimal("1.0023") + Decimal("4.44") + + +.. method:: Context.Etiny() + + Returns a value equal to ``Emin - prec + 1`` which is the minimum exponent value + for subnormal results. When underflow occurs, the exponent is set to + :const:`Etiny`. + + +.. method:: Context.Etop() + + Returns a value equal to ``Emax - prec + 1``. + +The usual approach to working with decimals is to create :class:`Decimal` +instances and then apply arithmetic operations which take place within the +current context for the active thread. An alternate approach is to use context +methods for calculating within a specific context. The methods are similar to +those for the :class:`Decimal` class and are only briefly recounted here. + + +.. method:: Context.abs(x) + + Returns the absolute value of *x*. + + +.. method:: Context.add(x, y) + + Return the sum of *x* and *y*. + + +.. method:: Context.compare(x, y) + + Compares values numerically. + + Like :meth:`__cmp__` but returns a decimal instance:: + + a or b is a NaN ==> Decimal("NaN") + a < b ==> Decimal("-1") + a == b ==> Decimal("0") + a > b ==> Decimal("1") + + +.. method:: Context.divide(x, y) + + Return *x* divided by *y*. + + +.. method:: Context.divmod(x, y) + + Divides *x* by *y* and returns a tuple containing the integer part of the + quotient and the remainder. + + +.. method:: Context.max(x, y) + + Compare two values numerically and return the maximum. + + If they are numerically equal then the left-hand operand is chosen as the + result. + + +.. method:: Context.min(x, y) + + Compare two values numerically and return the minimum. + + If they are numerically equal then the left-hand operand is chosen as the + result. + + +.. method:: Context.minus(x) + + Minus corresponds to the unary prefix minus operator in Python. + + +.. 
method:: Context.multiply(x, y) + + Return the product of *x* and *y*. + + +.. method:: Context.normalize(x) + + Normalize reduces an operand to its simplest form. + + Essentially a :meth:`plus` operation with all trailing zeros removed from the + result. + + +.. method:: Context.plus(x) + + Plus corresponds to the unary prefix plus operator in Python. This operation + applies the context precision and rounding, so it is *not* an identity + operation. + + +.. method:: Context.power(x, y[, modulo]) + + Return ``x ** y`` to the *modulo* if given. + + The right-hand operand must be a whole number whose integer part (after any + exponent has been applied) has no more than 9 digits and whose fractional part + (if any) is all zeros before any rounding. The operand may be positive, + negative, or zero; if negative, the absolute value of the power is used, and the + left-hand operand is inverted (divided into 1) before use. + + If the increased precision needed for the intermediate calculations exceeds the + capabilities of the implementation then an :const:`InvalidOperation` condition + is signaled. + + If, when raising to a negative power, an underflow occurs during the division + into 1, the operation is not halted at that point but continues. + + +.. method:: Context.quantize(x, y) + + Returns a value equal to *x* after rounding and having the exponent of *y*. + + Unlike other operations, if the length of the coefficient after the quantize + operation would be greater than precision, then an :const:`InvalidOperation` is + signaled. This guarantees that, unless there is an error condition, the + quantized exponent is always equal to that of the right-hand operand. + + Also unlike other operations, quantize never signals Underflow, even if the + result is subnormal and inexact. + + +.. method:: Context.remainder(x, y) + + Returns the remainder from integer division. + + The sign of the result, if non-zero, is the same as that of the original + dividend. + + +.. 
method:: Context.remainder_near(x, y) + + Computes the modulo as either a positive or negative value depending on which is + closest to zero. For instance, ``Decimal(10).remainder_near(6)`` returns + ``Decimal("-2")`` which is closer to zero than ``Decimal("4")``. + + If both are equally close, the one chosen will have the same sign as *x*. + + +.. method:: Context.same_quantum(x, y) + + Test whether *x* and *y* have the same exponent or whether both are + :const:`NaN`. + + +.. method:: Context.sqrt(x) + + Return the square root of *x* to full precision. + + +.. method:: Context.subtract(x, y) + + Return the difference between *x* and *y*. + + +.. method:: Context.to_eng_string(x) + + Convert *x* to an engineering-type string. + + Engineering notation has an exponent which is a multiple of 3, so there are up + to 3 digits left of the decimal place. For example, this converts + ``Decimal('123E+1')`` to ``'1.23E+3'``. + + +.. method:: Context.to_integral(x) + + Rounds to the nearest integer without signaling :const:`Inexact` or + :const:`Rounded`. + + +.. method:: Context.to_sci_string(x) + + Converts a number to a string using scientific notation. + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. _decimal-signals: + +Signals +------- + +Signals represent conditions that arise during computation. Each corresponds to +one context flag and one context trap enabler. + +The context flag is incremented whenever the condition is encountered. After the +computation, flags may be checked for informational purposes (for instance, to +determine whether a computation was exact). After checking the flags, be sure to +clear all flags before starting the next computation. + +If the context's trap enabler is set for the signal, then the condition causes a +Python exception to be raised. For example, if the :class:`DivisionByZero` trap +is set, then a :exc:`DivisionByZero` exception is raised upon encountering the +condition. + + +.. 
class:: Clamped + + Altered an exponent to fit representation constraints. + + Typically, clamping occurs when an exponent falls outside the context's + :attr:`Emin` and :attr:`Emax` limits. If possible, the exponent is reduced to + fit by adding zeroes to the coefficient. + + +.. class:: DecimalException + + Base class for other signals and a subclass of :exc:`ArithmeticError`. + + +.. class:: DivisionByZero + + Signals the division of a non-infinite number by zero. + + Can occur with division, modulo division, or when raising a number to a negative + power. If this signal is not trapped, returns :const:`Infinity` or + :const:`-Infinity` with the sign determined by the inputs to the calculation. + + +.. class:: Inexact + + Indicates that rounding occurred and the result is not exact. + + Signals when non-zero digits were discarded during rounding. The rounded result + is returned. The signal flag or trap is used to detect when results are + inexact. + + +.. class:: InvalidOperation + + An invalid operation was performed. + + Indicates that an operation was requested that does not make sense. If not + trapped, returns :const:`NaN`. Possible causes include:: + + Infinity - Infinity + 0 * Infinity + Infinity / Infinity + x % 0 + Infinity % x + x._rescale( non-integer ) + sqrt(-x) and x > 0 + 0 ** 0 + x ** (non-integer) + x ** Infinity + + +.. class:: Overflow + + Numerical overflow. + + Indicates the exponent is larger than :attr:`Emax` after rounding has occurred. + If not trapped, the result depends on the rounding mode, either pulling inward + to the largest representable finite number or rounding outward to + :const:`Infinity`. In either case, :class:`Inexact` and :class:`Rounded` are + also signaled. + + +.. class:: Rounded + + Rounding occurred though possibly no information was lost. + + Signaled whenever rounding discards digits; even if those digits are zero (such + as rounding :const:`5.00` to :const:`5.0`). If not trapped, returns the result + unchanged. 
This signal is used to detect loss of significant digits. + + +.. class:: Subnormal + + Exponent was lower than :attr:`Emin` prior to rounding. + + Occurs when an operation result is subnormal (the exponent is too small). If not + trapped, returns the result unchanged. + + +.. class:: Underflow + + Numerical underflow with result rounded to zero. + + Occurs when a subnormal result is pushed to zero by rounding. :class:`Inexact` + and :class:`Subnormal` are also signaled. + +The following table summarizes the hierarchy of signals:: + + exceptions.ArithmeticError(exceptions.Exception) + DecimalException + Clamped + DivisionByZero(DecimalException, exceptions.ZeroDivisionError) + Inexact + Overflow(Inexact, Rounded) + Underflow(Inexact, Rounded, Subnormal) + InvalidOperation + Rounded + Subnormal + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. _decimal-notes: + +Floating Point Notes +-------------------- + + +Mitigating round-off error with increased precision +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The use of decimal floating point eliminates decimal representation error +(making it possible to represent :const:`0.1` exactly); however, some operations +can still incur round-off error when non-zero digits exceed the fixed precision. + +The effects of round-off error can be amplified by the addition or subtraction +of nearly offsetting quantities resulting in loss of significance. Knuth +provides two instructive examples where rounded floating point arithmetic with +insufficient precision causes the breakdown of the associative and distributive +properties of addition:: + + # Examples from Seminumerical Algorithms, Section 4.2.2. 
+ >>> from decimal import Decimal, getcontext + >>> getcontext().prec = 8 + + >>> u, v, w = Decimal(11111113), Decimal(-11111111), Decimal('7.51111111') + >>> (u + v) + w + Decimal("9.5111111") + >>> u + (v + w) + Decimal("10") + + >>> u, v, w = Decimal(20000), Decimal(-6), Decimal('6.0000003') + >>> (u*v) + (u*w) + Decimal("0.01") + >>> u * (v+w) + Decimal("0.0060000") + +The :mod:`decimal` module makes it possible to restore the identities by +expanding the precision sufficiently to avoid loss of significance:: + + >>> getcontext().prec = 20 + >>> u, v, w = Decimal(11111113), Decimal(-11111111), Decimal('7.51111111') + >>> (u + v) + w + Decimal("9.51111111") + >>> u + (v + w) + Decimal("9.51111111") + >>> + >>> u, v, w = Decimal(20000), Decimal(-6), Decimal('6.0000003') + >>> (u*v) + (u*w) + Decimal("0.0060000") + >>> u * (v+w) + Decimal("0.0060000") + + +Special values +^^^^^^^^^^^^^^ + +The number system for the :mod:`decimal` module provides special values +including :const:`NaN`, :const:`sNaN`, :const:`-Infinity`, :const:`Infinity`, +and two zeroes, :const:`+0` and :const:`-0`. + +Infinities can be constructed directly with: ``Decimal('Infinity')``. Also, +they can arise from dividing by zero when the :exc:`DivisionByZero` signal is +not trapped. Likewise, when the :exc:`Overflow` signal is not trapped, infinity +can result from rounding beyond the limits of the largest representable number. + +The infinities are signed (affine) and can be used in arithmetic operations +where they get treated as very large, indeterminate numbers. For instance, +adding a constant to infinity gives another infinite result. + +Some operations are indeterminate and return :const:`NaN`, or if the +:exc:`InvalidOperation` signal is trapped, raise an exception. For example, +``0/0`` returns :const:`NaN` which means "not a number". This variety of +:const:`NaN` is quiet and, once created, will flow through other computations +always resulting in another :const:`NaN`. 
This behavior can be useful for a +series of computations that occasionally have missing inputs --- it allows the +calculation to proceed while flagging specific results as invalid. + +A variant is :const:`sNaN` which signals rather than remaining quiet after every +operation. This is a useful return value when an invalid result needs to +interrupt a calculation for special handling. + +The signed zeros can result from calculations that underflow. They keep the sign +that would have resulted if the calculation had been carried out to greater +precision. Since their magnitude is zero, both positive and negative zeros are +treated as equal and their sign is informational. + +In addition to the two signed zeros which are distinct yet equal, there are +various representations of zero with differing precisions yet equivalent in +value. This takes a bit of getting used to. For an eye accustomed to +normalized floating point representations, it is not immediately obvious that +the following calculation returns a value equal to zero:: + + >>> 1 / Decimal('Infinity') + Decimal("0E-1000000026") + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. _decimal-threads: + +Working with threads +-------------------- + +The :func:`getcontext` function accesses a different :class:`Context` object for +each thread. Having separate thread contexts means that threads may make +changes (such as ``getcontext().prec = 10``) without interfering with other threads. + +Likewise, the :func:`setcontext` function automatically assigns its target to +the current thread. + +If :func:`setcontext` has not been called before :func:`getcontext`, then +:func:`getcontext` will automatically create a new context for use in the +current thread. + +The new context is copied from a prototype context called *DefaultContext*. To +control the defaults so that each thread will use the same values throughout the +application, directly modify the *DefaultContext* object.
This should be done +*before* any threads are started so that there won't be a race condition between +threads calling :func:`getcontext`. For example:: + + # Set applicationwide defaults for all threads about to be launched + DefaultContext.prec = 12 + DefaultContext.rounding = ROUND_DOWN + DefaultContext.traps = ExtendedContext.traps.copy() + DefaultContext.traps[InvalidOperation] = 1 + setcontext(DefaultContext) + + # Afterwards, the threads can be started + t1.start() + t2.start() + t3.start() + . . . + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. _decimal-recipes: + +Recipes +------- + +Here are a few recipes that serve as utility functions and that demonstrate ways +to work with the :class:`Decimal` class:: + + def moneyfmt(value, places=2, curr='', sep=',', dp='.', + pos='', neg='-', trailneg=''): + """Convert Decimal to a money formatted string. + + places: required number of places after the decimal point + curr: optional currency symbol before the sign (may be blank) + sep: optional grouping separator (comma, period, space, or blank) + dp: decimal point indicator (comma or period) + only specify as blank when places is zero + pos: optional sign for positive numbers: '+', space or blank + neg: optional sign for negative numbers: '-', '(', space or blank + trailneg:optional trailing minus indicator: '-', ')', space or blank + + >>> d = Decimal('-1234567.8901') + >>> moneyfmt(d, curr='$') + '-$1,234,567.89' + >>> moneyfmt(d, places=0, sep='.', dp='', neg='', trailneg='-') + '1.234.568-' + >>> moneyfmt(d, curr='$', neg='(', trailneg=')') + '($1,234,567.89)' + >>> moneyfmt(Decimal(123456789), sep=' ') + '123 456 789.00' + >>> moneyfmt(Decimal('-0.02'), neg='<', trailneg='>') + '<.02>' + + """ + q = Decimal((0, (1,), -places)) # 2 places --> '0.01' + sign, digits, exp = value.quantize(q).as_tuple() + assert exp == -places + result = [] + digits = map(str, digits) + build, next = result.append, digits.pop + if sign: + 
build(trailneg) + for i in range(places): + if digits: + build(next()) + else: + build('0') + build(dp) + i = 0 + while digits: + build(next()) + i += 1 + if i == 3 and digits: + i = 0 + build(sep) + build(curr) + if sign: + build(neg) + else: + build(pos) + result.reverse() + return ''.join(result) + + def pi(): + """Compute Pi to the current precision. + + >>> print pi() + 3.141592653589793238462643383 + + """ + getcontext().prec += 2 # extra digits for intermediate steps + three = Decimal(3) # substitute "three=3.0" for regular floats + lasts, t, s, n, na, d, da = 0, three, 3, 1, 0, 0, 24 + while s != lasts: + lasts = s + n, na = n+na, na+8 + d, da = d+da, da+32 + t = (t * n) / d + s += t + getcontext().prec -= 2 + return +s # unary plus applies the new precision + + def exp(x): + """Return e raised to the power of x. Result type matches input type. + + >>> print exp(Decimal(1)) + 2.718281828459045235360287471 + >>> print exp(Decimal(2)) + 7.389056098930650227230427461 + >>> print exp(2.0) + 7.38905609893 + >>> print exp(2+0j) + (7.38905609893+0j) + + """ + getcontext().prec += 2 + i, lasts, s, fact, num = 0, 0, 1, 1, 1 + while s != lasts: + lasts = s + i += 1 + fact *= i + num *= x + s += num / fact + getcontext().prec -= 2 + return +s + + def cos(x): + """Return the cosine of x as measured in radians. + + >>> print cos(Decimal('0.5')) + 0.8775825618903727161162815826 + >>> print cos(0.5) + 0.87758256189 + >>> print cos(0.5+0j) + (0.87758256189+0j) + + """ + getcontext().prec += 2 + i, lasts, s, fact, num, sign = 0, 0, 1, 1, 1, 1 + while s != lasts: + lasts = s + i += 2 + fact *= i * (i-1) + num *= x * x + sign *= -1 + s += num / fact * sign + getcontext().prec -= 2 + return +s + + def sin(x): + """Return the sine of x as measured in radians. 
+ + >>> print sin(Decimal('0.5')) + 0.4794255386042030002732879352 + >>> print sin(0.5) + 0.479425538604 + >>> print sin(0.5+0j) + (0.479425538604+0j) + + """ + getcontext().prec += 2 + i, lasts, s, fact, num, sign = 1, 0, x, 1, x, 1 + while s != lasts: + lasts = s + i += 2 + fact *= i * (i-1) + num *= x * x + sign *= -1 + s += num / fact * sign + getcontext().prec -= 2 + return +s + + +.. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +.. _decimal-faq: + +Decimal FAQ +----------- + +Q. It is cumbersome to type ``decimal.Decimal('1234.5')``. Is there a way to +minimize typing when using the interactive interpreter? + +\A. Some users abbreviate the constructor to just a single letter:: + + >>> D = decimal.Decimal + >>> D('1.23') + D('3.45') + Decimal("4.68") + +Q. In a fixed-point application with two decimal places, some inputs have many +places and need to be rounded. Others are not supposed to have excess digits +and need to be validated. What methods should be used? + +A. The :meth:`quantize` method rounds to a fixed number of decimal places. If +the :const:`Inexact` trap is set, it is also useful for validation:: + + >>> TWOPLACES = Decimal(10) ** -2 # same as Decimal('0.01') + + >>> # Round to two places + >>> Decimal("3.214").quantize(TWOPLACES) + Decimal("3.21") + + >>> # Validate that a number does not exceed two places + >>> Decimal("3.21").quantize(TWOPLACES, context=Context(traps=[Inexact])) + Decimal("3.21") + + >>> Decimal("3.214").quantize(TWOPLACES, context=Context(traps=[Inexact])) + Traceback (most recent call last): + ... + Inexact: Changed in rounding + +Q. Once I have valid two place inputs, how do I maintain that invariant +throughout an application? + +A. Some operations like addition and subtraction automatically preserve fixed +point. Others, like multiplication and division, change the number of decimal +places and need to be followed-up with a :meth:`quantize` step. + +Q. There are many ways to express the same value. 
The numbers :const:`200`, +:const:`200.000`, :const:`2E2`, and :const:`.02E+4` all have the same value at +various precisions. Is there a way to transform them to a single recognizable +canonical value? + +A. The :meth:`normalize` method maps all equivalent values to a single +representative:: + + >>> values = map(Decimal, '200 200.000 2E2 .02E+4'.split()) + >>> [v.normalize() for v in values] + [Decimal("2E+2"), Decimal("2E+2"), Decimal("2E+2"), Decimal("2E+2")] + +Q. Some decimal values always print with exponential notation. Is there a way +to get a non-exponential representation? + +A. For some values, exponential notation is the only way to express the number +of significant places in the coefficient. For example, expressing +:const:`5.0E+3` as :const:`5000` keeps the value constant but cannot show the +original's two-place significance. + +Q. Is there a way to convert a regular float to a :class:`Decimal`? + +A. Yes, all binary floating point numbers can be exactly expressed as a +Decimal. An exact conversion may take more precision than intuition would +suggest, so trapping :const:`Inexact` will signal a need for more precision:: + + def floatToDecimal(f): + "Convert a floating point number to a Decimal with no loss of information" + # Transform (exactly) a float to a mantissa (0.5 <= abs(m) < 1.0) and an + # exponent. Double the mantissa until it is an integer. Use the integer + # mantissa and exponent to compute an equivalent Decimal. If this cannot + # be done exactly, then retry with more precision. + + mantissa, exponent = math.frexp(f) + while mantissa != int(mantissa): + mantissa *= 2.0 + exponent -= 1 + mantissa = int(mantissa) + + oldcontext = getcontext() + setcontext(Context(traps=[Inexact])) + try: + while True: + try: + return mantissa * Decimal(2) ** exponent + except Inexact: + getcontext().prec += 1 + finally: + setcontext(oldcontext) + +Q. Why isn't the :func:`floatToDecimal` routine included in the module? + +A. 
There is some question about whether it is advisable to mix binary and +decimal floating point. Also, its use requires some care to avoid the +representation issues associated with binary floating point:: + + >>> floatToDecimal(1.1) + Decimal("1.100000000000000088817841970012523233890533447265625") + +Q. Within a complex calculation, how can I make sure that I haven't gotten a +spurious result because of insufficient precision or rounding anomalies? + +A. The decimal module makes it easy to test results. A best practice is to +re-run calculations using greater precision and with various rounding modes. +Widely differing results indicate insufficient precision, rounding mode issues, +ill-conditioned inputs, or a numerically unstable algorithm. + +Q. I noticed that context precision is applied to the results of operations but +not to the inputs. Is there anything to watch out for when mixing values of +different precisions? + +A. Yes. The principle is that all values are considered to be exact and so is +the arithmetic on those values. Only the results are rounded. The advantage +for inputs is that "what you type is what you get". 
A disadvantage is that the +results can look odd if you forget that the inputs haven't been rounded:: + + >>> getcontext().prec = 3 + >>> Decimal('3.104') + D('2.104') + Decimal("5.21") + >>> Decimal('3.104') + D('0.000') + D('2.104') + Decimal("5.20") + +The solution is either to increase precision or to force rounding of inputs +using the unary plus operation:: + + >>> getcontext().prec = 3 + >>> +Decimal('1.23456789') # unary plus triggers rounding + Decimal("1.23") + +Alternatively, inputs can be rounded upon creation using the +:meth:`Context.create_decimal` method:: + + >>> Context(prec=5, rounding=ROUND_DOWN).create_decimal('1.2345678') + Decimal("1.2345") + diff --git a/Doc/library/development.rst b/Doc/library/development.rst new file mode 100644 index 0000000..be8c33d --- /dev/null +++ b/Doc/library/development.rst @@ -0,0 +1,22 @@ + +.. _development: + +***************** +Development Tools +***************** + +The modules described in this chapter help you write software. For example, the +:mod:`pydoc` module takes a module and generates documentation based on the +module's contents. The :mod:`doctest` and :mod:`unittest` modules contain +frameworks for writing unit tests that automatically exercise code and verify +that the expected output is produced. + +The list of modules described in this chapter is: + + +.. toctree:: + + pydoc.rst + doctest.rst + unittest.rst + test.rst diff --git a/Doc/library/difflib.rst b/Doc/library/difflib.rst new file mode 100644 index 0000000..95b83e6 --- /dev/null +++ b/Doc/library/difflib.rst @@ -0,0 +1,644 @@ + +:mod:`difflib` --- Helpers for computing deltas +=============================================== + +.. module:: difflib + :synopsis: Helpers for computing differences between objects. +.. moduleauthor:: Tim Peters <tim_one@users.sourceforge.net> +.. sectionauthor:: Tim Peters <tim_one@users.sourceforge.net> + + +.. % LaTeXification by Fred L. Drake, Jr. <fdrake@acm.org>. + +.. versionadded:: 2.1 + + +.. 
class:: SequenceMatcher + + This is a flexible class for comparing pairs of sequences of any type, so long + as the sequence elements are hashable. The basic algorithm predates, and is a + little fancier than, an algorithm published in the late 1980's by Ratcliff and + Obershelp under the hyperbolic name "gestalt pattern matching." The idea is to + find the longest contiguous matching subsequence that contains no "junk" + elements (the Ratcliff and Obershelp algorithm doesn't address junk). The same + idea is then applied recursively to the pieces of the sequences to the left and + to the right of the matching subsequence. This does not yield minimal edit + sequences, but does tend to yield matches that "look right" to people. + + **Timing:** The basic Ratcliff-Obershelp algorithm is cubic time in the worst + case and quadratic time in the expected case. :class:`SequenceMatcher` is + quadratic time for the worst case and has expected-case behavior dependent in a + complicated way on how many elements the sequences have in common; best case + time is linear. + + +.. class:: Differ + + This is a class for comparing sequences of lines of text, and producing + human-readable differences or deltas. Differ uses :class:`SequenceMatcher` + both to compare sequences of lines, and to compare sequences of characters + within similar (near-matching) lines. + + Each line of a :class:`Differ` delta begins with a two-letter code: + + +----------+-------------------------------------------+ + | Code | Meaning | + +==========+===========================================+ + | ``'- '`` | line unique to sequence 1 | + +----------+-------------------------------------------+ + | ``'+ '`` | line unique to sequence 2 | + +----------+-------------------------------------------+ + | ``' '`` | line common to both sequences | + +----------+-------------------------------------------+ + | ``'? 
'`` | line not present in either input sequence | + +----------+-------------------------------------------+ + + Lines beginning with '``?``' attempt to guide the eye to intraline differences, + and were not present in either input sequence. These lines can be confusing if + the sequences contain tab characters. + + +.. class:: HtmlDiff + + This class can be used to create an HTML table (or a complete HTML file + containing the table) showing a side by side, line by line comparison of text + with inter-line and intra-line change highlights. The table can be generated in + either full or contextual difference mode. + + The constructor for this class is: + + + .. function:: __init__([tabsize][, wrapcolumn][, linejunk][, charjunk]) + + Initializes instance of :class:`HtmlDiff`. + + *tabsize* is an optional keyword argument to specify tab stop spacing and + defaults to ``8``. + + *wrapcolumn* is an optional keyword to specify column number where lines are + broken and wrapped, defaults to ``None`` where lines are not wrapped. + + *linejunk* and *charjunk* are optional keyword arguments passed into ``ndiff()`` + (used by :class:`HtmlDiff` to generate the side by side HTML differences). See + ``ndiff()`` documentation for argument default values and descriptions. + + The following methods are public: + + + .. function:: make_file(fromlines, tolines [, fromdesc][, todesc][, context][, numlines]) + + Compares *fromlines* and *tolines* (lists of strings) and returns a string which + is a complete HTML file containing a table showing line by line differences with + inter-line and intra-line changes highlighted. + + *fromdesc* and *todesc* are optional keyword arguments to specify from/to file + column header strings (both default to an empty string). + + *context* and *numlines* are both optional keyword arguments. Set *context* to + ``True`` when contextual differences are to be shown, else the default is + ``False`` to show the full files. *numlines* defaults to ``5``. 
When *context* + is ``True`` *numlines* controls the number of context lines which surround the + difference highlights. When *context* is ``False`` *numlines* controls the + number of lines which are shown before a difference highlight when using the + "next" hyperlinks (setting to zero would cause the "next" hyperlinks to place + the next difference highlight at the top of the browser without any leading + context). + + + .. function:: make_table(fromlines, tolines [, fromdesc][, todesc][, context][, numlines]) + + Compares *fromlines* and *tolines* (lists of strings) and returns a string which + is a complete HTML table showing line by line differences with inter-line and + intra-line changes highlighted. + + The arguments for this method are the same as those for the :meth:`make_file` + method. + + :file:`Tools/scripts/diff.py` is a command-line front-end to this class and + contains a good example of its use. + + .. versionadded:: 2.4 + + +.. function:: context_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm]) + + Compare *a* and *b* (lists of strings); return a delta (a generator generating + the delta lines) in context diff format. + + Context diffs are a compact way of showing just the lines that have changed plus + a few lines of context. The changes are shown in a before/after style. The + number of context lines is set by *n* which defaults to three. + + By default, the diff control lines (those with ``***`` or ``---``) are created + with a trailing newline. This is helpful so that inputs created from + :func:`file.readlines` result in diffs that are suitable for use with + :func:`file.writelines` since both the inputs and outputs have trailing + newlines. + + For inputs that do not have trailing newlines, set the *lineterm* argument to + ``""`` so that the output will be uniformly newline free. + + The context diff format normally has a header for filenames and modification + times. 
Any or all of these may be specified using strings for *fromfile*, + *tofile*, *fromfiledate*, and *tofiledate*. The modification times are normally + expressed in the format returned by :func:`time.ctime`. If not specified, the + strings default to blanks. + + :file:`Tools/scripts/diff.py` is a command-line front-end for this function. + + .. versionadded:: 2.3 + + +.. function:: get_close_matches(word, possibilities[, n][, cutoff]) + + Return a list of the best "good enough" matches. *word* is a sequence for which + close matches are desired (typically a string), and *possibilities* is a list of + sequences against which to match *word* (typically a list of strings). + + Optional argument *n* (default ``3``) is the maximum number of close matches to + return; *n* must be greater than ``0``. + + Optional argument *cutoff* (default ``0.6``) is a float in the range [0, 1]. + Possibilities that don't score at least that similar to *word* are ignored. + + The best (no more than *n*) matches among the possibilities are returned in a + list, sorted by similarity score, most similar first. :: + + >>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy']) + ['apple', 'ape'] + >>> import keyword + >>> get_close_matches('wheel', keyword.kwlist) + ['while'] + >>> get_close_matches('apple', keyword.kwlist) + [] + >>> get_close_matches('accept', keyword.kwlist) + ['except'] + + +.. function:: ndiff(a, b[, linejunk][, charjunk]) + + Compare *a* and *b* (lists of strings); return a :class:`Differ`\ -style delta + (a generator generating the delta lines). + + Optional keyword parameters *linejunk* and *charjunk* are for filter functions + (or ``None``): + + *linejunk*: A function that accepts a single string argument, and returns true + if the string is junk, or false if not. The default is (``None``), starting with + Python 2.3. 
Before then, the default was the module-level function + :func:`IS_LINE_JUNK`, which filters out lines without visible characters, except + for at most one pound character (``'#'``). As of Python 2.3, the underlying + :class:`SequenceMatcher` class does a dynamic analysis of which lines are so + frequent as to constitute noise, and this usually works better than the pre-2.3 + default. + + *charjunk*: A function that accepts a character (a string of length 1), and + returns true if the character is junk, or false if not. The default is the module-level + function :func:`IS_CHARACTER_JUNK`, which filters out whitespace characters (a + blank or tab; note: bad idea to include newline in this!). + + :file:`Tools/scripts/ndiff.py` is a command-line front-end to this function. :: + + >>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1), + ... 'ore\ntree\nemu\n'.splitlines(1)) + >>> print ''.join(diff), + - one + ? ^ + + ore + ? ^ + - two + - three + ? - + + tree + + emu + + +.. function:: restore(sequence, which) + + Return one of the two sequences that generated a delta. + + Given a *sequence* produced by :meth:`Differ.compare` or :func:`ndiff`, extract + lines originating from file 1 or 2 (parameter *which*), stripping off line + prefixes. + + Example:: + + >>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1), + ... 'ore\ntree\nemu\n'.splitlines(1)) + >>> diff = list(diff) # materialize the generated delta into a list + >>> print ''.join(restore(diff, 1)), + one + two + three + >>> print ''.join(restore(diff, 2)), + ore + tree + emu + + +.. function:: unified_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm]) + + Compare *a* and *b* (lists of strings); return a delta (a generator generating + the delta lines) in unified diff format. + + Unified diffs are a compact way of showing just the lines that have changed plus + a few lines of context. The changes are shown in an inline style (instead of + separate before/after blocks). 
The number of context lines is set by *n* which + defaults to three. + + By default, the diff control lines (those with ``---``, ``+++``, or ``@@``) are + created with a trailing newline. This is helpful so that inputs created from + :func:`file.readlines` result in diffs that are suitable for use with + :func:`file.writelines` since both the inputs and outputs have trailing + newlines. + + For inputs that do not have trailing newlines, set the *lineterm* argument to + ``""`` so that the output will be uniformly newline free. + + The unified diff format normally has a header for filenames and modification + times. Any or all of these may be specified using strings for *fromfile*, + *tofile*, *fromfiledate*, and *tofiledate*. The modification times are normally + expressed in the format returned by :func:`time.ctime`. If not specified, the + strings default to blanks. + + :file:`Tools/scripts/diff.py` is a command-line front-end for this function. + + .. versionadded:: 2.3 + + +.. function:: IS_LINE_JUNK(line) + + Return true for ignorable lines. The line *line* is ignorable if *line* is + blank or contains a single ``'#'``, otherwise it is not ignorable. Used as a + default for parameter *linejunk* in :func:`ndiff` before Python 2.3. + + +.. function:: IS_CHARACTER_JUNK(ch) + + Return true for ignorable characters. The character *ch* is ignorable if *ch* + is a space or tab, otherwise it is not ignorable. Used as a default for + parameter *charjunk* in :func:`ndiff`. + + +.. seealso:: + + `Pattern Matching: The Gestalt Approach <http://www.ddj.com/184407970?pgno=5>`_ + Discussion of a similar algorithm by John W. Ratcliff and D. E. Metzener. This + was published in `Dr. Dobb's Journal <http://www.ddj.com/>`_ in July, 1988. + + +.. _sequence-matcher: + +SequenceMatcher Objects +----------------------- + +The :class:`SequenceMatcher` class has this constructor: + + +.. 
class:: SequenceMatcher([isjunk[, a[, b]]]) + + Optional argument *isjunk* must be ``None`` (the default) or a one-argument + function that takes a sequence element and returns true if and only if the + element is "junk" and should be ignored. Passing ``None`` for *isjunk* is + equivalent to passing ``lambda x: 0``; in other words, no elements are ignored. + For example, pass:: + + lambda x: x in " \t" + + if you're comparing lines as sequences of characters, and don't want to synch up + on blanks or hard tabs. + + The optional arguments *a* and *b* are sequences to be compared; both default to + empty strings. The elements of both sequences must be hashable. + +:class:`SequenceMatcher` objects have the following methods: + + +.. method:: SequenceMatcher.set_seqs(a, b) + + Set the two sequences to be compared. + +:class:`SequenceMatcher` computes and caches detailed information about the +second sequence, so if you want to compare one sequence against many sequences, +use :meth:`set_seq2` to set the commonly used sequence once and call +:meth:`set_seq1` repeatedly, once for each of the other sequences. + + +.. method:: SequenceMatcher.set_seq1(a) + + Set the first sequence to be compared. The second sequence to be compared is + not changed. + + +.. method:: SequenceMatcher.set_seq2(b) + + Set the second sequence to be compared. The first sequence to be compared is + not changed. + + +.. method:: SequenceMatcher.find_longest_match(alo, ahi, blo, bhi) + + Find longest matching block in ``a[alo:ahi]`` and ``b[blo:bhi]``. + + If *isjunk* was omitted or ``None``, :meth:`find_longest_match` returns ``(i, j, + k)`` such that ``a[i:i+k]`` is equal to ``b[j:j+k]``, where ``alo <= i <= i+k <= + ahi`` and ``blo <= j <= j+k <= bhi``. For all ``(i', j', k')`` meeting those + conditions, the additional conditions ``k >= k'``, ``i <= i'``, and if ``i == + i'``, ``j <= j'`` are also met. 
In other words, of all maximal matching blocks, + return one that starts earliest in *a*, and of all those maximal matching blocks + that start earliest in *a*, return the one that starts earliest in *b*. :: + + >>> s = SequenceMatcher(None, " abcd", "abcd abcd") + >>> s.find_longest_match(0, 5, 0, 9) + (0, 4, 5) + + If *isjunk* was provided, first the longest matching block is determined as + above, but with the additional restriction that no junk element appears in the + block. Then that block is extended as far as possible by matching (only) junk + elements on both sides. So the resulting block never matches on junk except as + identical junk happens to be adjacent to an interesting match. + + Here's the same example as before, but considering blanks to be junk. That + prevents ``' abcd'`` from matching the ``' abcd'`` at the tail end of the second + sequence directly. Instead only the ``'abcd'`` can match, and matches the + leftmost ``'abcd'`` in the second sequence:: + + >>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd") + >>> s.find_longest_match(0, 5, 0, 9) + (1, 0, 4) + + If no blocks match, this returns ``(alo, blo, 0)``. + + +.. method:: SequenceMatcher.get_matching_blocks() + + Return list of triples describing matching subsequences. Each triple is of the + form ``(i, j, n)``, and means that ``a[i:i+n] == b[j:j+n]``. The triples are + monotonically increasing in *i* and *j*. + + The last triple is a dummy, and has the value ``(len(a), len(b), 0)``. It is + the only triple with ``n == 0``. If ``(i, j, n)`` and ``(i', j', n')`` are + adjacent triples in the list, and the second is not the last triple in the list, + then ``i+n != i'`` or ``j+n != j'``; in other words, adjacent triples always + describe non-adjacent equal blocks. + + .. % Explain why a dummy is used! + + .. versionchanged:: 2.5 + The guarantee that adjacent triples always describe non-adjacent blocks was + implemented. 
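The guarantees described for :meth:`get_matching_blocks` can be checked directly. The following is a minimal sketch, assuming a Python 3 interpreter (where each triple comes back as a named tuple that still unpacks as ``(i, j, n)``); the variable names ``blocks``, ``last``, ``real`` and ``non_adjacent`` are illustrative only:

```python
from difflib import SequenceMatcher

a, b = "abxcd", "abcd"
blocks = SequenceMatcher(None, a, b).get_matching_blocks()

# Every triple (i, j, n) names a pair of equal slices.
for i, j, n in blocks:
    assert a[i:i + n] == b[j:j + n]

# The final triple is the zero-length dummy (len(a), len(b), 0).
last = tuple(blocks[-1])

# Leaving the dummy aside, consecutive blocks are never adjacent
# in both sequences at the same time.
real = blocks[:-1]
non_adjacent = all(
    (i1 + n1 != i2) or (j1 + n1 != j2)
    for (i1, j1, n1), (i2, j2, _) in zip(real, real[1:])
)
```

Because the dummy triple always closes the list, code that walks consecutive pairs of blocks can treat the ends of both sequences like any other block boundary.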
+ + :: + + >>> s = SequenceMatcher(None, "abxcd", "abcd") + >>> s.get_matching_blocks() + [(0, 0, 2), (3, 2, 2), (5, 4, 0)] + + +.. method:: SequenceMatcher.get_opcodes() + + Return list of 5-tuples describing how to turn *a* into *b*. Each tuple is of + the form ``(tag, i1, i2, j1, j2)``. The first tuple has ``i1 == j1 == 0``, and + remaining tuples have *i1* equal to the *i2* from the preceding tuple, and, + likewise, *j1* equal to the previous *j2*. + + The *tag* values are strings, with these meanings: + + +---------------+---------------------------------------------+ + | Value | Meaning | + +===============+=============================================+ + | ``'replace'`` | ``a[i1:i2]`` should be replaced by | + | | ``b[j1:j2]``. | + +---------------+---------------------------------------------+ + | ``'delete'`` | ``a[i1:i2]`` should be deleted. Note that | + | | ``j1 == j2`` in this case. | + +---------------+---------------------------------------------+ + | ``'insert'`` | ``b[j1:j2]`` should be inserted at | + | | ``a[i1:i1]``. Note that ``i1 == i2`` in | + | | this case. | + +---------------+---------------------------------------------+ + | ``'equal'`` | ``a[i1:i2] == b[j1:j2]`` (the sub-sequences | + | | are equal). | + +---------------+---------------------------------------------+ + + For example:: + + >>> a = "qabxcd" + >>> b = "abycdf" + >>> s = SequenceMatcher(None, a, b) + >>> for tag, i1, i2, j1, j2 in s.get_opcodes(): + ... print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" % + ... (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2])) + delete a[0:1] (q) b[0:0] () + equal a[1:3] (ab) b[0:2] (ab) + replace a[3:4] (x) b[2:3] (y) + equal a[4:6] (cd) b[3:5] (cd) + insert a[6:6] () b[5:6] (f) + + +.. method:: SequenceMatcher.get_grouped_opcodes([n]) + + Return a generator of groups with up to *n* lines of context. 
+ + Starting with the groups returned by :meth:`get_opcodes`, this method splits out + smaller change clusters and eliminates intervening ranges which have no changes. + + The groups are returned in the same format as :meth:`get_opcodes`. + + .. versionadded:: 2.3 + + +.. method:: SequenceMatcher.ratio() + + Return a measure of the sequences' similarity as a float in the range [0, 1]. + + Where T is the total number of elements in both sequences, and M is the number + of matches, this is 2.0\*M / T. Note that this is ``1.0`` if the sequences are + identical, and ``0.0`` if they have nothing in common. + + This is expensive to compute if :meth:`get_matching_blocks` or + :meth:`get_opcodes` hasn't already been called, in which case you may want to + try :meth:`quick_ratio` or :meth:`real_quick_ratio` first to get an upper bound. + + +.. method:: SequenceMatcher.quick_ratio() + + Return an upper bound on :meth:`ratio` relatively quickly. + + This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is + faster to compute. + + +.. method:: SequenceMatcher.real_quick_ratio() + + Return an upper bound on :meth:`ratio` very quickly. + + This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is + faster to compute than either :meth:`ratio` or :meth:`quick_ratio`. + +The three methods that return the ratio of matching to total characters can give +different results due to differing levels of approximation, although +:meth:`quick_ratio` and :meth:`real_quick_ratio` are always at least as large as +:meth:`ratio`:: + + >>> s = SequenceMatcher(None, "abcd", "bcde") + >>> s.ratio() + 0.75 + >>> s.quick_ratio() + 0.75 + >>> s.real_quick_ratio() + 1.0 + + +.. _sequencematcher-examples: + +SequenceMatcher Examples +------------------------ + +This example compares two strings, considering blanks to be "junk:" :: + + >>> s = SequenceMatcher(lambda x: x == " ", + ... "private Thread currentThread;", + ... 
"private volatile Thread currentThread;") + +:meth:`ratio` returns a float in [0, 1], measuring the similarity of the +sequences. As a rule of thumb, a :meth:`ratio` value over 0.6 means the +sequences are close matches:: + + >>> print round(s.ratio(), 3) + 0.866 + +If you're only interested in where the sequences match, +:meth:`get_matching_blocks` is handy:: + + >>> for block in s.get_matching_blocks(): + ... print "a[%d] and b[%d] match for %d elements" % block + a[0] and b[0] match for 8 elements + a[8] and b[17] match for 6 elements + a[14] and b[23] match for 15 elements + a[29] and b[38] match for 0 elements + +Note that the last tuple returned by :meth:`get_matching_blocks` is always a +dummy, ``(len(a), len(b), 0)``, and this is the only case in which the last +tuple element (number of elements matched) is ``0``. + +If you want to know how to change the first sequence into the second, use +:meth:`get_opcodes`:: + + >>> for opcode in s.get_opcodes(): + ... print "%6s a[%d:%d] b[%d:%d]" % opcode + equal a[0:8] b[0:8] + insert a[8:8] b[8:17] + equal a[8:14] b[17:23] + equal a[14:29] b[23:38] + +See also the function :func:`get_close_matches` in this module, which shows how +simple code building on :class:`SequenceMatcher` can be used to do useful work. + + +.. _differ-objects: + +Differ Objects +-------------- + +Note that :class:`Differ`\ -generated deltas make no claim to be **minimal** +diffs. To the contrary, minimal diffs are often counter-intuitive, because they +synch up anywhere possible, sometimes accidental matches 100 pages apart. +Restricting synch points to contiguous matches preserves some notion of +locality, at the occasional cost of producing a longer diff. + +The :class:`Differ` class has this constructor: + + +.. 
class:: Differ([linejunk[, charjunk]]) + + Optional keyword parameters *linejunk* and *charjunk* are for filter functions + (or ``None``): + + *linejunk*: A function that accepts a single string argument, and returns true + if the string is junk. The default is ``None``, meaning that no line is + considered junk. + + *charjunk*: A function that accepts a single character argument (a string of + length 1), and returns true if the character is junk. The default is ``None``, + meaning that no character is considered junk. + +:class:`Differ` objects are used (deltas generated) via a single method: + + +.. method:: Differ.compare(a, b) + + Compare two sequences of lines, and generate the delta (a sequence of lines). + + Each sequence must contain individual single-line strings ending with newlines. + Such sequences can be obtained from the :meth:`readlines` method of file-like + objects. The delta generated also consists of newline-terminated strings, ready + to be printed as-is via the :meth:`writelines` method of a file-like object. + + +.. _differ-examples: + +Differ Example +-------------- + +This example compares two texts. First we set up the texts, sequences of +individual single-line strings ending with newlines (such sequences can also be +obtained from the :meth:`readlines` method of file-like objects):: + + >>> text1 = ''' 1. Beautiful is better than ugly. + ... 2. Explicit is better than implicit. + ... 3. Simple is better than complex. + ... 4. Complex is better than complicated. + ... '''.splitlines(1) + >>> len(text1) + 4 + >>> text1[0][-1] + '\n' + >>> text2 = ''' 1. Beautiful is better than ugly. + ... 3. Simple is better than complex. + ... 4. Complicated is better than complex. + ... 5. Flat is better than nested. + ... '''.splitlines(1) + +Next we instantiate a Differ object:: + + >>> d = Differ() + +Note that when instantiating a :class:`Differ` object we may pass functions to +filter out line and character "junk." 
See the :meth:`Differ` constructor for +details. + +Finally, we compare the two:: + + >>> result = list(d.compare(text1, text2)) + +``result`` is a list of strings, so let's pretty-print it:: + + >>> from pprint import pprint + >>> pprint(result) + [' 1. Beautiful is better than ugly.\n', + '- 2. Explicit is better than implicit.\n', + '- 3. Simple is better than complex.\n', + '+ 3. Simple is better than complex.\n', + '? ++ \n', + '- 4. Complex is better than complicated.\n', + '? ^ ---- ^ \n', + '+ 4. Complicated is better than complex.\n', + '? ++++ ^ ^ \n', + '+ 5. Flat is better than nested.\n'] + +As a single multi-line string it looks like this:: + + >>> import sys + >>> sys.stdout.writelines(result) + 1. Beautiful is better than ugly. + - 2. Explicit is better than implicit. + - 3. Simple is better than complex. + + 3. Simple is better than complex. + ? ++ + - 4. Complex is better than complicated. + ? ^ ---- ^ + + 4. Complicated is better than complex. + ? ++++ ^ ^ + + 5. Flat is better than nested. + diff --git a/Doc/library/dircache.rst b/Doc/library/dircache.rst new file mode 100644 index 0000000..28aa667 --- /dev/null +++ b/Doc/library/dircache.rst @@ -0,0 +1,56 @@ + +:mod:`dircache` --- Cached directory listings +============================================= + +.. module:: dircache + :synopsis: Return directory listing, with cache mechanism. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`dircache` module defines a function for reading directory listing +using a cache, and cache invalidation using the *mtime* of the directory. +Additionally, it defines a function to annotate directories by appending a +slash. + +The :mod:`dircache` module defines the following functions: + + +.. function:: reset() + + Resets the directory cache. + + +.. function:: listdir(path) + + Return a directory listing of *path*, as gotten from :func:`os.listdir`. 
Note + that unless *path* changes, further calls to :func:`listdir` will not re-read the + directory structure. + + Note that the list returned should be regarded as read-only. (Perhaps a future + version should change it to return a tuple?) + + +.. function:: opendir(path) + + Same as :func:`listdir`. Defined for backwards compatibility. + + +.. function:: annotate(head, list) + + Assume *list* is a list of paths relative to *head*, and append, in place, a + ``'/'`` to each path which points to a directory. + +:: + + >>> import dircache + >>> a = dircache.listdir('/') + >>> a = a[:] # Copy the return value so we can change 'a' + >>> a + ['bin', 'boot', 'cdrom', 'dev', 'etc', 'floppy', 'home', 'initrd', 'lib', + 'lost+found', 'mnt', 'proc', 'root', 'sbin', 'tmp', 'usr', 'var', 'vmlinuz'] + >>> dircache.annotate('/', a) + >>> a + ['bin/', 'boot/', 'cdrom/', 'dev/', 'etc/', 'floppy/', 'home/', 'initrd/', + 'lib/', 'lost+found/', 'mnt/', 'proc/', 'root/', 'sbin/', 'tmp/', 'usr/', 'var/', + 'vmlinuz'] + diff --git a/Doc/library/dis.rst b/Doc/library/dis.rst new file mode 100644 index 0000000..5f28473 --- /dev/null +++ b/Doc/library/dis.rst @@ -0,0 +1,775 @@ + +:mod:`dis` --- Disassembler for Python byte code +================================================ + +.. module:: dis + :synopsis: Disassembler for Python byte code. + + +The :mod:`dis` module supports the analysis of Python byte code by disassembling +it. Since there is no Python assembler, this module defines the Python assembly +language. The Python byte code which this module takes as an input is defined +in the file :file:`Include/opcode.h` and used by the compiler and the +interpreter. + +Example: Given the function :func:`myfunc`:: + + def myfunc(alist): + return len(alist) + +the following command can be used to get the disassembly of :func:`myfunc`:: + + >>> dis.dis(myfunc) + 2 0 LOAD_GLOBAL 0 (len) + 3 LOAD_FAST 0 (alist) + 6 CALL_FUNCTION 1 + 9 RETURN_VALUE + +(The "2" is a line number).
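The exact instructions and offsets differ between interpreter versions, so the listing above is best read as illustrative. The following is a minimal sketch, assuming a modern Python 3 interpreter in which :func:`dis.dis` accepts a ``file`` argument (an assumption that does not hold for the release this text was written for). It captures the disassembly as text instead of printing it, and checks that ``opname`` and ``opmap`` are inverse lookups:

```python
import dis
import io


def myfunc(alist):
    return len(alist)


# Capture the disassembly in a string buffer instead of writing to stdout.
# Assumption: the ``file`` keyword is available (modern Python 3 only).
buffer = io.StringIO()
dis.dis(myfunc, file=buffer)
listing = buffer.getvalue()

# opmap goes from operation name to byte code; opname goes back again.
nop_code = dis.opmap["NOP"]
nop_name = dis.opname[nop_code]

print(listing)
```

Only the names ``len`` and ``alist`` are stable across versions in the captured text; the opcode names around them should not be relied upon.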
+ +The :mod:`dis` module defines the following functions and constants: + + +.. function:: dis([bytesource]) + + Disassemble the *bytesource* object. *bytesource* can denote either a module, a + class, a method, a function, or a code object. For a module, it disassembles + all functions. For a class, it disassembles all methods. For a single code + sequence, it prints one line per byte code instruction. If no object is + provided, it disassembles the last traceback. + + +.. function:: distb([tb]) + + Disassembles the top-of-stack function of a traceback, using the last traceback + if none was passed. The instruction causing the exception is indicated. + + +.. function:: disassemble(code[, lasti]) + + Disassembles a code object, indicating the last instruction if *lasti* was + provided. The output is divided into the following columns: + + #. the line number, for the first instruction of each line, + #. the current instruction, indicated as ``-->``, + #. a labelled instruction, indicated with ``>>``, + #. the address of the instruction, + #. the operation code name, + #. operation parameters, and + #. interpretation of the parameters in parentheses. + + The parameter interpretation recognizes local and global variable names, + constant values, branch targets, and compare operators. + + +.. function:: disco(code[, lasti]) + + A synonym for :func:`disassemble`. It is more convenient to type, and kept for + compatibility with earlier Python releases. + + +.. data:: opname + + Sequence of operation names, indexable using the byte code. + + +.. data:: opmap + + Dictionary mapping operation names to byte codes. + + +.. data:: cmp_op + + Sequence of all compare operation names. + + +.. data:: hasconst + + Sequence of byte codes that have a constant parameter. + + +.. data:: hasfree + + Sequence of byte codes that access a free variable. + + +.. data:: hasname + + Sequence of byte codes that access an attribute by name. + + +..
data:: hasjrel + + Sequence of byte codes that have a relative jump target. + + +.. data:: hasjabs + + Sequence of byte codes that have an absolute jump target. + + +.. data:: haslocal + + Sequence of byte codes that access a local variable. + + +.. data:: hascompare + + Sequence of byte codes of Boolean operations. + + +.. _bytecodes: + +Python Byte Code Instructions +----------------------------- + +The Python compiler currently generates the following byte code instructions. + + +.. opcode:: STOP_CODE () + + Indicates end-of-code to the compiler, not used by the interpreter. + + +.. opcode:: NOP () + + Do nothing code. Used as a placeholder by the bytecode optimizer. + + +.. opcode:: POP_TOP () + + Removes the top-of-stack (TOS) item. + + +.. opcode:: ROT_TWO () + + Swaps the two top-most stack items. + + +.. opcode:: ROT_THREE () + + Lifts second and third stack items one position up, moves top down to position + three. + + +.. opcode:: ROT_FOUR () + + Lifts second, third and fourth stack items one position up, moves top down to + position four. + + +.. opcode:: DUP_TOP () + + Duplicates the reference on top of the stack. + +Unary operations take the top of the stack, apply the operation, and push the +result back on the stack. + + +.. opcode:: UNARY_POSITIVE () + + Implements ``TOS = +TOS``. + + +.. opcode:: UNARY_NEGATIVE () + + Implements ``TOS = -TOS``. + + +.. opcode:: UNARY_NOT () + + Implements ``TOS = not TOS``. + + +.. opcode:: UNARY_INVERT () + + Implements ``TOS = ~TOS``. + + +.. opcode:: GET_ITER () + + Implements ``TOS = iter(TOS)``. + +Binary operations remove the top of the stack (TOS) and the second top-most +stack item (TOS1) from the stack. They perform the operation, and put the +result back on the stack. + + +.. opcode:: BINARY_POWER () + + Implements ``TOS = TOS1 ** TOS``. + + +.. opcode:: BINARY_MULTIPLY () + + Implements ``TOS = TOS1 * TOS``. + + +.. opcode:: BINARY_FLOOR_DIVIDE () + + Implements ``TOS = TOS1 // TOS``. + + +..
opcode:: BINARY_TRUE_DIVIDE () + + Implements ``TOS = TOS1 / TOS`` when ``from __future__ import division`` is in + effect. + + +.. opcode:: BINARY_MODULO () + + Implements ``TOS = TOS1 % TOS``. + + +.. opcode:: BINARY_ADD () + + Implements ``TOS = TOS1 + TOS``. + + +.. opcode:: BINARY_SUBTRACT () + + Implements ``TOS = TOS1 - TOS``. + + +.. opcode:: BINARY_SUBSCR () + + Implements ``TOS = TOS1[TOS]``. + + +.. opcode:: BINARY_LSHIFT () + + Implements ``TOS = TOS1 << TOS``. + + +.. opcode:: BINARY_RSHIFT () + + Implements ``TOS = TOS1 >> TOS``. + + +.. opcode:: BINARY_AND () + + Implements ``TOS = TOS1 & TOS``. + + +.. opcode:: BINARY_XOR () + + Implements ``TOS = TOS1 ^ TOS``. + + +.. opcode:: BINARY_OR () + + Implements ``TOS = TOS1 | TOS``. + +In-place operations are like binary operations, in that they remove TOS and +TOS1, and push the result back on the stack, but the operation is done in-place +when TOS1 supports it, and the resulting TOS may be (but does not have to be) +the original TOS1. + + +.. opcode:: INPLACE_POWER () + + Implements in-place ``TOS = TOS1 ** TOS``. + + +.. opcode:: INPLACE_MULTIPLY () + + Implements in-place ``TOS = TOS1 * TOS``. + + +.. opcode:: INPLACE_FLOOR_DIVIDE () + + Implements in-place ``TOS = TOS1 // TOS``. + + +.. opcode:: INPLACE_TRUE_DIVIDE () + + Implements in-place ``TOS = TOS1 / TOS`` when ``from __future__ import + division`` is in effect. + + +.. opcode:: INPLACE_MODULO () + + Implements in-place ``TOS = TOS1 % TOS``. + + +.. opcode:: INPLACE_ADD () + + Implements in-place ``TOS = TOS1 + TOS``. + + +.. opcode:: INPLACE_SUBTRACT () + + Implements in-place ``TOS = TOS1 - TOS``. + + +.. opcode:: INPLACE_LSHIFT () + + Implements in-place ``TOS = TOS1 << TOS``. + + +.. opcode:: INPLACE_RSHIFT () + + Implements in-place ``TOS = TOS1 >> TOS``. + + +.. opcode:: INPLACE_AND () + + Implements in-place ``TOS = TOS1 & TOS``. + + +.. opcode:: INPLACE_XOR () + + Implements in-place ``TOS = TOS1 ^ TOS``. + + +.. 
opcode:: INPLACE_OR () + + Implements in-place ``TOS = TOS1 | TOS``. + +The slice opcodes take up to three parameters. + + +.. opcode:: SLICE+0 () + + Implements ``TOS = TOS[:]``. + + +.. opcode:: SLICE+1 () + + Implements ``TOS = TOS1[TOS:]``. + + +.. opcode:: SLICE+2 () + + Implements ``TOS = TOS1[:TOS]``. + + +.. opcode:: SLICE+3 () + + Implements ``TOS = TOS2[TOS1:TOS]``. + +Slice assignment needs one additional parameter. Like any statement, it puts +nothing on the stack. + + +.. opcode:: STORE_SLICE+0 () + + Implements ``TOS[:] = TOS1``. + + +.. opcode:: STORE_SLICE+1 () + + Implements ``TOS1[TOS:] = TOS2``. + + +.. opcode:: STORE_SLICE+2 () + + Implements ``TOS1[:TOS] = TOS2``. + + +.. opcode:: STORE_SLICE+3 () + + Implements ``TOS2[TOS1:TOS] = TOS3``. + + +.. opcode:: DELETE_SLICE+0 () + + Implements ``del TOS[:]``. + + +.. opcode:: DELETE_SLICE+1 () + + Implements ``del TOS1[TOS:]``. + + +.. opcode:: DELETE_SLICE+2 () + + Implements ``del TOS1[:TOS]``. + + +.. opcode:: DELETE_SLICE+3 () + + Implements ``del TOS2[TOS1:TOS]``. + + +.. opcode:: STORE_SUBSCR () + + Implements ``TOS1[TOS] = TOS2``. + + +.. opcode:: DELETE_SUBSCR () + + Implements ``del TOS1[TOS]``. + +Miscellaneous opcodes. + + +.. opcode:: PRINT_EXPR () + + Implements the expression statement for the interactive mode. TOS is removed + from the stack and printed. In non-interactive mode, an expression statement is + terminated with ``POP_TOP``. + + +.. opcode:: BREAK_LOOP () + + Terminates a loop due to a :keyword:`break` statement. + + +.. opcode:: CONTINUE_LOOP (target) + + Continues a loop due to a :keyword:`continue` statement. *target* is the + address to jump to (which should be a ``FOR_ITER`` instruction). + + +.. opcode:: SET_ADD () + + Calls ``set.add(TOS1, TOS)``. Used to implement set comprehensions. + + +.. opcode:: LIST_APPEND () + + Calls ``list.append(TOS1, TOS)``. Used to implement list comprehensions. + + +..
opcode:: LOAD_LOCALS () + + Pushes a reference to the locals of the current scope on the stack. This is used + in the code for a class definition: After the class body is evaluated, the + locals are passed to the class definition. + + +.. opcode:: RETURN_VALUE () + + Returns with TOS to the caller of the function. + + +.. opcode:: YIELD_VALUE () + + Pops ``TOS`` and yields it from a generator. + + +.. opcode:: IMPORT_STAR () + + Loads all symbols not starting with ``'_'`` directly from the module TOS to the + local namespace. The module is popped after loading all names. This opcode + implements ``from module import *``. + + +.. opcode:: POP_BLOCK () + + Removes one block from the block stack. Per frame, there is a stack of blocks, + denoting nested loops, try statements, and such. + + +.. opcode:: END_FINALLY () + + Terminates a :keyword:`finally` clause. The interpreter recalls whether the + exception has to be re-raised, or whether the function returns, and continues + with the outer-next block. + + +.. opcode:: BUILD_CLASS () + + Creates a new class object. TOS is the methods dictionary, TOS1 the tuple of + the names of the base classes, and TOS2 the class name. + +All of the following opcodes expect arguments. An argument is two bytes, with +the more significant byte last. + + +.. opcode:: STORE_NAME (namei) + + Implements ``name = TOS``. *namei* is the index of *name* in the attribute + :attr:`co_names` of the code object. The compiler tries to use ``STORE_FAST`` + or ``STORE_GLOBAL`` if possible. + + +.. opcode:: DELETE_NAME (namei) + + Implements ``del name``, where *namei* is the index into the :attr:`co_names` + attribute of the code object. + + +.. opcode:: UNPACK_SEQUENCE (count) + + Unpacks TOS into *count* individual values, which are put onto the stack + right-to-left. + +.. % \begin{opcodedesc}{UNPACK_LIST}{count} +.. % This opcode is obsolete. +.. % \end{opcodedesc} +.. % \begin{opcodedesc}{UNPACK_ARG}{count} +.. % This opcode is obsolete. +..
% \end{opcodedesc} + + +.. opcode:: DUP_TOPX (count) + + Duplicate *count* items, keeping them in the same order. Due to implementation + limits, *count* should be between 1 and 5 inclusive. + + +.. opcode:: STORE_ATTR (namei) + + Implements ``TOS.name = TOS1``, where *namei* is the index of name in + :attr:`co_names`. + + +.. opcode:: DELETE_ATTR (namei) + + Implements ``del TOS.name``, using *namei* as index into :attr:`co_names`. + + +.. opcode:: STORE_GLOBAL (namei) + + Works as ``STORE_NAME``, but stores the name as a global. + + +.. opcode:: DELETE_GLOBAL (namei) + + Works as ``DELETE_NAME``, but deletes a global name. + +.. % \begin{opcodedesc}{UNPACK_VARARG}{argc} +.. % This opcode is obsolete. +.. % \end{opcodedesc} + + +.. opcode:: LOAD_CONST (consti) + + Pushes ``co_consts[consti]`` onto the stack. + + +.. opcode:: LOAD_NAME (namei) + + Pushes the value associated with ``co_names[namei]`` onto the stack. + + +.. opcode:: BUILD_TUPLE (count) + + Creates a tuple consuming *count* items from the stack, and pushes the resulting + tuple onto the stack. + + +.. opcode:: BUILD_LIST (count) + + Works as ``BUILD_TUPLE``, but creates a list. + + +.. opcode:: BUILD_SET (count) + + Works as ``BUILD_TUPLE``, but creates a set. + + +.. opcode:: BUILD_MAP (zero) + + Pushes a new empty dictionary object onto the stack. The argument is ignored + and set to zero by the compiler. + + +.. opcode:: LOAD_ATTR (namei) + + Replaces TOS with ``getattr(TOS, co_names[namei])``. + + +.. opcode:: COMPARE_OP (opname) + + Performs a Boolean operation. The operation name can be found in + ``cmp_op[opname]``. + + +.. opcode:: IMPORT_NAME (namei) + + Imports the module ``co_names[namei]``. The module object is pushed onto the + stack. The current namespace is not affected: for a proper import statement, a + subsequent ``STORE_FAST`` instruction modifies the namespace. + + +.. opcode:: IMPORT_FROM (namei) + + Loads the attribute ``co_names[namei]`` from the module found in TOS. 
The + resulting object is pushed onto the stack, to be subsequently stored by a + ``STORE_FAST`` instruction. + + +.. opcode:: JUMP_FORWARD (delta) + + Increments byte code counter by *delta*. + + +.. opcode:: JUMP_IF_TRUE (delta) + + If TOS is true, increment the byte code counter by *delta*. TOS is left on the + stack. + + +.. opcode:: JUMP_IF_FALSE (delta) + + If TOS is false, increment the byte code counter by *delta*. TOS is not + changed. + + +.. opcode:: JUMP_ABSOLUTE (target) + + Set byte code counter to *target*. + + +.. opcode:: FOR_ITER (delta) + + ``TOS`` is an iterator. Call its :meth:`__next__` method. If this yields a new + value, push it on the stack (leaving the iterator below it). If the iterator + indicates it is exhausted ``TOS`` is popped, and the byte code counter is + incremented by *delta*. + +.. % \begin{opcodedesc}{FOR_LOOP}{delta} +.. % This opcode is obsolete. +.. % \end{opcodedesc} +.. % \begin{opcodedesc}{LOAD_LOCAL}{namei} +.. % This opcode is obsolete. +.. % \end{opcodedesc} + + +.. opcode:: LOAD_GLOBAL (namei) + + Loads the global named ``co_names[namei]`` onto the stack. + +.. % \begin{opcodedesc}{SET_FUNC_ARGS}{argc} +.. % This opcode is obsolete. +.. % \end{opcodedesc} + + +.. opcode:: SETUP_LOOP (delta) + + Pushes a block for a loop onto the block stack. The block spans from the + current instruction with a size of *delta* bytes. + + +.. opcode:: SETUP_EXCEPT (delta) + + Pushes a try block from a try-except clause onto the block stack. *delta* points + to the first except block. + + +.. opcode:: SETUP_FINALLY (delta) + + Pushes a try block from a try-except clause onto the block stack. *delta* points + to the finally block. + + +.. opcode:: LOAD_FAST (var_num) + + Pushes a reference to the local ``co_varnames[var_num]`` onto the stack. + + +.. opcode:: STORE_FAST (var_num) + + Stores TOS into the local ``co_varnames[var_num]``. + + +.. opcode:: DELETE_FAST (var_num) + + Deletes local ``co_varnames[var_num]``. + + +.. 
opcode:: LOAD_CLOSURE (i) + + Pushes a reference to the cell contained in slot *i* of the cell and free + variable storage. The name of the variable is ``co_cellvars[i]`` if *i* is + less than the length of *co_cellvars*. Otherwise it is ``co_freevars[i - + len(co_cellvars)]``. + + +.. opcode:: LOAD_DEREF (i) + + Loads the cell contained in slot *i* of the cell and free variable storage. + Pushes a reference to the object the cell contains on the stack. + + +.. opcode:: STORE_DEREF (i) + + Stores TOS into the cell contained in slot *i* of the cell and free variable + storage. + + +.. opcode:: SET_LINENO (lineno) + + This opcode is obsolete. + + +.. opcode:: RAISE_VARARGS (argc) + + Raises an exception. *argc* indicates the number of parameters to the raise + statement, ranging from 0 to 3. The handler will find the traceback as TOS2, + the parameter as TOS1, and the exception as TOS. + + +.. opcode:: CALL_FUNCTION (argc) + + Calls a function. The low byte of *argc* indicates the number of positional + parameters, the high byte the number of keyword parameters. On the stack, the + opcode finds the keyword parameters first. For each keyword argument, the value + is on top of the key. Below the keyword parameters, the positional parameters + are on the stack, with the right-most parameter on top. Below the parameters, + the function object to call is on the stack. + + +.. opcode:: MAKE_FUNCTION (argc) + + Pushes a new function object on the stack. TOS is the code associated with the + function. The function object is defined to have *argc* default parameters, + which are found below TOS. + + +.. opcode:: MAKE_CLOSURE (argc) + + Creates a new function object, sets its *__closure__* slot, and pushes it on the + stack. TOS is the code associated with the function. If the code object has N + free variables, the next N items on the stack are the cells for these variables. + The function also has *argc* default parameters, which are found before the + cells. + + +..
opcode:: BUILD_SLICE (argc) + + .. index:: builtin: slice + + Pushes a slice object on the stack. *argc* must be 2 or 3. If it is 2, + ``slice(TOS1, TOS)`` is pushed; if it is 3, ``slice(TOS2, TOS1, TOS)`` is + pushed. See the ``slice()`` built-in function for more information. + + +.. opcode:: EXTENDED_ARG (ext) + + Prefixes any opcode which has an argument too big to fit into the default two + bytes. *ext* holds two additional bytes which, taken together with the + subsequent opcode's argument, comprise a four-byte argument, *ext* being the two + most-significant bytes. + + +.. opcode:: CALL_FUNCTION_VAR (argc) + + Calls a function. *argc* is interpreted as in ``CALL_FUNCTION``. The top element + on the stack contains the variable argument list, followed by keyword and + positional arguments. + + +.. opcode:: CALL_FUNCTION_KW (argc) + + Calls a function. *argc* is interpreted as in ``CALL_FUNCTION``. The top element + on the stack contains the keyword arguments dictionary, followed by explicit + keyword and positional arguments. + + +.. opcode:: CALL_FUNCTION_VAR_KW (argc) + + Calls a function. *argc* is interpreted as in ``CALL_FUNCTION``. The top + element on the stack contains the keyword arguments dictionary, followed by the + variable-arguments tuple, followed by explicit keyword and positional arguments. + + +.. opcode:: HAVE_ARGUMENT () + + This is not really an opcode. It identifies the dividing line between opcodes + which don't take arguments ``< HAVE_ARGUMENT`` and those which do ``>= + HAVE_ARGUMENT``. + diff --git a/Doc/library/distutils.rst b/Doc/library/distutils.rst new file mode 100644 index 0000000..534faab --- /dev/null +++ b/Doc/library/distutils.rst @@ -0,0 +1,30 @@ + +:mod:`distutils` --- Building and installing Python modules +=========================================================== + +.. module:: distutils + :synopsis: Support for building and installing Python modules into an existing Python + installation. +.. sectionauthor:: Fred L. 
Drake, Jr. <fdrake@acm.org> + + +The :mod:`distutils` package provides support for building and installing +additional modules into a Python installation. The new modules may be either +100%-pure Python, or may be extension modules written in C, or may be +collections of Python packages which include modules coded in both Python and C. + +This package is discussed in two separate chapters: + + +.. seealso:: + + :ref:`distutils-index` + The manual for developers and packagers of Python modules. This describes how + to prepare :mod:`distutils`\ -based packages so that they may be easily + installed into an existing Python installation. + + :ref:`install-index` + An "administrators" manual which includes information on installing modules into + an existing Python installation. You do not need to be a Python programmer to + read this manual. + diff --git a/Doc/library/dl.rst b/Doc/library/dl.rst new file mode 100644 index 0000000..ff42619 --- /dev/null +++ b/Doc/library/dl.rst @@ -0,0 +1,111 @@ + +:mod:`dl` --- Call C functions in shared objects +================================================ + +.. module:: dl + :platform: Unix + :synopsis: Call C functions in shared objects. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +.. % ?????????? Anyone???????????? + +The :mod:`dl` module defines an interface to the :cfunc:`dlopen` function, which +is the most common interface on Unix platforms for handling dynamically linked +libraries. It allows the program to call arbitrary functions in such a library. + +.. warning:: + + The :mod:`dl` module bypasses the Python type system and error handling. If + used incorrectly it may cause segmentation faults, crashes or other incorrect + behaviour. + +.. note:: + + This module will not work unless ``sizeof(int) == sizeof(long) == sizeof(char + *)`` If this is not the case, :exc:`SystemError` will be raised on import. + +The :mod:`dl` module defines the following function: + + +.. 
function:: open(name[, mode=RTLD_LAZY]) + + Open a shared object file, and return a handle. Mode signifies late binding + (:const:`RTLD_LAZY`) or immediate binding (:const:`RTLD_NOW`). Default is + :const:`RTLD_LAZY`. Note that some systems do not support :const:`RTLD_NOW`. + + Return value is a :class:`dlobject`. + +The :mod:`dl` module defines the following constants: + + +.. data:: RTLD_LAZY + + Useful as an argument to :func:`open`. + + +.. data:: RTLD_NOW + + Useful as an argument to :func:`open`. Note that on systems which do not + support immediate binding, this constant will not appear in the module. For + maximum portability, use :func:`hasattr` to determine if the system supports + immediate binding. + +The :mod:`dl` module defines the following exception: + + +.. exception:: error + + Exception raised when an error has occurred inside the dynamic loading and + linking routines. + +Example:: + + >>> import dl, time + >>> a=dl.open('/lib/libc.so.6') + >>> a.call('time'), time.time() + (929723914, 929723914.498) + +This example was tried on a Debian GNU/Linux system, and is a good example of +the fact that using this module is usually a bad alternative. + + +.. _dl-objects: + +Dl Objects +---------- + +Dl objects, as returned by :func:`open` above, have the following methods: + + +.. method:: dl.close() + + Free all resources, except the memory. + + +.. method:: dl.sym(name) + + Return the pointer for the function named *name*, as a number, if it exists in + the referenced shared object, otherwise ``None``. This is useful in code like:: + + >>> if a.sym('time'): + ... a.call('time') + ... else: + ... time.time() + + (Note that this function will return a non-zero number, as zero is the *NULL* + pointer) + + +.. method:: dl.call(name[, arg1[, arg2...]]) + + Call the function named *name* in the referenced shared object. 
The arguments + must be either Python integers, which will be passed as is, Python strings, to + which a pointer will be passed, or ``None``, which will be passed as *NULL*. + Note that strings should only be passed to functions as :ctype:`const char\*`, + as Python will not like its string mutated. + + There must be at most 10 arguments, and arguments not given will be treated as + ``None``. The function's return value must be a C :ctype:`long`, which is a + Python integer. + diff --git a/Doc/library/doctest.rst b/Doc/library/doctest.rst new file mode 100644 index 0000000..23f96e4 --- /dev/null +++ b/Doc/library/doctest.rst @@ -0,0 +1,1869 @@ +:mod:`doctest` --- Test interactive Python examples +=================================================== + +.. module:: doctest + :synopsis: Test pieces of code within docstrings. +.. moduleauthor:: Tim Peters <tim@python.org> +.. sectionauthor:: Tim Peters <tim@python.org> +.. sectionauthor:: Moshe Zadka <moshez@debian.org> +.. sectionauthor:: Edward Loper <edloper@users.sourceforge.net> + + +The :mod:`doctest` module searches for pieces of text that look like interactive +Python sessions, and then executes those sessions to verify that they work +exactly as shown. There are several common ways to use doctest: + +* To check that a module's docstrings are up-to-date by verifying that all + interactive examples still work as documented. + +* To perform regression testing by verifying that interactive examples from a + test file or a test object work as expected. + +* To write tutorial documentation for a package, liberally illustrated with + input-output examples. Depending on whether the examples or the expository text + are emphasized, this has the flavor of "literate testing" or "executable + documentation". + +Here's a complete but small example module:: + + """ + This is the "example" module. + + The example module supplies one function, factorial(). 
For example, + + >>> factorial(5) + 120 + """ + + def factorial(n): + """Return the factorial of n, an exact integer >= 0. + + If the result is small enough to fit in an int, return an int. + Else return a long. + + >>> [factorial(n) for n in range(6)] + [1, 1, 2, 6, 24, 120] + >>> [factorial(long(n)) for n in range(6)] + [1, 1, 2, 6, 24, 120] + >>> factorial(30) + 265252859812191058636308480000000L + >>> factorial(30L) + 265252859812191058636308480000000L + >>> factorial(-1) + Traceback (most recent call last): + ... + ValueError: n must be >= 0 + + Factorials of floats are OK, but the float must be an exact integer: + >>> factorial(30.1) + Traceback (most recent call last): + ... + ValueError: n must be exact integer + >>> factorial(30.0) + 265252859812191058636308480000000L + + It must also not be ridiculously large: + >>> factorial(1e100) + Traceback (most recent call last): + ... + OverflowError: n too large + """ + + +.. % allow LaTeX to break here. + +:: + + import math + if not n >= 0: + raise ValueError("n must be >= 0") + if math.floor(n) != n: + raise ValueError("n must be exact integer") + if n+1 == n: # catch a value like 1e300 + raise OverflowError("n too large") + result = 1 + factor = 2 + while factor <= n: + result *= factor + factor += 1 + return result + + def _test(): + import doctest + doctest.testmod() + + if __name__ == "__main__": + _test() + +If you run :file:`example.py` directly from the command line, :mod:`doctest` +works its magic:: + + $ python example.py + $ + +There's no output! That's normal, and it means all the examples worked. 
Pass +:option:`-v` to the script, and :mod:`doctest` prints a detailed log of what +it's trying, and prints a summary at the end:: + + $ python example.py -v + Trying: + factorial(5) + Expecting: + 120 + ok + Trying: + [factorial(n) for n in range(6)] + Expecting: + [1, 1, 2, 6, 24, 120] + ok + Trying: + [factorial(long(n)) for n in range(6)] + Expecting: + [1, 1, 2, 6, 24, 120] + ok + +And so on, eventually ending with:: + + Trying: + factorial(1e100) + Expecting: + Traceback (most recent call last): + ... + OverflowError: n too large + ok + 1 items had no tests: + __main__._test + 2 items passed all tests: + 1 tests in __main__ + 8 tests in __main__.factorial + 9 tests in 3 items. + 9 passed and 0 failed. + Test passed. + $ + +That's all you need to know to start making productive use of :mod:`doctest`! +Jump in. The following sections provide full details. Note that there are many +examples of doctests in the standard Python test suite and libraries. +Especially useful examples can be found in the standard test file +:file:`Lib/test/test_doctest.py`. + + +.. _doctest-simple-testmod: + +Simple Usage: Checking Examples in Docstrings +--------------------------------------------- + +The simplest way to start using doctest (but not necessarily the way you'll +continue to do it) is to end each module :mod:`M` with:: + + def _test(): + import doctest + doctest.testmod() + + if __name__ == "__main__": + _test() + +:mod:`doctest` then examines docstrings in module :mod:`M`. + +Running the module as a script causes the examples in the docstrings to get +executed and verified:: + + python M.py + +This won't display anything unless an example fails, in which case the failing +example(s) and the cause(s) of the failure(s) are printed to stdout, and the +final line of output is ``***Test Failed*** N failures.``, where *N* is the +number of examples that failed. 
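The failure count in that final line can also be read programmatically. The following is a minimal sketch, assuming a modern Python 3 interpreter; the ``SAMPLE`` docstring and the names ``sample`` and ``<sample>`` are hypothetical and exist only for illustration. It runs one correct and one deliberately wrong example through :class:`DocTestParser` and :class:`DocTestRunner` and collects the report instead of printing it:

```python
import doctest

# Hypothetical docstring: the first example is right, the second is
# deliberately wrong so that exactly one failure is reported.
SAMPLE = """
>>> 2 + 2
4
>>> 2 * 3
5
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(SAMPLE, {}, "sample", "<sample>", 0)

# verbose=False suppresses the per-example log; failure reports are still
# passed to the ``out`` callable, which here appends them to a list.
report = []
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test, out=report.append)

print(results.failed, results.attempted)
```

Here ``results.failed`` corresponds to the *N* in the failure line, and ``report`` holds the text that would otherwise have gone to stdout.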
+ +Run it with the :option:`-v` switch instead:: + + python M.py -v + +and a detailed report of all examples tried is printed to standard output, along +with assorted summaries at the end. + +You can force verbose mode by passing ``verbose=True`` to :func:`testmod`, or +prohibit it by passing ``verbose=False``. In either of those cases, +``sys.argv`` is not examined by :func:`testmod` (so passing :option:`-v` or not +has no effect). + +Since Python 2.6, there is also a command line shortcut for running +:func:`testmod`. You can instruct the Python interpreter to run the doctest +module directly from the standard library and pass the module name(s) on the +command line:: + + python -m doctest -v example.py + +This will import :file:`example.py` as a standalone module and run +:func:`testmod` on it. Note that this may not work correctly if the file is +part of a package and imports other submodules from that package. + +For more information on :func:`testmod`, see section :ref:`doctest-basic-api`. + + +.. _doctest-simple-testfile: + +Simple Usage: Checking Examples in a Text File +---------------------------------------------- + +Another simple application of doctest is testing interactive examples in a text +file. This can be done with the :func:`testfile` function:: + + import doctest + doctest.testfile("example.txt") + +That short script executes and verifies any interactive Python examples +contained in the file :file:`example.txt`. The file content is treated as if it +were a single giant docstring; the file doesn't need to contain a Python +program! For example, perhaps :file:`example.txt` contains this:: + + The ``example`` module + ====================== + + Using ``factorial`` + ------------------- + + This is an example text file in reStructuredText format. 
First import + ``factorial`` from the ``example`` module: + + >>> from example import factorial + + Now use it: + + >>> factorial(6) + 120 + +Running ``doctest.testfile("example.txt")`` then finds the error in this +documentation:: + + File "./example.txt", line 14, in example.txt + Failed example: + factorial(6) + Expected: + 120 + Got: + 720 + +As with :func:`testmod`, :func:`testfile` won't display anything unless an +example fails. If an example does fail, then the failing example(s) and the +cause(s) of the failure(s) are printed to stdout, using the same format as +:func:`testmod`. + +By default, :func:`testfile` looks for files in the calling module's directory. +See section :ref:`doctest-basic-api` for a description of the optional arguments +that can be used to tell it to look for files in other locations. + +Like :func:`testmod`, :func:`testfile`'s verbosity can be set with the +:option:`-v` command-line switch or with the optional keyword argument +*verbose*. + +Since Python 2.6, there is also a command line shortcut for running +:func:`testfile`. You can instruct the Python interpreter to run the doctest +module directly from the standard library and pass the file name(s) on the +command line:: + + python -m doctest -v example.txt + +Because the file name does not end with :file:`.py`, :mod:`doctest` infers that +it must be run with :func:`testfile`, not :func:`testmod`. + +For more information on :func:`testfile`, see section :ref:`doctest-basic-api`. + + +.. _doctest-how-it-works: + +How It Works +------------ + +This section examines in detail how doctest works: which docstrings it looks at, +how it finds interactive examples, what execution context it uses, how it +handles exceptions, and how option flags can be used to control its behavior. +This is the information that you need to know to write doctest examples; for +information about actually running doctest on these examples, see the following +sections. + + +.. 
_doctest-which-docstrings: + +Which Docstrings Are Examined? +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The module docstring, and all function, class and method docstrings are +searched. Objects imported into the module are not searched. + +In addition, if ``M.__test__`` exists and "is true", it must be a dict, and each +entry maps a (string) name to a function object, class object, or string. +Function and class object docstrings found from ``M.__test__`` are searched, and +strings are treated as if they were docstrings. In output, a key ``K`` in +``M.__test__`` appears with name :: + + <name of M>.__test__.K + +Any classes found are recursively searched similarly, to test docstrings in +their contained methods and nested classes. + +.. versionchanged:: 2.4 + A "private name" concept is deprecated and no longer documented. + + +.. _doctest-finding-examples: + +How are Docstring Examples Recognized? +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In most cases a copy-and-paste of an interactive console session works fine, but +doctest isn't trying to do an exact emulation of any specific Python shell. All +hard tab characters are expanded to spaces, using 8-column tab stops. If you +don't believe tabs should mean that, too bad: don't use hard tabs, or write +your own :class:`DocTestParser` class. + +.. versionchanged:: 2.4 + Expanding tabs to spaces is new; previous versions tried to preserve hard tabs, + with confusing results. + +:: + + >>> # comments are ignored + >>> x = 12 + >>> x + 12 + >>> if x == 13: + ... print "yes" + ... else: + ... print "no" + ... print "NO" + ... print "NO!!!" + ... + no + NO + NO!!! + >>> + +Any expected output must immediately follow the final ``'>>> '`` or ``'... '`` +line containing the code, and the expected output (if any) extends to the next +``'>>> '`` or all-whitespace line. + +The fine print: + +* Expected output cannot contain an all-whitespace line, since such a line is + taken to signal the end of expected output. 
If expected output does contain a + blank line, put ``<BLANKLINE>`` in your doctest example each place a blank line + is expected. + + .. versionchanged:: 2.4 + ``<BLANKLINE>`` was added; there was no way to use expected output containing + empty lines in previous versions. + +* Output to stdout is captured, but not output to stderr (exception tracebacks + are captured via a different means). + +* If you continue a line via backslashing in an interactive session, or for any + other reason use a backslash, you should use a raw docstring, which will + preserve your backslashes exactly as you type them:: + + >>> def f(x): + ... r'''Backslashes in a raw docstring: m\n''' + >>> print f.__doc__ + Backslashes in a raw docstring: m\n + + Otherwise, the backslash will be interpreted as part of the string. For example, + the ``\n`` above would be interpreted as a newline character. Alternatively, you + can double each backslash in the doctest version (and not use a raw string):: + + >>> def f(x): + ... '''Backslashes in a raw docstring: m\\n''' + >>> print f.__doc__ + Backslashes in a raw docstring: m\n + +* The starting column doesn't matter:: + + >>> assert "Easy!" + >>> import math + >>> math.floor(1.9) + 1.0 + + and as many leading whitespace characters are stripped from the expected output + as appeared in the initial ``'>>> '`` line that started the example. + + +.. _doctest-execution-context: + +What's the Execution Context? +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +By default, each time :mod:`doctest` finds a docstring to test, it uses a +*shallow copy* of :mod:`M`'s globals, so that running tests doesn't change the +module's real globals, and so that one test in :mod:`M` can't leave behind +crumbs that accidentally allow another test to work. This means examples can +freely use any names defined at top-level in :mod:`M`, and names defined earlier +in the docstring being run. Examples cannot see names defined in other +docstrings.
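As a rough illustration of the copy semantics described above, the sketch below
runs two independent example strings against one shared dict. It is written for
current Python 3 rather than the Python 2 syntax used elsewhere in this
document, and the names ``FIRST``, ``SECOND`` and ``module_globals`` are
hypothetical stand-ins (for two docstrings and for :mod:`M`'s globals); they are
not part of :mod:`doctest`.

```python
import doctest

# Hypothetical stand-ins for two separate docstrings in a module M.
FIRST = """
>>> shared = 41
>>> shared + 1
42
"""

SECOND = """
>>> answer
42
>>> 'shared' in globals()
False
"""

# Hypothetical stand-in for M's real globals.
module_globals = {"answer": 42}

parser = doctest.DocTestParser()
runner = doctest.DocTestRunner(verbose=False)

for name, text in (("first", FIRST), ("second", SECOND)):
    # The DocTest object keeps its own shallow copy of the dict it is given,
    # so names bound by one test are not visible to the next one.
    test = parser.get_doctest(text, module_globals, name, "<sketch>", 0)
    runner.run(test)

results = runner.summarize(verbose=False)

# The shared dict is untouched: 'shared' never leaked into it.
print(results.failed, results.attempted, sorted(module_globals))
```

Both tests can read ``answer`` from the shared dict, but the second one cannot
see the ``shared`` name that the first one bound, and ``module_globals`` itself
is left unchanged.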
+ +You can force use of your own dict as the execution context by passing +``globs=your_dict`` to :func:`testmod` or :func:`testfile` instead. + + +.. _doctest-exceptions: + +What About Exceptions? +^^^^^^^^^^^^^^^^^^^^^^ + +No problem, provided that the traceback is the only output produced by the +example: just paste in the traceback. [#]_ Since tracebacks contain details +that are likely to change rapidly (for example, exact file paths and line +numbers), this is one case where doctest works hard to be flexible in what it +accepts. + +Simple example:: + + >>> [1, 2, 3].remove(42) + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ValueError: list.remove(x): x not in list + +That doctest succeeds if :exc:`ValueError` is raised, with the ``list.remove(x): +x not in list`` detail as shown. + +The expected output for an exception must start with a traceback header, which +may be either of the following two lines, indented the same as the first line of +the example:: + + Traceback (most recent call last): + Traceback (innermost last): + +The traceback header is followed by an optional traceback stack, whose contents +are ignored by doctest. The traceback stack is typically omitted, or copied +verbatim from an interactive session. + +The traceback stack is followed by the most interesting part: the line(s) +containing the exception type and detail. This is usually the last line of a +traceback, but can extend across multiple lines if the exception has a +multi-line detail:: + + >>> raise ValueError('multi\n line\ndetail') + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ValueError: multi + line + detail + +The last three lines (starting with :exc:`ValueError`) are compared against the +exception's type and detail, and the rest are ignored. + +Best practice is to omit the traceback stack, unless it adds significant +documentation value to the example. 
So the last example is probably better as:: + + >>> raise ValueError('multi\n line\ndetail') + Traceback (most recent call last): + ... + ValueError: multi + line + detail + +Note that tracebacks are treated very specially. In particular, in the +rewritten example, the use of ``...`` is independent of doctest's +:const:`ELLIPSIS` option. The ellipsis in that example could be left out, or +could just as well be three (or three hundred) commas or digits, or an indented +transcript of a Monty Python skit. + +Some details you should read once, but won't need to remember: + +* Doctest can't guess whether your expected output came from an exception + traceback or from ordinary printing. So, e.g., an example that expects + ``ValueError: 42 is prime`` will pass whether :exc:`ValueError` is actually + raised or the example merely prints that traceback text. In practice, + ordinary output rarely begins with a traceback header line, so this doesn't + create real problems. + +* Each line of the traceback stack (if present) must be indented further than + the first line of the example, *or* start with a non-alphanumeric character. + The first line following the traceback header indented the same and starting + with an alphanumeric is taken to be the start of the exception detail. Of + course this does the right thing for genuine tracebacks. + +* When the :const:`IGNORE_EXCEPTION_DETAIL` doctest option is specified, + everything following the leftmost colon is ignored. + +* The interactive shell omits the traceback header line for some + :exc:`SyntaxError`\ s. But doctest uses the traceback header line to + distinguish exceptions from non-exceptions. So in the rare case where you need + to test a :exc:`SyntaxError` that omits the traceback header, you will need to + manually add the traceback header line to your test example.
+ +* For some :exc:`SyntaxError`\ s, Python displays the character position of the + syntax error, using a ``^`` marker:: + + >>> 1 1 + File "<stdin>", line 1 + 1 1 + ^ + SyntaxError: invalid syntax + + Since the lines showing the position of the error come before the exception type + and detail, they are not checked by doctest. For example, the following test + would pass, even though it puts the ``^`` marker in the wrong location:: + + >>> 1 1 + Traceback (most recent call last): + File "<stdin>", line 1 + 1 1 + ^ + SyntaxError: invalid syntax + +.. versionchanged:: 2.4 + The ability to handle a multi-line exception detail, and the + :const:`IGNORE_EXCEPTION_DETAIL` doctest option, were added. + + +.. _doctest-options: + +Option Flags and Directives +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A number of option flags control various aspects of doctest's behavior. +Symbolic names for the flags are supplied as module constants, which can be +or'ed together and passed to various functions. The names can also be used in +doctest directives (see below). + +The first group of options defines test semantics, controlling aspects of how +doctest decides whether actual output matches an example's expected output: + + +.. data:: DONT_ACCEPT_TRUE_FOR_1 + + By default, if an expected output block contains just ``1``, an actual output + block containing just ``1`` or just ``True`` is considered to be a match, and + similarly for ``0`` versus ``False``. When :const:`DONT_ACCEPT_TRUE_FOR_1` is + specified, neither substitution is allowed. The default behavior accommodates + the fact that Python changed the return type of many functions from integer to + boolean; doctests expecting "little integer" output still work in these cases. + This option will probably go away, but not for several years. + + +.. data:: DONT_ACCEPT_BLANKLINE + + By default, if an expected output block contains a line containing only the + string ``<BLANKLINE>``, then that line will match a blank line in the actual + output.
Because a genuinely blank line delimits the expected output, this is + the only way to communicate that a blank line is expected. When + :const:`DONT_ACCEPT_BLANKLINE` is specified, this substitution is not allowed. + + +.. data:: NORMALIZE_WHITESPACE + + When specified, all sequences of whitespace (blanks and newlines) are treated as + equal. Any sequence of whitespace within the expected output will match any + sequence of whitespace within the actual output. By default, whitespace must + match exactly. :const:`NORMALIZE_WHITESPACE` is especially useful when a line of + expected output is very long, and you want to wrap it across multiple lines in + your source. + + +.. data:: ELLIPSIS + + When specified, an ellipsis marker (``...``) in the expected output can match + any substring in the actual output. This includes substrings that span line + boundaries, and empty substrings, so it's best to keep usage of this simple. + Complicated uses can lead to the same kinds of "oops, it matched too much!" + surprises that ``.*`` is prone to in regular expressions. + + +.. data:: IGNORE_EXCEPTION_DETAIL + + When specified, an example that expects an exception passes if an exception of + the expected type is raised, even if the exception detail does not match. For + example, an example expecting ``ValueError: 42`` will pass if the actual + exception raised is ``ValueError: 3*14``, but will fail, e.g., if + :exc:`TypeError` is raised. + + Note that a similar effect can be obtained using :const:`ELLIPSIS`, and + :const:`IGNORE_EXCEPTION_DETAIL` may go away when Python releases prior to 2.4 + become uninteresting. Until then, :const:`IGNORE_EXCEPTION_DETAIL` is the only + clear way to write a doctest that doesn't care about the exception detail yet + continues to pass under Python releases prior to 2.4 (doctest directives appear + to be comments to them). 
For example, :: + + >>> (1, 2)[3] = 'moo' #doctest: +IGNORE_EXCEPTION_DETAIL + Traceback (most recent call last): + File "<stdin>", line 1, in ? + TypeError: object doesn't support item assignment + + passes under Python 2.4 and Python 2.3. The detail changed in 2.4, to say "does + not" instead of "doesn't". + + +.. data:: SKIP + + When specified, do not run the example at all. This can be useful in contexts + where doctest examples serve as both documentation and test cases, and an + example should be included for documentation purposes, but should not be + checked. E.g., the example's output might be random; or the example might + depend on resources which would be unavailable to the test driver. + + The SKIP flag can also be used for temporarily "commenting out" examples. + + +.. data:: COMPARISON_FLAGS + + A bitmask or'ing together all the comparison flags above. + +The second group of options controls how test failures are reported: + + +.. data:: REPORT_UDIFF + + When specified, failures that involve multi-line expected and actual outputs are + displayed using a unified diff. + + +.. data:: REPORT_CDIFF + + When specified, failures that involve multi-line expected and actual outputs + will be displayed using a context diff. + + +.. data:: REPORT_NDIFF + + When specified, differences are computed by ``difflib.Differ``, using the same + algorithm as the popular :file:`ndiff.py` utility. This is the only method that + marks differences within lines as well as across lines. For example, if a line + of expected output contains digit ``1`` where actual output contains letter + ``l``, a line is inserted with a caret marking the mismatching column positions. + + +.. data:: REPORT_ONLY_FIRST_FAILURE + + When specified, display the first failing example in each doctest, but suppress + output for all remaining examples. 
This will prevent doctest from reporting + correct examples that break because of earlier failures; but it might also hide + incorrect examples that fail independently of the first failure. When + :const:`REPORT_ONLY_FIRST_FAILURE` is specified, the remaining examples are + still run, and still count towards the total number of failures reported; only + the output is suppressed. + + +.. data:: REPORTING_FLAGS + + A bitmask or'ing together all the reporting flags above. + +"Doctest directives" may be used to modify the option flags for individual +examples. Doctest directives are expressed as a special Python comment +following an example's source code: + +.. productionlist:: doctest + directive: "#" "doctest:" `directive_options` + directive_options: `directive_option` ("," `directive_option`)\* + directive_option: `on_or_off` `directive_option_name` + on_or_off: "+" \| "-" + directive_option_name: "DONT_ACCEPT_BLANKLINE" \| "NORMALIZE_WHITESPACE" \| ... + +Whitespace is not allowed between the ``+`` or ``-`` and the directive option +name. The directive option name can be any of the option flag names explained +above. + +An example's doctest directives modify doctest's behavior for that single +example. Use ``+`` to enable the named behavior, or ``-`` to disable it. + +For example, this test passes:: + + >>> print range(20) #doctest: +NORMALIZE_WHITESPACE + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, + 10, 11, 12, 13, 14, 15, 16, 17, 18, 19] + +Without the directive it would fail, both because the actual output doesn't have +two blanks before the single-digit list elements, and because the actual output +is on a single line. 
This test also passes, and also requires a directive to do +so:: + + >>> print range(20) # doctest:+ELLIPSIS + [0, 1, ..., 18, 19] + +Multiple directives can be used on a single physical line, separated by commas:: + + >>> print range(20) # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE + [0, 1, ..., 18, 19] + +If multiple directive comments are used for a single example, then they are +combined:: + + >>> print range(20) # doctest: +ELLIPSIS + ... # doctest: +NORMALIZE_WHITESPACE + [0, 1, ..., 18, 19] + +As the previous example shows, you can add ``...`` lines to your example +containing only directives. This can be useful when an example is too long for +a directive to comfortably fit on the same line:: + + >>> print range(5) + range(10,20) + range(30,40) + range(50,60) + ... # doctest: +ELLIPSIS + [0, ..., 4, 10, ..., 19, 30, ..., 39, 50, ..., 59] + +Note that since all options are disabled by default, and directives apply only +to the example they appear in, enabling options (via ``+`` in a directive) is +usually the only meaningful choice. However, option flags can also be passed to +functions that run doctests, establishing different defaults. In such cases, +disabling an option via ``-`` in a directive can be useful. + +.. versionchanged:: 2.4 + Constants :const:`DONT_ACCEPT_BLANKLINE`, :const:`NORMALIZE_WHITESPACE`, + :const:`ELLIPSIS`, :const:`IGNORE_EXCEPTION_DETAIL`, :const:`REPORT_UDIFF`, + :const:`REPORT_CDIFF`, :const:`REPORT_NDIFF`, + :const:`REPORT_ONLY_FIRST_FAILURE`, :const:`COMPARISON_FLAGS` and + :const:`REPORTING_FLAGS` were added; by default ``<BLANKLINE>`` in expected + output matches an empty line in actual output; and doctest directives were + added. + +.. versionchanged:: 2.5 + Constant :const:`SKIP` was added. + +There's also a way to register new option flag names, although this isn't useful +unless you intend to extend :mod:`doctest` internals via subclassing: + + +.. 
function:: register_optionflag(name) + + Create a new option flag with a given name, and return the new flag's integer + value. :func:`register_optionflag` can be used when subclassing + :class:`OutputChecker` or :class:`DocTestRunner` to create new options that are + supported by your subclasses. :func:`register_optionflag` should always be + called using the following idiom:: + + MY_FLAG = register_optionflag('MY_FLAG') + + .. versionadded:: 2.4 + + +.. _doctest-warnings: + +Warnings +^^^^^^^^ + +:mod:`doctest` is serious about requiring exact matches in expected output. If +even a single character doesn't match, the test fails. This will probably +surprise you a few times, as you learn exactly what Python does and doesn't +guarantee about output. For example, when printing a dict, Python doesn't +guarantee that the key-value pairs will be printed in any particular order, so a +test like + +.. % Hey! What happened to Monty Python examples? +.. % Tim: ask Guido -- it's his example! + +:: + + >>> foo() + {"Hermione": "hippogryph", "Harry": "broomstick"} + +is vulnerable! One workaround is to do :: + + >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"} + True + +instead. Another is to do :: + + >>> d = foo().items() + >>> d.sort() + >>> d + [('Harry', 'broomstick'), ('Hermione', 'hippogryph')] + +There are others, but you get the idea. + +Another bad idea is to print things that embed an object address, like :: + + >>> id(1.0) # certain to fail some of the time + 7948648 + >>> class C: pass + >>> C() # the default repr() for instances embeds an address + <__main__.C instance at 0x00AC18F0> + +The :const:`ELLIPSIS` directive gives a nice approach for the last example:: + + >>> C() #doctest: +ELLIPSIS + <__main__.C instance at 0x...> + +Floating-point numbers are also subject to small output variations across +platforms, because Python defers to the platform C library for float formatting, +and C libraries vary widely in quality here. 
:: + + >>> 1./7 # risky + 0.14285714285714285 + >>> print 1./7 # safer + 0.142857142857 + >>> print round(1./7, 6) # much safer + 0.142857 + +Numbers of the form ``I/2.**J`` are safe across all platforms, and I often +contrive doctest examples to produce numbers of that form:: + + >>> 3./4 # utterly safe + 0.75 + +Simple fractions are also easier for people to understand, and that makes for +better documentation. + + +.. _doctest-basic-api: + +Basic API +--------- + +The functions :func:`testmod` and :func:`testfile` provide a simple interface to +doctest that should be sufficient for most basic uses. For a less formal +introduction to these two functions, see sections :ref:`doctest-simple-testmod` +and :ref:`doctest-simple-testfile`. + + +.. function:: testfile(filename[, module_relative][, name][, package][, globs][, verbose][, report][, optionflags][, extraglobs][, raise_on_error][, parser][, encoding]) + + All arguments except *filename* are optional, and should be specified in keyword + form. + + Test examples in the file named *filename*. Return ``(failure_count, + test_count)``. + + Optional argument *module_relative* specifies how the filename should be + interpreted: + + * If *module_relative* is ``True`` (the default), then *filename* specifies an + OS-independent module-relative path. By default, this path is relative to the + calling module's directory; but if the *package* argument is specified, then it + is relative to that package. To ensure OS-independence, *filename* should use + ``/`` characters to separate path segments, and may not be an absolute path + (i.e., it may not begin with ``/``). + + * If *module_relative* is ``False``, then *filename* specifies an OS-specific + path. The path may be absolute or relative; relative paths are resolved with + respect to the current working directory. + + Optional argument *name* gives the name of the test; by default, or if ``None``, + ``os.path.basename(filename)`` is used. 
+ + Optional argument *package* is a Python package or the name of a Python package + whose directory should be used as the base directory for a module-relative + filename. If no package is specified, then the calling module's directory is + used as the base directory for module-relative filenames. It is an error to + specify *package* if *module_relative* is ``False``. + + Optional argument *globs* gives a dict to be used as the globals when executing + examples. A new shallow copy of this dict is created for the doctest, so its + examples start with a clean slate. By default, or if ``None``, a new empty dict + is used. + + Optional argument *extraglobs* gives a dict merged into the globals used to + execute examples. This works like :meth:`dict.update`: if *globs* and + *extraglobs* have a common key, the associated value in *extraglobs* appears in + the combined dict. By default, or if ``None``, no extra globals are used. This + is an advanced feature that allows parameterization of doctests. For example, a + doctest can be written for a base class, using a generic name for the class, + then reused to test any number of subclasses by passing an *extraglobs* dict + mapping the generic name to the subclass to be tested. + + Optional argument *verbose* prints lots of stuff if true, and prints only + failures if false; by default, or if ``None``, it's true if and only if ``'-v'`` + is in ``sys.argv``. + + Optional argument *report* prints a summary at the end when true, else prints + nothing at the end. In verbose mode, the summary is detailed, else the summary + is very brief (in fact, empty if all tests passed). + + Optional argument *optionflags* or's together option flags. See section + :ref:`doctest-options`. + + Optional argument *raise_on_error* defaults to false. If true, an exception is + raised upon the first failure or unexpected exception in an example. This + allows failures to be post-mortem debugged. Default behavior is to continue + running examples. 
+ + Optional argument *parser* specifies a :class:`DocTestParser` (or subclass) that + should be used to extract tests from the files. It defaults to a normal parser + (i.e., ``DocTestParser()``). + + Optional argument *encoding* specifies an encoding that should be used to + convert the file to unicode. + + .. versionadded:: 2.4 + + .. versionchanged:: 2.5 + The parameter *encoding* was added. + + +.. function:: testmod([m][, name][, globs][, verbose][, report][, optionflags][, extraglobs][, raise_on_error][, exclude_empty]) + + All arguments are optional, and all except for *m* should be specified in + keyword form. + + Test examples in docstrings in functions and classes reachable from module *m* + (or module :mod:`__main__` if *m* is not supplied or is ``None``), starting with + ``m.__doc__``. + + Also test examples reachable from dict ``m.__test__``, if it exists and is not + ``None``. ``m.__test__`` maps names (strings) to functions, classes and + strings; function and class docstrings are searched for examples; strings are + searched directly, as if they were docstrings. + + Only docstrings attached to objects belonging to module *m* are searched. + + Return ``(failure_count, test_count)``. + + Optional argument *name* gives the name of the module; by default, or if + ``None``, ``m.__name__`` is used. + + Optional argument *exclude_empty* defaults to false. If true, objects for which + no doctests are found are excluded from consideration. The default is a backward + compatibility hack, so that code still using :meth:`doctest.master.summarize` in + conjunction with :func:`testmod` continues to get output for objects with no + tests. The *exclude_empty* argument to the newer :class:`DocTestFinder` + constructor defaults to true. + + Optional arguments *extraglobs*, *verbose*, *report*, *optionflags*, + *raise_on_error*, and *globs* are the same as for function :func:`testfile` + above, except that *globs* defaults to ``m.__dict__``. + + .. 
versionchanged:: 2.3 + The parameter *optionflags* was added. + + .. versionchanged:: 2.4 + The parameters *extraglobs*, *raise_on_error* and *exclude_empty* were added. + + .. versionchanged:: 2.5 + The optional argument *isprivate*, deprecated in 2.4, was removed. + +There's also a function to run the doctests associated with a single object. +This function is provided for backward compatibility. There are no plans to +deprecate it, but it's rarely useful: + + +.. function:: run_docstring_examples(f, globs[, verbose][, name][, compileflags][, optionflags]) + + Test examples associated with object *f*; for example, *f* may be a module, + function, or class object. + + A shallow copy of dictionary argument *globs* is used for the execution context. + + Optional argument *name* is used in failure messages, and defaults to + ``"NoName"``. + + If optional argument *verbose* is true, output is generated even if there are no + failures. By default, output is generated only in case of an example failure. + + Optional argument *compileflags* gives the set of flags that should be used by + the Python compiler when running the examples. By default, or if ``None``, + flags are deduced corresponding to the set of future features found in *globs*. + + Optional argument *optionflags* works as for function :func:`testfile` above. + + +.. _doctest-unittest-api: + +Unittest API +------------ + +As your collection of doctest'ed modules grows, you'll want a way to run all +their doctests systematically. Prior to Python 2.4, :mod:`doctest` had a barely +documented :class:`Tester` class that supplied a rudimentary way to combine +doctests from multiple modules. :class:`Tester` was feeble, and in practice most +serious Python testing frameworks build on the :mod:`unittest` module, which +supplies many flexible ways to combine tests from multiple sources. 
So, in +Python 2.4, :mod:`doctest`'s :class:`Tester` class is deprecated, and +:mod:`doctest` provides two functions that can be used to create :mod:`unittest` +test suites from modules and text files containing doctests. These test suites +can then be run using :mod:`unittest` test runners:: + + import unittest + import doctest + import my_module_with_doctests, and_another + + suite = unittest.TestSuite() + for mod in my_module_with_doctests, and_another: + suite.addTest(doctest.DocTestSuite(mod)) + runner = unittest.TextTestRunner() + runner.run(suite) + +There are two main functions for creating :class:`unittest.TestSuite` instances +from text files and modules with doctests: + + +.. function:: DocFileSuite([module_relative][, package][, setUp][, tearDown][, globs][, optionflags][, parser][, encoding]) + + Convert doctest tests from one or more text files to a + :class:`unittest.TestSuite`. + + The returned :class:`unittest.TestSuite` is to be run by the unittest framework + and runs the interactive examples in each file. If an example in any file + fails, then the synthesized unit test fails, and a :exc:`failureException` + exception is raised showing the name of the file containing the test and a + (sometimes approximate) line number. + + Pass one or more paths (as strings) to text files to be examined. + + Options may be provided as keyword arguments: + + Optional argument *module_relative* specifies how the filenames in *paths* + should be interpreted: + + * If *module_relative* is ``True`` (the default), then each filename specifies + an OS-independent module-relative path. By default, this path is relative to + the calling module's directory; but if the *package* argument is specified, then + it is relative to that package. To ensure OS-independence, each filename should + use ``/`` characters to separate path segments, and may not be an absolute path + (i.e., it may not begin with ``/``). 
+ + * If *module_relative* is ``False``, then each filename specifies an OS-specific + path. The path may be absolute or relative; relative paths are resolved with + respect to the current working directory. + + Optional argument *package* is a Python package or the name of a Python package + whose directory should be used as the base directory for module-relative + filenames. If no package is specified, then the calling module's directory is + used as the base directory for module-relative filenames. It is an error to + specify *package* if *module_relative* is ``False``. + + Optional argument *setUp* specifies a set-up function for the test suite. This + is called before running the tests in each file. The *setUp* function will be + passed a :class:`DocTest` object. The *setUp* function can access the test + globals as the *globs* attribute of the test passed. + + Optional argument *tearDown* specifies a tear-down function for the test suite. + This is called after running the tests in each file. The *tearDown* function + will be passed a :class:`DocTest` object. The *tearDown* function can access the + test globals as the *globs* attribute of the test passed. + + Optional argument *globs* is a dictionary containing the initial global + variables for the tests. A new copy of this dictionary is created for each + test. By default, *globs* is a new empty dictionary. + + Optional argument *optionflags* specifies the default doctest options for the + tests, created by or-ing together individual option flags. See section + :ref:`doctest-options`. See function :func:`set_unittest_reportflags` below for + a better way to set reporting options. + + Optional argument *parser* specifies a :class:`DocTestParser` (or subclass) that + should be used to extract tests from the files. It defaults to a normal parser + (i.e., ``DocTestParser()``). + + Optional argument *encoding* specifies an encoding that should be used to + convert the file to unicode. + + .. versionadded:: 2.4 + + .. 
versionchanged:: 2.5 + The global ``__file__`` was added to the globals provided to doctests loaded + from a text file using :func:`DocFileSuite`. + + .. versionchanged:: 2.5 + The parameter *encoding* was added. + + +.. function:: DocTestSuite([module][, globs][, extraglobs][, test_finder][, setUp][, tearDown][, checker]) + + Convert doctest tests for a module to a :class:`unittest.TestSuite`. + + The returned :class:`unittest.TestSuite` is to be run by the unittest framework + and runs each doctest in the module. If any of the doctests fail, then the + synthesized unit test fails, and a :exc:`failureException` exception is raised + showing the name of the file containing the test and a (sometimes approximate) + line number. + + Optional argument *module* provides the module to be tested. It can be a module + object or a (possibly dotted) module name. If not specified, the module calling + this function is used. + + Optional argument *globs* is a dictionary containing the initial global + variables for the tests. A new copy of this dictionary is created for each + test. By default, *globs* is a new empty dictionary. + + Optional argument *extraglobs* specifies an extra set of global variables, which + is merged into *globs*. By default, no extra globals are used. + + Optional argument *test_finder* is the :class:`DocTestFinder` object (or a + drop-in replacement) that is used to extract doctests from the module. + + Optional arguments *setUp*, *tearDown*, and *optionflags* are the same as for + function :func:`DocFileSuite` above. + + .. versionadded:: 2.3 + + .. versionchanged:: 2.4 + The parameters *globs*, *extraglobs*, *test_finder*, *setUp*, *tearDown*, and + *optionflags* were added; this function now uses the same search technique as + :func:`testmod`. + +Under the covers, :func:`DocTestSuite` creates a :class:`unittest.TestSuite` out +of :class:`doctest.DocTestCase` instances, and :class:`DocTestCase` is a +subclass of :class:`unittest.TestCase`. 
:class:`DocTestCase` isn't documented +here (it's an internal detail), but studying its code can answer questions about +the exact details of :mod:`unittest` integration. + +Similarly, :func:`DocFileSuite` creates a :class:`unittest.TestSuite` out of +:class:`doctest.DocFileCase` instances, and :class:`DocFileCase` is a subclass +of :class:`DocTestCase`. + +So both ways of creating a :class:`unittest.TestSuite` run instances of +:class:`DocTestCase`. This is important for a subtle reason: when you run +:mod:`doctest` functions yourself, you can control the :mod:`doctest` options in +use directly, by passing option flags to :mod:`doctest` functions. However, if +you're writing a :mod:`unittest` framework, :mod:`unittest` ultimately controls +when and how tests get run. The framework author typically wants to control +:mod:`doctest` reporting options (perhaps, e.g., specified by command line +options), but there's no way to pass options through :mod:`unittest` to +:mod:`doctest` test runners. + +For this reason, :mod:`doctest` also supports a notion of :mod:`doctest` +reporting flags specific to :mod:`unittest` support, via this function: + + +.. function:: set_unittest_reportflags(flags) + + Set the :mod:`doctest` reporting flags to use. + + Argument *flags* or's together option flags. See section + :ref:`doctest-options`. Only "reporting flags" can be used. + + This is a module-global setting, and affects all future doctests run by module + :mod:`unittest`: the :meth:`runTest` method of :class:`DocTestCase` looks at + the option flags specified for the test case when the :class:`DocTestCase` + instance was constructed. If no reporting flags were specified (which is the + typical and expected case), :mod:`doctest`'s :mod:`unittest` reporting flags are + or'ed into the option flags, and the option flags so augmented are passed to the + :class:`DocTestRunner` instance created to run the doctest. 
If any reporting + flags were specified when the :class:`DocTestCase` instance was constructed, + :mod:`doctest`'s :mod:`unittest` reporting flags are ignored. + + The value of the :mod:`unittest` reporting flags in effect before the function + was called is returned by the function. + + .. versionadded:: 2.4 + + +.. _doctest-advanced-api: + +Advanced API +------------ + +The basic API is a simple wrapper that's intended to make doctest easy to use. +It is fairly flexible, and should meet most users' needs; however, if you +require more fine-grained control over testing, or wish to extend doctest's +capabilities, then you should use the advanced API. + +The advanced API revolves around two container classes, which are used to store +the interactive examples extracted from doctest cases: + +* :class:`Example`: A single Python statement, paired with its expected output. + +* :class:`DocTest`: A collection of :class:`Example`\ s, typically extracted + from a single docstring or text file. + +Additional processing classes are defined to find, parse, run, and check +doctest examples: + +* :class:`DocTestFinder`: Finds all docstrings in a given module, and uses a + :class:`DocTestParser` to create a :class:`DocTest` from every docstring that + contains interactive examples. + +* :class:`DocTestParser`: Creates a :class:`DocTest` object from a string (such + as an object's docstring). + +* :class:`DocTestRunner`: Executes the examples in a :class:`DocTest`, and uses + an :class:`OutputChecker` to verify their output. + +* :class:`OutputChecker`: Compares the actual output from a doctest example with + the expected output, and decides whether they match. + +The relationships among these processing classes are summarized in the following +diagram:: + + list of: + +------+ +---------+ + |module| --DocTestFinder-> | DocTest | --DocTestRunner-> results + +------+ | ^ +---------+ | ^ (printed) + | | | Example | | | + v | | ... 
| v | + DocTestParser | Example | OutputChecker + +---------+ + + +.. _doctest-doctest: + +DocTest Objects +^^^^^^^^^^^^^^^ + + +.. class:: DocTest(examples, globs, name, filename, lineno, docstring) + + A collection of doctest examples that should be run in a single namespace. The + constructor arguments are used to initialize the member variables of the same + names. + + .. versionadded:: 2.4 + +:class:`DocTest` defines the following member variables. They are initialized +by the constructor, and should not be modified directly. + + +.. attribute:: DocTest.examples + + A list of :class:`Example` objects encoding the individual interactive Python + examples that should be run by this test. + + +.. attribute:: DocTest.globs + + The namespace (aka globals) that the examples should be run in. This is a + dictionary mapping names to values. Any changes to the namespace made by the + examples (such as binding new variables) will be reflected in :attr:`globs` + after the test is run. + + +.. attribute:: DocTest.name + + A string name identifying the :class:`DocTest`. Typically, this is the name of + the object or file that the test was extracted from. + + +.. attribute:: DocTest.filename + + The name of the file that this :class:`DocTest` was extracted from; or ``None`` + if the filename is unknown, or if the :class:`DocTest` was not extracted from a + file. + + +.. attribute:: DocTest.lineno + + The line number within :attr:`filename` where this :class:`DocTest` begins, or + ``None`` if the line number is unavailable. This line number is zero-based with + respect to the beginning of the file. + + +.. attribute:: DocTest.docstring + + The string that the test was extracted from, or ``None`` if the string is + unavailable, or if the test was not extracted from a string. + + +.. _doctest-example: + +Example Objects +^^^^^^^^^^^^^^^ + + +.. 
class:: Example(source, want[, exc_msg][, lineno][, indent][, options]) + + A single interactive example, consisting of a Python statement and its expected + output. The constructor arguments are used to initialize the member variables + of the same names. + + .. versionadded:: 2.4 + +:class:`Example` defines the following member variables. They are initialized +by the constructor, and should not be modified directly. + + +.. attribute:: Example.source + + A string containing the example's source code. This source code consists of a + single Python statement, and always ends with a newline; the constructor adds a + newline when necessary. + + +.. attribute:: Example.want + + The expected output from running the example's source code (either from stdout, + or a traceback in case of exception). :attr:`want` ends with a newline unless + no output is expected, in which case it's an empty string. The constructor adds + a newline when necessary. + + +.. attribute:: Example.exc_msg + + The exception message generated by the example, if the example is expected to + generate an exception; or ``None`` if it is not expected to generate an + exception. This exception message is compared against the return value of + :func:`traceback.format_exception_only`. :attr:`exc_msg` ends with a newline + unless it's ``None``. The constructor adds a newline if needed. + + +.. attribute:: Example.lineno + + The line number within the string containing this example where the example + begins. This line number is zero-based with respect to the beginning of the + containing string. + + +.. attribute:: Example.indent + + The example's indentation in the containing string, i.e., the number of space + characters that precede the example's first prompt. + + +.. attribute:: Example.options + + A dictionary mapping from option flags to ``True`` or ``False``, which is used + to override default options for this example. 
Any option flags not contained in + this dictionary are left at their default value (as specified by the + :class:`DocTestRunner`'s :attr:`optionflags`). By default, no options are set. + + +.. _doctest-doctestfinder: + +DocTestFinder objects +^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: DocTestFinder([verbose][, parser][, recurse][, exclude_empty]) + + A processing class used to extract the :class:`DocTest`\ s that are relevant to + a given object, from its docstring and the docstrings of its contained objects. + :class:`DocTest`\ s can currently be extracted from the following object types: + modules, functions, classes, methods, staticmethods, classmethods, and + properties. + + The optional argument *verbose* can be used to display the objects searched by + the finder. It defaults to ``False`` (no output). + + The optional argument *parser* specifies the :class:`DocTestParser` object (or a + drop-in replacement) that is used to extract doctests from docstrings. + + If the optional argument *recurse* is false, then :meth:`DocTestFinder.find` + will only examine the given object, and not any contained objects. + + If the optional argument *exclude_empty* is false, then + :meth:`DocTestFinder.find` will include tests for objects with empty docstrings. + + .. versionadded:: 2.4 + +:class:`DocTestFinder` defines the following method: + + +.. method:: DocTestFinder.find(obj[, name][, module][, globs][, extraglobs]) + + Return a list of the :class:`DocTest`\ s that are defined by *obj*'s docstring, + or by any of its contained objects' docstrings. + + The optional argument *name* specifies the object's name; this name will be used + to construct names for the returned :class:`DocTest`\ s. If *name* is not + specified, then ``obj.__name__`` is used. + + The optional parameter *module* is the module that contains the given object. + If the module is not specified or is ``None``, then the test finder will attempt to + automatically determine the correct module. 
The object's module is used: + + * As a default namespace, if *globs* is not specified. + + * To prevent the DocTestFinder from extracting DocTests from objects that are + imported from other modules. (Contained objects with modules other than + *module* are ignored.) + + * To find the name of the file containing the object. + + * To help find the line number of the object within its file. + + If *module* is ``False``, no attempt to find the module will be made. This is + obscure, of use mostly in testing doctest itself: if *module* is ``False``, or + is ``None`` but cannot be found automatically, then all objects are considered + to belong to the (non-existent) module, so all contained objects will + (recursively) be searched for doctests. + + The globals for each :class:`DocTest` are formed by combining *globs* and + *extraglobs* (bindings in *extraglobs* override bindings in *globs*). A new + shallow copy of the globals dictionary is created for each :class:`DocTest`. If + *globs* is not specified, then it defaults to the module's *__dict__*, if + specified, or ``{}`` otherwise. If *extraglobs* is not specified, then it + defaults to ``{}``. + + +.. _doctest-doctestparser: + +DocTestParser objects +^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: DocTestParser() + + A processing class used to extract interactive examples from a string, and use + them to create a :class:`DocTest` object. + + .. versionadded:: 2.4 + +:class:`DocTestParser` defines the following methods: + + +.. method:: DocTestParser.get_doctest(string, globs, name, filename, lineno) + + Extract all doctest examples from the given string, and collect them into a + :class:`DocTest` object. + + *globs*, *name*, *filename*, and *lineno* are attributes for the new + :class:`DocTest` object. See the documentation for :class:`DocTest` for more + information. + + +.. 
method:: DocTestParser.get_examples(string[, name]) + + Extract all doctest examples from the given string, and return them as a list of + :class:`Example` objects. Line numbers are 0-based. The optional argument + *name* is a name identifying this string, and is only used for error messages. + + +.. method:: DocTestParser.parse(string[, name]) + + Divide the given string into examples and intervening text, and return them as a + list of alternating :class:`Example`\ s and strings. Line numbers for the + :class:`Example`\ s are 0-based. The optional argument *name* is a name + identifying this string, and is only used for error messages. + + +.. _doctest-doctestrunner: + +DocTestRunner objects +^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: DocTestRunner([checker][, verbose][, optionflags]) + + A processing class used to execute and verify the interactive examples in a + :class:`DocTest`. + + The comparison between expected outputs and actual outputs is done by an + :class:`OutputChecker`. This comparison may be customized with a number of + option flags; see section :ref:`doctest-options` for more information. If the + option flags are insufficient, then the comparison may also be customized by + passing an instance of an :class:`OutputChecker` subclass to the constructor. + + The test runner's display output can be controlled in two ways. First, an output + function can be passed to :meth:`DocTestRunner.run`; this function will be called + with strings that should be displayed. It defaults to ``sys.stdout.write``. + Second, if capturing the output is not sufficient, the display output can also be + customized by subclassing :class:`DocTestRunner` and overriding the methods + :meth:`report_start`, :meth:`report_success`, + :meth:`report_unexpected_exception`, and :meth:`report_failure`. + + The optional keyword argument *checker* specifies the :class:`OutputChecker` + object (or drop-in replacement) that should be used to compare the expected + outputs to the actual outputs of doctest examples. 
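
The *checker* argument described above can be illustrated with a minimal sketch. The class name ``CaseInsensitiveChecker``, the test name ``"demo"``, and the example text are made up for illustration; only :class:`OutputChecker`, :class:`DocTestParser`, and :class:`DocTestRunner` come from :mod:`doctest`.

```python
import doctest


class CaseInsensitiveChecker(doctest.OutputChecker):
    """Hypothetical checker: also accept output that differs only in letter case."""

    def check_output(self, want, got, optionflags):
        # Try the standard comparison first, honoring any option flags.
        if doctest.OutputChecker.check_output(self, want, got, optionflags):
            return True
        # Fall back to a case-insensitive comparison.
        return want.lower() == got.lower()


# Build a DocTest by hand; the name "demo" and the example text are made up.
parser = doctest.DocTestParser()
test = parser.get_doctest(">>> print('Hello')\nHELLO\n", {}, "demo", None, 0)

# Pass a checker *instance* to the runner; verbose=False reports failures only.
runner = doctest.DocTestRunner(checker=CaseInsensitiveChecker(), verbose=False)
failed, attempted = runner.run(test)
```

With the default checker the same example would be reported as a failure, because ``Hello`` and ``HELLO`` are not identical and no option flag relaxes letter case.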
+ + The optional keyword argument *verbose* controls the :class:`DocTestRunner`'s + verbosity. If *verbose* is ``True``, then information is printed about each + example, as it is run. If *verbose* is ``False``, then only failures are + printed. If *verbose* is unspecified, or ``None``, then verbose output is used + iff the command-line switch :option:`-v` is used. + + The optional keyword argument *optionflags* can be used to control how the test + runner compares expected output to actual output, and how it displays failures. + For more information, see section :ref:`doctest-options`. + + .. versionadded:: 2.4 + +:class:`DocTestRunner` defines the following methods: + + +.. method:: DocTestRunner.report_start(out, test, example) + + Report that the test runner is about to process the given example. This method + is provided to allow subclasses of :class:`DocTestRunner` to customize their + output; it should not be called directly. + + *example* is the example about to be processed. *test* is the test containing + *example*. *out* is the output function that was passed to + :meth:`DocTestRunner.run`. + + +.. method:: DocTestRunner.report_success(out, test, example, got) + + Report that the given example ran successfully. This method is provided to + allow subclasses of :class:`DocTestRunner` to customize their output; it should + not be called directly. + + *example* is the example about to be processed. *got* is the actual output from + the example. *test* is the test containing *example*. *out* is the output + function that was passed to :meth:`DocTestRunner.run`. + + +.. method:: DocTestRunner.report_failure(out, test, example, got) + + Report that the given example failed. This method is provided to allow + subclasses of :class:`DocTestRunner` to customize their output; it should not be + called directly. + + *example* is the example about to be processed. *got* is the actual output from + the example. *test* is the test containing *example*. 
*out* is the output + function that was passed to :meth:`DocTestRunner.run`. + + +.. method:: DocTestRunner.report_unexpected_exception(out, test, example, exc_info) + + Report that the given example raised an unexpected exception. This method is + provided to allow subclasses of :class:`DocTestRunner` to customize their + output; it should not be called directly. + + *example* is the example about to be processed. *exc_info* is a tuple containing + information about the unexpected exception (as returned by + :func:`sys.exc_info`). *test* is the test containing *example*. *out* is the + output function that was passed to :meth:`DocTestRunner.run`. + + +.. method:: DocTestRunner.run(test[, compileflags][, out][, clear_globs]) + + Run the examples in *test* (a :class:`DocTest` object), and display the results + using the writer function *out*. + + The examples are run in the namespace ``test.globs``. If *clear_globs* is true + (the default), then this namespace will be cleared after the test runs, to help + with garbage collection. If you would like to examine the namespace after the + test completes, then use *clear_globs=False*. + + *compileflags* gives the set of flags that should be used by the Python compiler + when running the examples. If not specified, then it will default to the set of + future-import flags that apply to *globs*. + + The output of each example is checked using the :class:`DocTestRunner`'s output + checker, and the results are formatted by the :meth:`DocTestRunner.report_\*` + methods. + + +.. method:: DocTestRunner.summarize([verbose]) + + Print a summary of all the test cases that have been run by this DocTestRunner, + and return a tuple ``(failure_count, test_count)``. + + The optional *verbose* argument controls how detailed the summary is. If the + verbosity is not specified, then the :class:`DocTestRunner`'s verbosity is used. + + +.. _doctest-outputchecker: + +OutputChecker objects +^^^^^^^^^^^^^^^^^^^^^ + + +.. 
class:: OutputChecker() + + A class used to check whether the actual output from a doctest example + matches the expected output. :class:`OutputChecker` defines two methods: + :meth:`check_output`, which compares a given pair of outputs, and returns true + if they match; and :meth:`output_difference`, which returns a string describing + the differences between two outputs. + + .. versionadded:: 2.4 + +:class:`OutputChecker` defines the following methods: + + +.. method:: OutputChecker.check_output(want, got, optionflags) + + Return ``True`` iff the actual output from an example (*got*) matches the + expected output (*want*). These strings are always considered to match if they + are identical; but depending on what option flags the test runner is using, + several non-exact match types are also possible. See section + :ref:`doctest-options` for more information about option flags. + + +.. method:: OutputChecker.output_difference(example, got, optionflags) + + Return a string describing the differences between the expected output for a + given example (*example*) and the actual output (*got*). *optionflags* is the + set of option flags used to compare *want* and *got*. + + +.. _doctest-debugging: + +Debugging +--------- + +Doctest provides several mechanisms for debugging doctest examples: + +* Several functions convert doctests to executable Python programs, which can be + run under the Python debugger, :mod:`pdb`. + +* The :class:`DebugRunner` class is a subclass of :class:`DocTestRunner` that + raises an exception for the first failing example, containing information about + that example. This information can be used to perform post-mortem debugging on + the example. + +* The :mod:`unittest` cases generated by :func:`DocTestSuite` support the + :meth:`debug` method defined by :class:`unittest.TestCase`. + +* You can add a call to :func:`pdb.set_trace` in a doctest example, and you'll + drop into the Python debugger when that line is executed. 
Then you can inspect + current values of variables, and so on. For example, suppose :file:`a.py` + contains just this module docstring:: + + """ + >>> def f(x): + ... g(x*2) + >>> def g(x): + ... print x+3 + ... import pdb; pdb.set_trace() + >>> f(3) + 9 + """ + + Then an interactive Python session may look like this:: + + >>> import a, doctest + >>> doctest.testmod(a) + --Return-- + > <doctest a[1]>(3)g()->None + -> import pdb; pdb.set_trace() + (Pdb) list + 1 def g(x): + 2 print x+3 + 3 -> import pdb; pdb.set_trace() + [EOF] + (Pdb) print x + 6 + (Pdb) step + --Return-- + > <doctest a[0]>(2)f()->None + -> g(x*2) + (Pdb) list + 1 def f(x): + 2 -> g(x*2) + [EOF] + (Pdb) print x + 3 + (Pdb) step + --Return-- + > <doctest a[2]>(1)?()->None + -> f(3) + (Pdb) cont + (0, 3) + >>> + + .. versionchanged:: 2.4 + The ability to use :func:`pdb.set_trace` usefully inside doctests was added. + +Functions that convert doctests to Python code, and possibly run the synthesized +code under the debugger: + + +.. function:: script_from_examples(s) + + Convert text with examples to a script. + + Argument *s* is a string containing doctest examples. The string is converted + to a Python script, where doctest examples in *s* are converted to regular code, + and everything else is converted to Python comments. The generated script is + returned as a string. For example, :: + + import doctest + print doctest.script_from_examples(r""" + Set x and y to 1 and 2. + >>> x, y = 1, 2 + + Print their sum: + >>> print x+y + 3 + """) + + displays:: + + # Set x and y to 1 and 2. + x, y = 1, 2 + # + # Print their sum: + print x+y + # Expected: + ## 3 + + This function is used internally by other functions (see below), but can also be + useful when you want to transform an interactive Python session into a Python + script. + + .. versionadded:: 2.4 + + +.. function:: testsource(module, name) + + Convert the doctest for an object to a script. 
+ + Argument *module* is a module object, or dotted name of a module, containing the + object whose doctests are of interest. Argument *name* is the name (within the + module) of the object with the doctests of interest. The result is a string, + containing the object's docstring converted to a Python script, as described for + :func:`script_from_examples` above. For example, if module :file:`a.py` + contains a top-level function :func:`f`, then :: + + import a, doctest + print doctest.testsource(a, "a.f") + + prints a script version of function :func:`f`'s docstring, with doctests + converted to code, and the rest placed in comments. + + .. versionadded:: 2.3 + + +.. function:: debug(module, name[, pm]) + + Debug the doctests for an object. + + The *module* and *name* arguments are the same as for function + :func:`testsource` above. The synthesized Python script for the named object's + docstring is written to a temporary file, and then that file is run under the + control of the Python debugger, :mod:`pdb`. + + A shallow copy of ``module.__dict__`` is used for both local and global + execution context. + + Optional argument *pm* controls whether post-mortem debugging is used. If *pm* + has a true value, the script file is run directly, and the debugger gets + involved only if the script terminates via raising an unhandled exception. If + it does, then post-mortem debugging is invoked, via :func:`pdb.post_mortem`, + passing the traceback object from the unhandled exception. If *pm* is not + specified, or is false, the script is run under the debugger from the start, via + passing an appropriate :func:`exec` call to :func:`pdb.run`. + + .. versionadded:: 2.3 + + .. versionchanged:: 2.4 + The *pm* argument was added. + + +.. function:: debug_src(src[, pm][, globs]) + + Debug the doctests in a string. + + This is like function :func:`debug` above, except that a string containing + doctest examples is specified directly, via the *src* argument. 
+ + Optional argument *pm* has the same meaning as in function :func:`debug` above. + + Optional argument *globs* gives a dictionary to use as both local and global + execution context. If not specified, or ``None``, an empty dictionary is used. + If specified, a shallow copy of the dictionary is used. + + .. versionadded:: 2.4 + +The :class:`DebugRunner` class, and the special exceptions it may raise, are of +most interest to testing framework authors, and will only be sketched here. See +the source code, and especially :class:`DebugRunner`'s docstring (which is a +doctest!) for more details: + + +.. class:: DebugRunner([checker][, verbose][, optionflags]) + + A subclass of :class:`DocTestRunner` that raises an exception as soon as a + failure is encountered. If an unexpected exception occurs, an + :exc:`UnexpectedException` exception is raised, containing the test, the + example, and the original exception. If the output doesn't match, then a + :exc:`DocTestFailure` exception is raised, containing the test, the example, and + the actual output. + + For information about the constructor parameters and methods, see the + documentation for :class:`DocTestRunner` in section :ref:`doctest-advanced-api`. + +There are two exceptions that may be raised by :class:`DebugRunner` instances: + + +.. exception:: DocTestFailure(test, example, got) + + An exception thrown by :class:`DocTestRunner` to signal that a doctest example's + actual output did not match its expected output. The constructor arguments are + used to initialize the member variables of the same names. + +:exc:`DocTestFailure` defines the following member variables: + + +.. attribute:: DocTestFailure.test + + The :class:`DocTest` object that was being run when the example failed. + + +.. attribute:: DocTestFailure.example + + The :class:`Example` that failed. + + +.. attribute:: DocTestFailure.got + + The example's actual output. + + +.. 
exception:: UnexpectedException(test, example, exc_info) + + An exception thrown by :class:`DocTestRunner` to signal that a doctest example + raised an unexpected exception. The constructor arguments are used to + initialize the member variables of the same names. + +:exc:`UnexpectedException` defines the following member variables: + + +.. attribute:: UnexpectedException.test + + The :class:`DocTest` object that was being run when the example failed. + + +.. attribute:: UnexpectedException.example + + The :class:`Example` that failed. + + +.. attribute:: UnexpectedException.exc_info + + A tuple containing information about the unexpected exception, as returned by + :func:`sys.exc_info`. + + +.. _doctest-soapbox: + +Soapbox +------- + +As mentioned in the introduction, :mod:`doctest` has grown to have three primary +uses: + +#. Checking examples in docstrings. + +#. Regression testing. + +#. Executable documentation / literate testing. + +These uses have different requirements, and it is important to distinguish them. +In particular, filling your docstrings with obscure test cases makes for bad +documentation. + +When writing a docstring, choose docstring examples with care. There's an art to +this that needs to be learned---it may not be natural at first. Examples should +add genuine value to the documentation. A good example can often be worth many +words. If done with care, the examples will be invaluable for your users, and +will pay back the time it takes to collect them many times over as the years go +by and things change. I'm still amazed at how often one of my :mod:`doctest` +examples stops working after a "harmless" change. + +Doctest also makes an excellent tool for regression testing, especially if you +don't skimp on explanatory text. By interleaving prose and examples, it becomes +much easier to keep track of what's actually being tested, and why. 
When a test +fails, good prose can make it much easier to figure out what the problem is, and +how it should be fixed. It's true that you could write extensive comments in +code-based testing, but few programmers do. Many have found that using doctest +approaches instead leads to much clearer tests. Perhaps this is simply because +doctest makes writing prose a little easier than writing code, while writing +comments in code is a little harder. I think it goes deeper than just that: +the natural attitude when writing a doctest-based test is that you want to +explain the fine points of your software, and illustrate them with examples. +This in turn naturally leads to test files that start with the simplest +features, and logically progress to complications and edge cases. A coherent +narrative is the result, instead of a collection of isolated functions that test +isolated bits of functionality seemingly at random. It's a different attitude, +and produces different results, blurring the distinction between testing and +explaining. + +Regression testing is best confined to dedicated objects or files. There are +several options for organizing tests: + +* Write text files containing test cases as interactive examples, and test the + files using :func:`testfile` or :func:`DocFileSuite`. This is recommended, + although it is easiest to do for new projects, designed from the start to use + doctest. + +* Define functions named ``_regrtest_topic`` that consist of single docstrings, + containing test cases for the named topics. These functions can be included in + the same file as the module, or placed in a separate test file. + +* Define a ``__test__`` dictionary mapping from regression test topics to + docstrings containing test cases. + +.. rubric:: Footnotes + +.. [#] Examples containing both expected output and an exception are not supported. + Trying to guess where one ends and the other begins is too error-prone, and that + also makes for a confusing test. 
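
The ``__test__`` option listed in the Soapbox section can be sketched as follows. This is only an illustration: the module name ``demo_regrtests`` and the topic names are hypothetical, and the module is built in memory so the sketch is self-contained, whereas in practice ``__test__`` would normally be defined in a real module file.

```python
import doctest
import types

# Hypothetical module built in memory; normally __test__ lives in a .py file.
demo = types.ModuleType("demo_regrtests")
demo.__test__ = {
    # Each key is a regression test topic, each value a docstring of examples.
    "addition": """
        >>> 1 + 1
        2
    """,
    "string_repeat": """
        >>> "ab" * 2
        'abab'
    """,
}

# The finder turns every __test__ entry into its own DocTest object.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(demo):
    runner.run(test)

failed, attempted = runner.summarize(verbose=False)
```

Keeping one topic per key means a failure report names the topic that regressed, which is the main reason to prefer this layout over one large docstring.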
+ diff --git a/Doc/library/docxmlrpcserver.rst b/Doc/library/docxmlrpcserver.rst new file mode 100644 index 0000000..958ea95 --- /dev/null +++ b/Doc/library/docxmlrpcserver.rst @@ -0,0 +1,97 @@ + +:mod:`DocXMLRPCServer` --- Self-documenting XML-RPC server +========================================================== + +.. module:: DocXMLRPCServer + :synopsis: Self-documenting XML-RPC server implementation. +.. moduleauthor:: Brian Quinlan <brianq@activestate.com> +.. sectionauthor:: Brian Quinlan <brianq@activestate.com> + + +.. versionadded:: 2.3 + +The :mod:`DocXMLRPCServer` module extends the classes found in +:mod:`SimpleXMLRPCServer` to serve HTML documentation in response to HTTP GET +requests. Servers can either be free-standing, using :class:`DocXMLRPCServer`, +or embedded in a CGI environment, using :class:`DocCGIXMLRPCRequestHandler`. + + +.. class:: DocXMLRPCServer(addr[, requestHandler[, logRequests[, allow_none[, encoding[, bind_and_activate]]]]]) + + Create a new server instance. All parameters have the same meaning as for + :class:`SimpleXMLRPCServer.SimpleXMLRPCServer`; *requestHandler* defaults to + :class:`DocXMLRPCRequestHandler`. + + +.. class:: DocCGIXMLRPCRequestHandler() + + Create a new instance to handle XML-RPC requests in a CGI environment. + + +.. class:: DocXMLRPCRequestHandler() + + Create a new request handler instance. This request handler supports XML-RPC + POST requests and documentation GET requests, and modifies logging so that the + *logRequests* parameter to the :class:`DocXMLRPCServer` constructor is + honored. + + +.. _doc-xmlrpc-servers: + +DocXMLRPCServer Objects +----------------------- + +The :class:`DocXMLRPCServer` class is derived from +:class:`SimpleXMLRPCServer.SimpleXMLRPCServer` and provides a means of creating +self-documenting, stand-alone XML-RPC servers. HTTP POST requests are handled as +XML-RPC method calls. HTTP GET requests are handled by generating pydoc-style +HTML documentation. 
This allows a server to provide its own web-based +documentation. + + +.. method:: DocXMLRPCServer.set_server_title(server_title) + + Set the title used in the generated HTML documentation. This title will be used + inside the HTML "title" element. + + +.. method:: DocXMLRPCServer.set_server_name(server_name) + + Set the name used in the generated HTML documentation. This name will appear at + the top of the generated documentation inside an "h1" element. + + +.. method:: DocXMLRPCServer.set_server_documentation(server_documentation) + + Set the description used in the generated HTML documentation. This description + will appear as a paragraph, below the server name, in the documentation. + + +DocCGIXMLRPCRequestHandler +-------------------------- + +The :class:`DocCGIXMLRPCRequestHandler` class is derived from +:class:`SimpleXMLRPCServer.CGIXMLRPCRequestHandler` and provides a means of +creating self-documenting XML-RPC CGI scripts. HTTP POST requests are handled +as XML-RPC method calls. HTTP GET requests are handled by generating pydoc-style +HTML documentation. This allows a server to provide its own web-based +documentation. + + +.. method:: DocCGIXMLRPCRequestHandler.set_server_title(server_title) + + Set the title used in the generated HTML documentation. This title will be used + inside the HTML "title" element. + + +.. method:: DocCGIXMLRPCRequestHandler.set_server_name(server_name) + + Set the name used in the generated HTML documentation. This name will appear at + the top of the generated documentation inside an "h1" element. + + +.. method:: DocCGIXMLRPCRequestHandler.set_server_documentation(server_documentation) + + Set the description used in the generated HTML documentation. This description + will appear as a paragraph, below the server name, in the documentation. 
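
The three setter methods can be tried without starting a server, as in the rough sketch below. It makes several assumptions: the function ``add`` and all the strings are invented for illustration, the fallback import reflects the module's location on Python 3 (``xmlrpc.server``), and ``generate_html_documentation()`` is a method present in the implementation that is not described in this section.

```python
try:
    # Module name used in this documentation.
    from DocXMLRPCServer import DocCGIXMLRPCRequestHandler
except ImportError:
    # Assumption: on Python 3 the same class lives in xmlrpc.server.
    from xmlrpc.server import DocCGIXMLRPCRequestHandler


def add(x, y):
    """Return the sum of x and y."""
    return x + y


handler = DocCGIXMLRPCRequestHandler()
handler.set_server_title("Adder demo")        # goes into the HTML "title" element
handler.set_server_name("Adder service")      # heading at the top of the page
handler.set_server_documentation("A made-up service that adds two numbers.")
handler.register_function(add)

# Build the same pydoc-style page that an HTTP GET request would receive.
page = handler.generate_html_documentation()
```

In a real CGI script the handler's ``handle_request()`` method would be called instead, so that POST requests are dispatched as XML-RPC calls and GET requests return the generated page.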
+ diff --git a/Doc/library/dumbdbm.rst b/Doc/library/dumbdbm.rst new file mode 100644 index 0000000..3db9fda --- /dev/null +++ b/Doc/library/dumbdbm.rst @@ -0,0 +1,81 @@ + +:mod:`dumbdbm` --- Portable DBM implementation +============================================== + +.. module:: dumbdbm + :synopsis: Portable implementation of the simple DBM interface. + + +.. index:: single: databases + +.. note:: + + The :mod:`dumbdbm` module is intended as a last resort fallback for the + :mod:`anydbm` module when no more robust module is available. The :mod:`dumbdbm` + module is not written for speed and is not nearly as heavily used as the other + database modules. + +The :mod:`dumbdbm` module provides a persistent dictionary-like interface which +is written entirely in Python. Unlike other modules such as :mod:`gdbm` and +:mod:`bsddb`, no external library is required. As with other persistent +mappings, the keys and values must always be strings. + +The module defines the following: + + +.. exception:: error + + Raised on dumbdbm-specific errors, such as I/O errors. :exc:`KeyError` is + raised for general mapping errors like specifying an incorrect key. + + +.. function:: open(filename[, flag[, mode]]) + + Open a dumbdbm database and return a dumbdbm object. The *filename* argument is + the basename of the database file (without any specific extensions). When a + dumbdbm database is created, files with :file:`.dat` and :file:`.dir` extensions + are created. + + The optional *flag* argument is currently ignored; the database is always opened + for update, and will be created if it does not exist. + + The optional *mode* argument is the Unix mode of the file, used only when the + database has to be created. It defaults to octal ``0666`` (and will be modified + by the prevailing umask). + + .. versionchanged:: 2.2 + The *mode* argument was ignored in earlier versions. + + +.. seealso:: + + Module :mod:`anydbm` + Generic interface to ``dbm``\ -style databases. 
+ + Module :mod:`dbm` + Similar interface to the DBM/NDBM library. + + Module :mod:`gdbm` + Similar interface to the GNU GDBM library. + + Module :mod:`shelve` + Persistence module which stores non-string data. + + Module :mod:`whichdb` + Utility module used to determine the type of an existing database. + + +.. _dumbdbm-objects: + +Dumbdbm Objects +--------------- + +In addition to the methods provided by the :class:`UserDict.DictMixin` class, +:class:`dumbdbm` objects provide the following methods. + + +.. method:: dumbdbm.sync() + + Synchronize the on-disk directory and data files. This method is called by the + :meth:`sync` method of :class:`Shelve` objects. + diff --git a/Doc/library/dummy_thread.rst b/Doc/library/dummy_thread.rst new file mode 100644 index 0000000..0b2cb17 --- /dev/null +++ b/Doc/library/dummy_thread.rst @@ -0,0 +1,23 @@ + +:mod:`dummy_thread` --- Drop-in replacement for the :mod:`thread` module +======================================================================== + +.. module:: dummy_thread + :synopsis: Drop-in replacement for the thread module. + + +This module provides a duplicate interface to the :mod:`thread` module. It is +meant to be imported when the :mod:`thread` module is not provided on a +platform. + +Suggested usage is:: + + try: + import thread as _thread + except ImportError: + import dummy_thread as _thread + +Be careful to not use this module where deadlock might occur from a thread +being created that blocks waiting for another thread to be created. This often +occurs with blocking I/O. + diff --git a/Doc/library/dummy_threading.rst b/Doc/library/dummy_threading.rst new file mode 100644 index 0000000..0ffb687 --- /dev/null +++ b/Doc/library/dummy_threading.rst @@ -0,0 +1,23 @@ + +:mod:`dummy_threading` --- Drop-in replacement for the :mod:`threading` module +============================================================================== + +.. 
module:: dummy_threading + :synopsis: Drop-in replacement for the threading module. + + +This module provides a duplicate interface to the :mod:`threading` module. It +is meant to be imported when the :mod:`thread` module is not provided on a +platform. + +Suggested usage is:: + + try: + import threading as _threading + except ImportError: + import dummy_threading as _threading + +Be careful to not use this module where deadlock might occur from a thread +being created that blocks waiting for another thread to be created. This often +occurs with blocking I/O. + diff --git a/Doc/library/easydialogs.rst b/Doc/library/easydialogs.rst new file mode 100644 index 0000000..50b312f --- /dev/null +++ b/Doc/library/easydialogs.rst @@ -0,0 +1,207 @@ + +:mod:`EasyDialogs` --- Basic Macintosh dialogs +============================================== + +.. module:: EasyDialogs + :platform: Mac + :synopsis: Basic Macintosh dialogs. + + +The :mod:`EasyDialogs` module contains some simple dialogs for the Macintosh. +All routines take an optional resource ID parameter *id* with which one can +override the :const:`DLOG` resource used for the dialog, provided that the +dialog items correspond (both type and item number) to those in the default +:const:`DLOG` resource. See source code for details. + +The :mod:`EasyDialogs` module defines the following functions: + + +.. function:: Message(str[, id[, ok]]) + + Displays a modal dialog with the message text *str*, which should be at most 255 + characters long. The button text defaults to "OK", but is set to the string + argument *ok* if the latter is supplied. Control is returned when the user + clicks the "OK" button. + + +.. function:: AskString(prompt[, default[, id[, ok[, cancel]]]]) + + Asks the user to input a string value via a modal dialog. *prompt* is the prompt + message, and the optional *default* supplies the initial value for the string + (otherwise ``""`` is used). 
The text of the "OK" and "Cancel" buttons can be + changed with the *ok* and *cancel* arguments. All strings can be at most 255 + bytes long. :func:`AskString` returns the string entered or :const:`None` in + case the user cancelled. + + +.. function:: AskPassword(prompt[, default[, id[, ok[, cancel]]]]) + + Asks the user to input a string value via a modal dialog. Like + :func:`AskString`, but with the text shown as bullets. The arguments have the + same meaning as for :func:`AskString`. + + +.. function:: AskYesNoCancel(question[, default[, yes[, no[, cancel[, id]]]]]) + + Presents a dialog with prompt *question* and three buttons labelled "Yes", "No", + and "Cancel". Returns ``1`` for "Yes", ``0`` for "No" and ``-1`` for "Cancel". + The value of *default* (or ``0`` if *default* is not supplied) is returned when + the :kbd:`RETURN` key is pressed. The text of the buttons can be changed with + the *yes*, *no*, and *cancel* arguments; to prevent a button from appearing, + supply ``""`` for the corresponding argument. + + +.. function:: ProgressBar([title[, maxval[, label[, id]]]]) + + Displays a modeless progress-bar dialog. This is the constructor for the + :class:`ProgressBar` class described below. *title* is the text string displayed + (default "Working..."), *maxval* is the value at which progress is complete + (default ``0``, indicating that an indeterminate amount of work remains to be + done), and *label* is the text that is displayed above the progress bar itself. + + +.. function:: GetArgv([optionlist[, commandlist[, addoldfile[, addnewfile[, addfolder[, id]]]]]]) + + Displays a dialog which aids the user in constructing a command-line argument + list. Returns the list in ``sys.argv`` format, suitable for passing as an + argument to :func:`getopt.getopt`. *addoldfile*, *addnewfile*, and *addfolder* + are boolean arguments. 
When nonzero, they enable the user to insert into the + command line paths to an existing file, a (possibly) not-yet-existent file, and + a folder, respectively. (Note: Option arguments must appear in the command line + before file and folder arguments in order to be recognized by + :func:`getopt.getopt`.) Arguments containing spaces can be specified by + enclosing them within single or double quotes. A :exc:`SystemExit` exception is + raised if the user presses the "Cancel" button. + + *optionlist* is a list that determines a popup menu from which the allowed + options are selected. Its items can take one of two forms: *optstr* or + ``(optstr, descr)``. When present, *descr* is a short descriptive string that + is displayed in the dialog while this option is selected in the popup menu. The + correspondence between *optstr*\s and command-line arguments is: + + +----------------------+------------------------------------------+ + | *optstr* format | Command-line format | + +======================+==========================================+ + | ``x`` | :option:`-x` (short option) | + +----------------------+------------------------------------------+ + | ``x:`` or ``x=`` | :option:`-x` (short option with value) | + +----------------------+------------------------------------------+ + | ``xyz`` | :option:`--xyz` (long option) | + +----------------------+------------------------------------------+ + | ``xyz:`` or ``xyz=`` | :option:`--xyz` (long option with value) | + +----------------------+------------------------------------------+ + + *commandlist* is a list of items of the form *cmdstr* or ``(cmdstr, descr)``, + where *descr* is as above. The *cmdstr*\s will appear in a popup menu. When + chosen, the text of *cmdstr* will be appended to the command line as is, except + that a trailing ``':'`` or ``'='`` (if present) will be trimmed off. + + .. versionadded:: 2.0 + + +.. 
function:: AskFileForOpen( [message] [, typeList] [, defaultLocation] [, defaultOptionFlags] [, location] [, clientName] [, windowTitle] [, actionButtonLabel] [, cancelButtonLabel] [, preferenceKey] [, popupExtension] [, eventProc] [, previewProc] [, filterProc] [, wanted] ) + + Post a dialog asking the user for a file to open, and return the file selected + or :const:`None` if the user cancelled. *message* is a text message to display, + *typeList* is a list of 4-char filetypes allowable, *defaultLocation* is the + pathname, :class:`FSSpec` or :class:`FSRef` of the folder to show initially, + *location* is the ``(x, y)`` position on the screen where the dialog is shown, + *actionButtonLabel* is a string to show instead of "Open" in the OK button, + *cancelButtonLabel* is a string to show instead of "Cancel" in the cancel + button, *wanted* is the type of value wanted as a return: :class:`str`, + :class:`unicode`, :class:`FSSpec`, :class:`FSRef` and subtypes thereof are + acceptable. + + .. index:: single: Navigation Services + + For a description of the other arguments please see the Apple Navigation + Services documentation and the :mod:`EasyDialogs` source code. + + +.. function:: AskFileForSave( [message] [, savedFileName] [, defaultLocation] [, defaultOptionFlags] [, location] [, clientName] [, windowTitle] [, actionButtonLabel] [, cancelButtonLabel] [, preferenceKey] [, popupExtension] [, fileType] [, fileCreator] [, eventProc] [, wanted] ) + + Post a dialog asking the user for a file to save to, and return the file + selected or :const:`None` if the user cancelled. *savedFileName* is the default + for the file name to save to (the return value). See :func:`AskFileForOpen` for + a description of the other arguments. + + +.. 
function:: AskFolder( [message] [, defaultLocation] [, defaultOptionFlags] [, location] [, clientName] [, windowTitle] [, actionButtonLabel] [, cancelButtonLabel] [, preferenceKey] [, popupExtension] [, eventProc] [, filterProc] [, wanted] ) + + Post a dialog asking the user to select a folder, and return the folder selected + or :const:`None` if the user cancelled. See :func:`AskFileForOpen` for a + description of the arguments. + + +.. seealso:: + + `Navigation Services Reference <http://developer.apple.com/documentation/Carbon/Reference/Navigation_Services_Ref/>`_ + Programmer's reference documentation for the Navigation Services, a part of the + Carbon framework. + + +.. _progressbar-objects: + +ProgressBar Objects +------------------- + +:class:`ProgressBar` objects provide support for modeless progress-bar dialogs. +Both determinate (thermometer style) and indeterminate (barber-pole style) +progress bars are supported. The bar will be determinate if its maximum value +is greater than zero; otherwise it will be indeterminate. + +.. versionchanged:: 2.2 + Support for indeterminate-style progress bars was added. + +The dialog is displayed immediately after creation. If the dialog's "Cancel" +button is pressed, or if :kbd:`Cmd-.` or :kbd:`ESC` is typed, the dialog window +is hidden and :exc:`KeyboardInterrupt` is raised (but note that this response +does not occur until the progress bar is next updated, typically via a call to +:meth:`inc` or :meth:`set`). Otherwise, the bar remains visible until the +:class:`ProgressBar` object is discarded. + +:class:`ProgressBar` objects possess the following attributes and methods: + + +.. attribute:: ProgressBar.curval + + The current value (of type integer or long integer) of the progress bar. The + normal access methods coerce :attr:`curval` between ``0`` and :attr:`maxval`. + This attribute should not be altered directly. + + +.. 
attribute:: ProgressBar.maxval + + The maximum value (of type integer or long integer) of the progress bar; the + progress bar (thermometer style) is full when :attr:`curval` equals + :attr:`maxval`. If :attr:`maxval` is ``0``, the bar will be indeterminate + (barber-pole). This attribute should not be altered directly. + + +.. method:: ProgressBar.title([newstr]) + + Sets the text in the title bar of the progress dialog to *newstr*. + + +.. method:: ProgressBar.label([newstr]) + + Sets the text in the progress box of the progress dialog to *newstr*. + + +.. method:: ProgressBar.set(value[, max]) + + Sets the progress bar's :attr:`curval` to *value*, and also :attr:`maxval` to + *max* if the latter is provided. *value* is first coerced between 0 and + :attr:`maxval`. The thermometer bar is updated to reflect the changes, + including a change from indeterminate to determinate or vice versa. + + +.. method:: ProgressBar.inc([n]) + + Increments the progress bar's :attr:`curval` by *n*, or by ``1`` if *n* is not + provided. (Note that *n* may be negative, in which case the effect is a + decrement.) The progress bar is updated to reflect the change. If the bar is + indeterminate, this causes one "spin" of the barber pole. The resulting + :attr:`curval` is coerced between 0 and :attr:`maxval` if incrementing causes it + to fall outside this range. + diff --git a/Doc/library/email-examples.rst b/Doc/library/email-examples.rst new file mode 100644 index 0000000..64a9944 --- /dev/null +++ b/Doc/library/email-examples.rst @@ -0,0 +1,33 @@ +:mod:`email`: Examples +---------------------- + +Here are a few examples of how to use the :mod:`email` package to read, write, +and send simple email messages, as well as more complex MIME messages. + +First, let's see how to create and send a simple text message: + +.. 
literalinclude:: ../includes/email-simple.py + + +Here's an example of how to send a MIME message containing a bunch of family +pictures that may be residing in a directory: + +.. literalinclude:: ../includes/email-mime.py + + +Here's an example of how to send the entire contents of a directory as an email +message: [1]_ + +.. literalinclude:: ../includes/email-dir.py + + +And finally, here's an example of how to unpack a MIME message like the one +above, into a directory of files: + +.. literalinclude:: ../includes/email-unpack.py + + +.. rubric:: Footnotes + +.. [1] Thanks to Matthew Dixon Cowles for the original inspiration and examples. + diff --git a/Doc/library/email.charset.rst b/Doc/library/email.charset.rst new file mode 100644 index 0000000..d16d281 --- /dev/null +++ b/Doc/library/email.charset.rst @@ -0,0 +1,249 @@ +:mod:`email`: Representing character sets +----------------------------------------- + +.. module:: email.charset + :synopsis: Character Sets + + +This module provides a class :class:`Charset` for representing character sets +and character set conversions in email messages, as well as a character set +registry and several convenience methods for manipulating this registry. +Instances of :class:`Charset` are used in several other modules within the +:mod:`email` package. + +Import this class from the :mod:`email.charset` module. + +.. versionadded:: 2.2.2 + + +.. class:: Charset([input_charset]) + + Map character sets to their email properties. + + This class provides information about the requirements imposed on email for a + specific character set. It also provides convenience routines for converting + between character sets, given the availability of the applicable codecs. Given + a character set, it will do its best to provide information on how to use that + character set in an email message in an RFC-compliant way. + + Certain character sets must be encoded with quoted-printable or base64 when used + in email headers or bodies. 
Certain character sets must be converted outright, + and are not allowed in email. + + Optional *input_charset* is as described below; it is always coerced to lower + case. After being alias normalized it is also used as a lookup into the + registry of character sets to find out the header encoding, body encoding, and + output conversion codec to be used for the character set. For example, if + *input_charset* is ``iso-8859-1``, then headers and bodies will be encoded using + quoted-printable and no output conversion codec is necessary. If + *input_charset* is ``euc-jp``, then headers will be encoded with base64, bodies + will not be encoded, but output text will be converted from the ``euc-jp`` + character set to the ``iso-2022-jp`` character set. + +:class:`Charset` instances have the following data attributes: + + +.. data:: input_charset + + The initial character set specified. Common aliases are converted to their + *official* email names (e.g. ``latin_1`` is converted to ``iso-8859-1``). + Defaults to 7-bit ``us-ascii``. + + +.. data:: header_encoding + + If the character set must be encoded before it can be used in an email header, + this attribute will be set to ``Charset.QP`` (for quoted-printable), + ``Charset.BASE64`` (for base64 encoding), or ``Charset.SHORTEST`` for the + shortest of QP or BASE64 encoding. Otherwise, it will be ``None``. + + +.. data:: body_encoding + + Same as *header_encoding*, but describes the encoding for the mail message's + body, which indeed may be different than the header encoding. + ``Charset.SHORTEST`` is not allowed for *body_encoding*. + + +.. data:: output_charset + + Some character sets must be converted before they can be used in email headers + or bodies. If the *input_charset* is one of them, this attribute will contain + the name of the character set output will be converted to. Otherwise, it will + be ``None``. + + +.. 
data:: input_codec + + The name of the Python codec used to convert the *input_charset* to Unicode. If + no conversion codec is necessary, this attribute will be ``None``. + + +.. data:: output_codec + + The name of the Python codec used to convert Unicode to the *output_charset*. + If no conversion codec is necessary, this attribute will have the same value as + the *input_codec*. + +:class:`Charset` instances also have the following methods: + + +.. method:: Charset.get_body_encoding() + + Return the content transfer encoding used for body encoding. + + This is either the string ``quoted-printable`` or ``base64`` depending on the + encoding used, or it is a function, in which case you should call the function + with a single argument, the Message object being encoded. The function should + then set the :mailheader:`Content-Transfer-Encoding` header itself to whatever + is appropriate. + + Returns the string ``quoted-printable`` if *body_encoding* is ``QP``, returns + the string ``base64`` if *body_encoding* is ``BASE64``, and returns the string + ``7bit`` otherwise. + + +.. method:: Charset.convert(s) + + Convert the string *s* from the *input_codec* to the *output_codec*. + + +.. method:: Charset.to_splittable(s) + + Convert a possibly multibyte string to a safely splittable format. *s* is the + string to split. + + Uses the *input_codec* to try and convert the string to Unicode, so it can be + safely split on character boundaries (even for multibyte characters). + + Returns the string as-is if it isn't known how to convert *s* to Unicode with + the *input_charset*. + + Characters that could not be converted to Unicode will be replaced with the + Unicode replacement character ``'U+FFFD'``. + + +.. method:: Charset.from_splittable(ustr[, to_output]) + + Convert a splittable string back into an encoded string. *ustr* is a Unicode + string to "unsplit". + + This method uses the proper codec to try and convert the string from Unicode + back into an encoded format. 
Return the string as-is if it is not Unicode, or + if it could not be converted from Unicode. + + Characters that could not be converted from Unicode will be replaced with an + appropriate character (usually ``'?'``). + + If *to_output* is ``True`` (the default), uses *output_codec* to convert to an + encoded format. If *to_output* is ``False``, it uses *input_codec*. + + +.. method:: Charset.get_output_charset() + + Return the output character set. + + This is the *output_charset* attribute if that is not ``None``, otherwise it is + *input_charset*. + + +.. method:: Charset.encoded_header_len() + + Return the length of the encoded header string, properly calculating for + quoted-printable or base64 encoding. + + +.. method:: Charset.header_encode(s[, convert]) + + Header-encode the string *s*. + + If *convert* is ``True``, the string will be converted from the input charset to + the output charset automatically. This is not useful for multibyte character + sets, which have line length issues (multibyte characters must be split on a + character, not a byte boundary); use the higher-level :class:`Header` class to + deal with these issues (see :mod:`email.header`). *convert* defaults to + ``False``. + + The type of encoding (base64 or quoted-printable) will be based on the + *header_encoding* attribute. + + +.. method:: Charset.body_encode(s[, convert]) + + Body-encode the string *s*. + + If *convert* is ``True`` (the default), the string will be converted from the + input charset to output charset automatically. Unlike :meth:`header_encode`, + there are no issues with byte boundaries and multibyte charsets in email bodies, + so this is usually pretty safe. + + The type of encoding (base64 or quoted-printable) will be based on the + *body_encoding* attribute. + +The :class:`Charset` class also provides a number of methods to support standard +operations and built-in functions. + + +.. 
method:: Charset.__str__() + + Returns *input_charset* as a string coerced to lower case. :meth:`__repr__` is + an alias for :meth:`__str__`. + + +.. method:: Charset.__eq__(other) + + This method allows you to compare two :class:`Charset` instances for equality. + + +.. method:: Charset.__ne__(other) + + This method allows you to compare two :class:`Charset` instances for inequality. + +The :mod:`email.charset` module also provides the following functions for adding +new entries to the global character set, alias, and codec registries: + + +.. function:: add_charset(charset[, header_enc[, body_enc[, output_charset]]]) + + Add character properties to the global registry. + + *charset* is the input character set, and must be the canonical name of a + character set. + + Optional *header_enc* and *body_enc* are each either ``Charset.QP`` for + quoted-printable, ``Charset.BASE64`` for base64 encoding, + ``Charset.SHORTEST`` for the shortest of quoted-printable or base64 encoding, + or ``None`` for no encoding. ``SHORTEST`` is only valid for + *header_enc*. The default is ``None`` for no encoding. + + Optional *output_charset* is the character set that the output should be in. + Conversions will proceed from input charset, to Unicode, to the output charset + when the method :meth:`Charset.convert` is called. The default is to output in + the same character set as the input. + + Both *input_charset* and *output_charset* must have Unicode codec entries in the + module's character set-to-codec mapping; use :func:`add_codec` to add codecs the + module does not know about. See the :mod:`codecs` module's documentation for + more information. + + The global character set registry is kept in the module global dictionary + ``CHARSETS``. + + +.. function:: add_alias(alias, canonical) + + Add a character set alias. *alias* is the alias name, e.g. ``latin-1``. + *canonical* is the character set's canonical name, e.g. ``iso-8859-1``. 
+ + The global charset alias registry is kept in the module global dictionary + ``ALIASES``. + + +.. function:: add_codec(charset, codecname) + + Add a codec that maps characters in the given character set to and from Unicode. + + *charset* is the canonical name of a character set. *codecname* is the name of a + Python codec, as appropriate for the second argument to the :func:`unicode` + built-in, or to the :meth:`encode` method of a Unicode string. + diff --git a/Doc/library/email.encoders.rst b/Doc/library/email.encoders.rst new file mode 100644 index 0000000..28669c4 --- /dev/null +++ b/Doc/library/email.encoders.rst @@ -0,0 +1,57 @@ +:mod:`email`: Encoders +---------------------- + +.. module:: email.encoders + :synopsis: Encoders for email message payloads. + + +When creating :class:`Message` objects from scratch, you often need to encode +the payloads for transport through compliant mail servers. This is especially +true for :mimetype:`image/\*` and :mimetype:`text/\*` type messages containing +binary data. + +The :mod:`email` package provides some convenient encodings in its +:mod:`encoders` module. These encoders are actually used by the +:class:`MIMEAudio` and :class:`MIMEImage` class constructors to provide default +encodings. All encoder functions take exactly one argument, the message object +to encode. They usually extract the payload, encode it, and reset the payload +to this newly encoded value. They should also set the +:mailheader:`Content-Transfer-Encoding` header as appropriate. + +Here are the encoding functions provided: + + +.. function:: encode_quopri(msg) + + Encodes the payload into quoted-printable form and sets the + :mailheader:`Content-Transfer-Encoding` header to ``quoted-printable`` [#]_. + This is a good encoding to use when most of your payload is normal printable + data, but contains a few unprintable characters. + + +.. 
function:: encode_base64(msg) + + Encodes the payload into base64 form and sets the + :mailheader:`Content-Transfer-Encoding` header to ``base64``. This is a good + encoding to use when most of your payload is unprintable data since it is a more + compact form than quoted-printable. The drawback of base64 encoding is that it + renders the text non-human readable. + + +.. function:: encode_7or8bit(msg) + + This doesn't actually modify the message's payload, but it does set the + :mailheader:`Content-Transfer-Encoding` header to either ``7bit`` or ``8bit`` as + appropriate, based on the payload data. + + +.. function:: encode_noop(msg) + + This does nothing; it doesn't even set the + :mailheader:`Content-Transfer-Encoding` header. + +.. rubric:: Footnotes + +.. [#] Note that encoding with :meth:`encode_quopri` also encodes all tabs and space + characters in the data. + diff --git a/Doc/library/email.errors.rst b/Doc/library/email.errors.rst new file mode 100644 index 0000000..916d2a5 --- /dev/null +++ b/Doc/library/email.errors.rst @@ -0,0 +1,91 @@ +:mod:`email`: Exception and Defect classes +------------------------------------------ + +.. module:: email.errors + :synopsis: The exception classes used by the email package. + + +The following exception classes are defined in the :mod:`email.errors` module: + + +.. exception:: MessageError() + + This is the base class for all exceptions that the :mod:`email` package can + raise. It is derived from the standard :exc:`Exception` class and defines no + additional methods. + + +.. exception:: MessageParseError() + + This is the base class for exceptions thrown by the :class:`Parser` class. It + is derived from :exc:`MessageError`. + + +.. exception:: HeaderParseError() + + Raised under some error conditions when parsing the :rfc:`2822` headers of a + message, this class is derived from :exc:`MessageParseError`. It can be raised + from the :meth:`Parser.parse` or :meth:`Parser.parsestr` methods. 
+ + Situations where it can be raised include finding an envelope header after the + first :rfc:`2822` header of the message, finding a continuation line before the + first :rfc:`2822` header is found, or finding a line in the headers which is + neither a header nor a continuation line. + + +.. exception:: BoundaryError() + + Raised under some error conditions when parsing the :rfc:`2822` headers of a + message, this class is derived from :exc:`MessageParseError`. It can be raised + from the :meth:`Parser.parse` or :meth:`Parser.parsestr` methods. + + Situations where it can be raised include not being able to find the starting or + terminating boundary in a :mimetype:`multipart/\*` message when strict parsing + is used. + + +.. exception:: MultipartConversionError() + + Raised when a payload is added to a :class:`Message` object using + :meth:`add_payload`, but the payload is already a scalar and the message's + :mailheader:`Content-Type` main type is neither :mimetype:`multipart` nor + missing. :exc:`MultipartConversionError` multiply inherits from + :exc:`MessageError` and the built-in :exc:`TypeError`. + + Since :meth:`Message.add_payload` is deprecated, this exception is rarely raised + in practice. However, the exception may also be raised if the :meth:`attach` + method is called on an instance of a class derived from + :class:`MIMENonMultipart` (e.g. :class:`MIMEImage`). + +Here's the list of the defects that the :class:`FeedParser` can find while +parsing messages. Note that the defects are added to the message where the +problem was found, so for example, if a message nested inside a +:mimetype:`multipart/alternative` had a malformed header, that nested message +object would have a defect, but the containing messages would not. + +All defect classes are subclassed from :class:`email.errors.MessageDefect`, but +this class is *not* an exception! + +.. versionadded:: 2.4 + All the defect classes were added. 
+ +* :class:`NoBoundaryInMultipartDefect` -- A message claimed to be a multipart, + but had no :mimetype:`boundary` parameter. + +* :class:`StartBoundaryNotFoundDefect` -- The start boundary claimed in the + :mailheader:`Content-Type` header was never found. + +* :class:`FirstHeaderLineIsContinuationDefect` -- The message had a continuation + line as its first header line. + +* :class:`MisplacedEnvelopeHeaderDefect` -- A "Unix From" header was found in the + middle of a header block. + +* :class:`MalformedHeaderDefect` -- A header was found that was missing a colon, + or was otherwise malformed. + +* :class:`MultipartInvariantViolationDefect` -- A message claimed to be a + :mimetype:`multipart`, but no subparts were found. Note that when a message has + this defect, its :meth:`is_multipart` method may return false even though its + content type claims to be :mimetype:`multipart`. + diff --git a/Doc/library/email.generator.rst b/Doc/library/email.generator.rst new file mode 100644 index 0000000..bb1f57d --- /dev/null +++ b/Doc/library/email.generator.rst @@ -0,0 +1,123 @@ +:mod:`email`: Generating MIME documents +--------------------------------------- + +.. module:: email.generator + :synopsis: Generate flat text email messages from a message structure. + + +One of the most common tasks is to generate the flat text of the email message +represented by a message object structure. You will need to do this if you want +to send your message via the :mod:`smtplib` module or the :mod:`nntplib` module, +or print the message on the console. Taking a message object structure and +producing a flat text document is the job of the :class:`Generator` class. + +Again, as with the :mod:`email.parser` module, you aren't limited to the +functionality of the bundled generator; you could write one from scratch +yourself. 
However, the bundled generator knows how to generate most email in a +standards-compliant way, should handle MIME and non-MIME email messages just +fine, and is designed so that the transformation from flat text, to a message +structure via the :class:`Parser` class, and back to flat text, is idempotent +(the input is identical to the output). + +Here are the public methods of the :class:`Generator` class, imported from the +:mod:`email.generator` module: + + +.. class:: Generator(outfp[, mangle_from_[, maxheaderlen]]) + + The constructor for the :class:`Generator` class takes a file-like object called + *outfp* for an argument. *outfp* must support the :meth:`write` method and be + usable as the output file in a Python extended print statement. + + Optional *mangle_from_* is a flag that, when ``True``, puts a ``>`` character in + front of any line in the body that starts exactly as ``From``, i.e. ``From`` + followed by a space at the beginning of the line. This is the only guaranteed + portable way to avoid having such lines be mistaken for a Unix mailbox format + envelope header separator (see `WHY THE CONTENT-LENGTH FORMAT IS BAD + <http://www.jwz.org/doc/content-length.html>`_ for details). *mangle_from_* + defaults to ``True``, but you might want to set this to ``False`` if you are not + writing Unix mailbox format files. + + Optional *maxheaderlen* specifies the longest length for a non-continued header. + When a header line is longer than *maxheaderlen* (in characters, with tabs + expanded to 8 spaces), the header will be split as defined in the + :class:`email.header.Header` class. Set to zero to disable header wrapping. The + default is 78, as recommended (but not required) by :rfc:`2822`. + +The other public :class:`Generator` methods are: + + +.. method:: Generator.flatten(msg[, unixfrom]) + + Print the textual representation of the message object structure rooted at *msg* + to the output file specified when the :class:`Generator` instance was created.
+ Subparts are visited depth-first and the resulting text will be properly MIME + encoded. + + Optional *unixfrom* is a flag that forces the printing of the envelope header + delimiter before the first :rfc:`2822` header of the root message object. If + the root object has no envelope header, a standard one is crafted. By default, + this is set to ``False`` to inhibit the printing of the envelope delimiter. + + Note that for subparts, no envelope header is ever printed. + + .. versionadded:: 2.2.2 + + +.. method:: Generator.clone(fp) + + Return an independent clone of this :class:`Generator` instance with the exact + same options. + + .. versionadded:: 2.2.2 + + +.. method:: Generator.write(s) + + Write the string *s* to the underlying file object, i.e. *outfp* passed to + :class:`Generator`'s constructor. This provides just enough file-like API for + :class:`Generator` instances to be used in extended print statements. + +As a convenience, see the methods :meth:`Message.as_string` and +``str(aMessage)``, a.k.a. :meth:`Message.__str__`, which simplify the generation +of a formatted string representation of a message object. For more detail, see +:mod:`email.message`. + +The :mod:`email.generator` module also provides a derived class, called +:class:`DecodedGenerator` which is like the :class:`Generator` base class, +except that non-\ :mimetype:`text` parts are substituted with a format string +representing the part. + + +.. class:: DecodedGenerator(outfp[, mangle_from_[, maxheaderlen[, fmt]]]) + + This class, derived from :class:`Generator`, walks through all the subparts of a + message. If the subpart is of main type :mimetype:`text`, then it prints the + decoded payload of the subpart. Optional *mangle_from_* and *maxheaderlen* are + as with the :class:`Generator` base class. + + If the subpart is not of main type :mimetype:`text`, optional *fmt* is a format + string that is used instead of the message payload.
*fmt* is expanded with the + following keywords, ``%(keyword)s`` format: + + * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part + + * ``maintype`` -- Main MIME type of the non-\ :mimetype:`text` part + + * ``subtype`` -- Sub-MIME type of the non-\ :mimetype:`text` part + + * ``filename`` -- Filename of the non-\ :mimetype:`text` part + + * ``description`` -- Description associated with the non-\ :mimetype:`text` part + + * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part + + The default value for *fmt* is ``None``, meaning :: + + [Non-text (%(type)s) part of message omitted, filename %(filename)s] + + .. versionadded:: 2.2.2 + +.. versionchanged:: 2.5 + The previously deprecated method :meth:`__call__` was removed. + diff --git a/Doc/library/email.header.rst b/Doc/library/email.header.rst new file mode 100644 index 0000000..0ecd35f --- /dev/null +++ b/Doc/library/email.header.rst @@ -0,0 +1,171 @@ +:mod:`email`: Internationalized headers +--------------------------------------- + +.. module:: email.header + :synopsis: Representing non-ASCII headers + + +:rfc:`2822` is the base standard that describes the format of email messages. +It derives from the older :rfc:`822` standard which came into widespread use at +a time when most email was composed of ASCII characters only. :rfc:`2822` is a +specification written assuming email contains only 7-bit ASCII characters. + +Of course, as email has been deployed worldwide, it has become +internationalized, such that language specific character sets can now be used in +email messages. The base standard still requires email messages to be +transferred using only 7-bit ASCII characters, so a slew of RFCs have been +written describing how to encode email containing non-ASCII characters into +:rfc:`2822`\ -compliant format. These RFCs include :rfc:`2045`, :rfc:`2046`, +:rfc:`2047`, and :rfc:`2231`. 
The :mod:`email` package supports these standards +in its :mod:`email.header` and :mod:`email.charset` modules. + +If you want to include non-ASCII characters in your email headers, say in the +:mailheader:`Subject` or :mailheader:`To` fields, you should use the +:class:`Header` class and assign the field in the :class:`Message` object to an +instance of :class:`Header` instead of using a string for the header value. +Import the :class:`Header` class from the :mod:`email.header` module. For +example:: + + >>> from email.message import Message + >>> from email.header import Header + >>> msg = Message() + >>> h = Header('p\xf6stal', 'iso-8859-1') + >>> msg['Subject'] = h + >>> print msg.as_string() + Subject: =?iso-8859-1?q?p=F6stal?= + + + +Notice here how we wanted the :mailheader:`Subject` field to contain a non-ASCII +character? We did this by creating a :class:`Header` instance and passing in +the character set that the byte string was encoded in. When the subsequent +:class:`Message` instance was flattened, the :mailheader:`Subject` field was +properly :rfc:`2047` encoded. MIME-aware mail readers would show this header +using the embedded ISO-8859-1 character. + +.. versionadded:: 2.2.2 + +Here is the :class:`Header` class description: + + +.. class:: Header([s[, charset[, maxlinelen[, header_name[, continuation_ws[, errors]]]]]]) + + Create a MIME-compliant header that can contain strings in different character + sets. + + Optional *s* is the initial header value. If ``None`` (the default), the + initial header value is not set. You can later append to the header with + :meth:`append` method calls. *s* may be a byte string or a Unicode string, but + see the :meth:`append` documentation for semantics. + + Optional *charset* serves two purposes: it has the same meaning as the *charset* + argument to the :meth:`append` method. It also sets the default character set + for all subsequent :meth:`append` calls that omit the *charset* argument. 
If + *charset* is not provided in the constructor (the default), the ``us-ascii`` + character set is used both as *s*'s initial charset and as the default for + subsequent :meth:`append` calls. + + The maximum line length can be specified explicitly via *maxlinelen*. For + splitting the first line to a shorter value (to account for the field header + which isn't included in *s*, e.g. :mailheader:`Subject`) pass in the name of the + field in *header_name*. The default *maxlinelen* is 76, and the default value + for *header_name* is ``None``, meaning it is not taken into account for the + first line of a long, split header. + + Optional *continuation_ws* must be :rfc:`2822`\ -compliant folding whitespace, + and is usually either a space or a hard tab character. This character will be + prepended to continuation lines. + +Optional *errors* is passed straight through to the :meth:`append` method. + + +.. method:: Header.append(s[, charset[, errors]]) + + Append the string *s* to the MIME header. + + Optional *charset*, if given, should be a :class:`Charset` instance (see + :mod:`email.charset`) or the name of a character set, which will be converted to + a :class:`Charset` instance. A value of ``None`` (the default) means that the + *charset* given in the constructor is used. + + *s* may be a byte string or a Unicode string. If it is a byte string (i.e. + ``isinstance(s, str)`` is true), then *charset* is the encoding of that byte + string, and a :exc:`UnicodeError` will be raised if the string cannot be decoded + with that character set. + + If *s* is a Unicode string, then *charset* is a hint specifying the character + set of the characters in the string. In this case, when producing an + :rfc:`2822`\ -compliant header using :rfc:`2047` rules, the Unicode string will + be encoded using the following charsets in order: ``us-ascii``, the *charset* + hint, ``utf-8``. The first character set to not provoke a :exc:`UnicodeError` + is used.
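As a rough illustration of the constructor and :meth:`append` arguments described above, the following sketch builds one header from two parts in different character sets and then decodes it again. It assumes a current Python 3 interpreter, where ``str`` is always Unicode, so the byte-string and fallback rules described here may not apply exactly as written; the header text is made up for the example.

```python
from email.header import Header, decode_header

# Start with a plain ASCII part; 'us-ascii' is also the default charset.
h = Header('Hello', 'us-ascii')

# Append a second part that needs a non-ASCII character set.
h.append('p\xf6stal', 'iso-8859-1')

# encode() produces the RFC 2047 form suitable for a message header.
encoded = h.encode()
print(encoded)

# decode_header() splits the value back into (string, charset) pairs.
parts = decode_header(encoded)
print(parts)
```

Only the part that needs it is wrapped in an RFC 2047 encoded word; the ASCII part is left readable.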
+ + Optional *errors* is passed through to any :func:`unicode` or + :func:`ustr.encode` call, and defaults to "strict". + + +.. method:: Header.encode([splitchars]) + + Encode a message header into an RFC-compliant format, possibly wrapping long + lines and encapsulating non-ASCII parts in base64 or quoted-printable encodings. + Optional *splitchars* is a string containing characters to split long ASCII + lines on, in rough support of :rfc:`2822`'s *highest level syntactic breaks*. + This doesn't affect :rfc:`2047` encoded lines. + +The :class:`Header` class also provides a number of methods to support standard +operators and built-in functions. + + +.. method:: Header.__str__() + + A synonym for :meth:`Header.encode`. Useful for ``str(aHeader)``. + + +.. method:: Header.__unicode__() + + A helper for the built-in :func:`unicode` function. Returns the header as a + Unicode string. + + +.. method:: Header.__eq__(other) + + This method allows you to compare two :class:`Header` instances for equality. + + +.. method:: Header.__ne__(other) + + This method allows you to compare two :class:`Header` instances for inequality. + +The :mod:`email.header` module also provides the following convenient functions. + + +.. function:: decode_header(header) + + Decode a message header value without converting the character set. The header + value is in *header*. + + This function returns a list of ``(decoded_string, charset)`` pairs containing + each of the decoded parts of the header. *charset* is ``None`` for non-encoded + parts of the header, otherwise a lower case string containing the name of the + character set specified in the encoded string. + + Here's an example:: + + >>> from email.header import decode_header + >>> decode_header('=?iso-8859-1?q?p=F6stal?=') + [('p\xf6stal', 'iso-8859-1')] + + +.. 
function:: make_header(decoded_seq[, maxlinelen[, header_name[, continuation_ws]]]) + + Create a :class:`Header` instance from a sequence of pairs as returned by + :func:`decode_header`. + + :func:`decode_header` takes a header value string and returns a sequence of + pairs of the format ``(decoded_string, charset)`` where *charset* is the name of + the character set. + + This function takes one of those sequence of pairs and returns a :class:`Header` + instance. Optional *maxlinelen*, *header_name*, and *continuation_ws* are as in + the :class:`Header` constructor. + diff --git a/Doc/library/email.iterators.rst b/Doc/library/email.iterators.rst new file mode 100644 index 0000000..aa70141 --- /dev/null +++ b/Doc/library/email.iterators.rst @@ -0,0 +1,65 @@ +:mod:`email`: Iterators +----------------------- + +.. module:: email.iterators + :synopsis: Iterate over a message object tree. + + +Iterating over a message object tree is fairly easy with the +:meth:`Message.walk` method. The :mod:`email.iterators` module provides some +useful higher level iterations over message object trees. + + +.. function:: body_line_iterator(msg[, decode]) + + This iterates over all the payloads in all the subparts of *msg*, returning the + string payloads line-by-line. It skips over all the subpart headers, and it + skips over any subpart with a payload that isn't a Python string. This is + somewhat equivalent to reading the flat text representation of the message from + a file using :meth:`readline`, skipping over all the intervening headers. + + Optional *decode* is passed through to :meth:`Message.get_payload`. + + +.. function:: typed_subpart_iterator(msg[, maintype[, subtype]]) + + This iterates over all the subparts of *msg*, returning only those subparts that + match the MIME type specified by *maintype* and *subtype*. + + Note that *subtype* is optional; if omitted, then subpart MIME type matching is + done only with the main type. 
*maintype* is optional too; it defaults to + :mimetype:`text`. + + Thus, by default :func:`typed_subpart_iterator` returns each subpart that has a + MIME type of :mimetype:`text/\*`. + +The following function has been added as a useful debugging tool. It should +*not* be considered part of the supported public interface for the package. + + +.. function:: _structure(msg[, fp[, level]]) + + Prints an indented representation of the content types of the message object + structure. For example:: + + >>> msg = email.message_from_file(somefile) + >>> _structure(msg) + multipart/mixed + text/plain + text/plain + multipart/digest + message/rfc822 + text/plain + message/rfc822 + text/plain + message/rfc822 + text/plain + message/rfc822 + text/plain + message/rfc822 + text/plain + text/plain + + Optional *fp* is a file-like object to print the output to. It must be suitable + for Python's extended print statement. *level* is used internally. + diff --git a/Doc/library/email.message.rst b/Doc/library/email.message.rst new file mode 100644 index 0000000..e1fb20e --- /dev/null +++ b/Doc/library/email.message.rst @@ -0,0 +1,548 @@ +:mod:`email`: Representing an email message +------------------------------------------- + +.. module:: email.message + :synopsis: The base class representing email messages. + + +The central class in the :mod:`email` package is the :class:`Message` class, +imported from the :mod:`email.message` module. It is the base class for the +:mod:`email` object model. :class:`Message` provides the core functionality for +setting and querying header fields, and for accessing message bodies. + +Conceptually, a :class:`Message` object consists of *headers* and *payloads*. +Headers are :rfc:`2822` style field names and values where the field name and +value are separated by a colon. The colon is not part of either the field name +or the field value. + +Headers are stored and returned in case-preserving form but are matched +case-insensitively. 
There may also be a single envelope header, also known as +the *Unix-From* header or the ``From_`` header. The payload is either a string +in the case of simple message objects or a list of :class:`Message` objects for +MIME container documents (e.g. :mimetype:`multipart/\*` and +:mimetype:`message/rfc822`). + +:class:`Message` objects provide a mapping style interface for accessing the +message headers, and an explicit interface for accessing both the headers and +the payload. It provides convenience methods for generating a flat text +representation of the message object tree, for accessing commonly used header +parameters, and for recursively walking over the object tree. + +Here are the methods of the :class:`Message` class: + + +.. class:: Message() + + The constructor takes no arguments. + + +.. method:: Message.as_string([unixfrom]) + + Return the entire message flattened as a string. When optional *unixfrom* is + ``True``, the envelope header is included in the returned string. *unixfrom* + defaults to ``False``. + + Note that this method is provided as a convenience and may not always format the + message the way you want. For example, by default it mangles lines that begin + with ``From``. For more flexibility, instantiate a :class:`Generator` instance + and use its :meth:`flatten` method directly. For example:: + + from cStringIO import StringIO + from email.generator import Generator + fp = StringIO() + g = Generator(fp, mangle_from_=False, maxheaderlen=60) + g.flatten(msg) + text = fp.getvalue() + + +.. method:: Message.__str__() + + Equivalent to ``as_string(unixfrom=True)``. + + +.. method:: Message.is_multipart() + + Return ``True`` if the message's payload is a list of sub-\ :class:`Message` + objects, otherwise return ``False``. When :meth:`is_multipart` returns False, + the payload should be a string object. + + +.. method:: Message.set_unixfrom(unixfrom) + + Set the message's envelope header to *unixfrom*, which should be a string. + + +..
method:: Message.get_unixfrom() + + Return the message's envelope header. Defaults to ``None`` if the envelope + header was never set. + + +.. method:: Message.attach(payload) + + Add the given *payload* to the current payload, which must be ``None`` or a list + of :class:`Message` objects before the call. After the call, the payload will + always be a list of :class:`Message` objects. If you want to set the payload to + a scalar object (e.g. a string), use :meth:`set_payload` instead. + + +.. method:: Message.get_payload([i[, decode]]) + + Return a reference to the current payload, which will be a list of :class:`Message` + objects when :meth:`is_multipart` is ``True``, or a string when + :meth:`is_multipart` is ``False``. If the payload is a list and you mutate the + list object, you modify the message's payload in place. + + With optional argument *i*, :meth:`get_payload` will return the *i*-th element + of the payload, counting from zero, if :meth:`is_multipart` is ``True``. An + :exc:`IndexError` will be raised if *i* is less than 0 or greater than or equal + to the number of items in the payload. If the payload is a string (i.e. + :meth:`is_multipart` is ``False``) and *i* is given, a :exc:`TypeError` is + raised. + + Optional *decode* is a flag indicating whether the payload should be decoded or + not, according to the :mailheader:`Content-Transfer-Encoding` header. When + ``True`` and the message is not a multipart, the payload will be decoded if this + header's value is ``quoted-printable`` or ``base64``. If some other encoding is + used, or :mailheader:`Content-Transfer-Encoding` header is missing, or if the + payload has bogus base64 data, the payload is returned as-is (undecoded). If + the message is a multipart and the *decode* flag is ``True``, then ``None`` is + returned. The default for *decode* is ``False``. + + +.. method:: Message.set_payload(payload[, charset]) + + Set the entire message object's payload to *payload*.
It is the client's + responsibility to ensure the payload invariants. Optional *charset* sets the + message's default character set; see :meth:`set_charset` for details. + + .. versionchanged:: 2.2.2 + *charset* argument added. + + +.. method:: Message.set_charset(charset) + + Set the character set of the payload to *charset*, which can either be a + :class:`Charset` instance (see :mod:`email.charset`), a string naming a + character set, or ``None``. If it is a string, it will be converted to a + :class:`Charset` instance. If *charset* is ``None``, the ``charset`` parameter + will be removed from the :mailheader:`Content-Type` header. Anything else will + generate a :exc:`TypeError`. + + The message will be assumed to be of type :mimetype:`text/\*` encoded with + *charset.input_charset*. It will be converted to *charset.output_charset* and + encoded properly, if needed, when generating the plain text representation of + the message. MIME headers (:mailheader:`MIME-Version`, + :mailheader:`Content-Type`, :mailheader:`Content-Transfer-Encoding`) will be + added as needed. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_charset() + + Return the :class:`Charset` instance associated with the message's payload. + + .. versionadded:: 2.2.2 + +The following methods implement a mapping-like interface for accessing the +message's :rfc:`2822` headers. Note that there are some semantic differences +between these methods and a normal mapping (i.e. dictionary) interface. For +example, in a dictionary there are no duplicate keys, but here there may be +duplicate message headers. Also, in dictionaries there is no guaranteed order +to the keys returned by :meth:`keys`, but in a :class:`Message` object, headers +are always returned in the order they appeared in the original message, or were +added to the message later. Any header deleted and then re-added is always +appended to the end of the header list.
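The differences from a dictionary described above can be seen in a short sketch. It assumes a current Python 3 interpreter and uses made-up header values; only :class:`Message` methods documented in this section are called.

```python
from email.message import Message

msg = Message()
msg['Received'] = 'from a.example'
msg['Subject'] = 'first'
msg['Received'] = 'from b.example'   # adds a duplicate, does not overwrite

count = len(msg)                     # duplicates are counted
names = msg.keys()                   # insertion order is preserved
received = msg.get_all('Received')   # every value for a repeated field

# Deleting and re-adding a header moves it to the end of the list.
del msg['Subject']
msg['Subject'] = 'second'
names_after = msg.keys()

print(count, names, received, names_after)
```

Because assignment appends rather than replaces, the ``del`` followed by assignment is what guarantees a single ``Subject`` header.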
+ +These semantic differences are intentional and are biased toward maximal +convenience. + +Note that in all cases, any envelope header present in the message is not +included in the mapping interface. + + +.. method:: Message.__len__() + + Return the total number of headers, including duplicates. + + +.. method:: Message.__contains__(name) + + Return true if the message object has a field named *name*. Matching is done + case-insensitively and *name* should not include the trailing colon. Used for + the ``in`` operator, e.g.:: + + if 'message-id' in myMessage: + print 'Message-ID:', myMessage['message-id'] + + +.. method:: Message.__getitem__(name) + + Return the value of the named header field. *name* should not include the colon + field separator. If the header is missing, ``None`` is returned; a + :exc:`KeyError` is never raised. + + Note that if the named field appears more than once in the message's headers, + exactly which of those field values will be returned is undefined. Use the + :meth:`get_all` method to get the values of all the extant named headers. + + +.. method:: Message.__setitem__(name, val) + + Add a header to the message with field name *name* and value *val*. The field + is appended to the end of the message's existing fields. + + Note that this does *not* overwrite or delete any existing header with the same + name. If you want to ensure that the new header is the only one present in the + message with field name *name*, delete the field first, e.g.:: + + del msg['subject'] + msg['subject'] = 'Python roolz!' + + +.. method:: Message.__delitem__(name) + + Delete all occurrences of the field with name *name* from the message's headers. + No exception is raised if the named field isn't present in the headers. + + +.. method:: Message.has_key(name) + + Return true if the message contains a header field named *name*, otherwise + return false. + + +.. method:: Message.keys() + + Return a list of all the message's header field names. + + +.. 
method:: Message.values() + + Return a list of all the message's field values. + + +.. method:: Message.items() + + Return a list of 2-tuples containing all the message's field headers and values. + + +.. method:: Message.get(name[, failobj]) + + Return the value of the named header field. This is identical to + :meth:`__getitem__` except that optional *failobj* is returned if the named + header is missing (defaults to ``None``). + +Here are some additional useful methods: + + +.. method:: Message.get_all(name[, failobj]) + + Return a list of all the values for the field named *name*. If there are no such + named headers in the message, *failobj* is returned (defaults to ``None``). + + +.. method:: Message.add_header(_name, _value, **_params) + + Extended header setting. This method is similar to :meth:`__setitem__` except + that additional header parameters can be provided as keyword arguments. *_name* + is the header field to add and *_value* is the *primary* value for the header. + + For each item in the keyword argument dictionary *_params*, the key is taken as + the parameter name, with underscores converted to dashes (since dashes are + illegal in Python identifiers). Normally, the parameter will be added as + ``key="value"`` unless the value is ``None``, in which case only the key will be + added. + + Here's an example:: + + msg.add_header('Content-Disposition', 'attachment', filename='bud.gif') + + This will add a header that looks like :: + + Content-Disposition: attachment; filename="bud.gif" + + +.. method:: Message.replace_header(_name, _value) + + Replace a header. Replace the first header found in the message that matches + *_name*, retaining header order and field name case. If no matching header was + found, a :exc:`KeyError` is raised. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_content_type() + + Return the message's content type. The returned string is coerced to lower case + of the form :mimetype:`maintype/subtype`. 
If there was no + :mailheader:`Content-Type` header in the message the default type as given by + :meth:`get_default_type` will be returned. Since according to :rfc:`2045`, + messages always have a default type, :meth:`get_content_type` will always return + a value. + + :rfc:`2045` defines a message's default type to be :mimetype:`text/plain` unless + it appears inside a :mimetype:`multipart/digest` container, in which case it + would be :mimetype:`message/rfc822`. If the :mailheader:`Content-Type` header + has an invalid type specification, :rfc:`2045` mandates that the default type be + :mimetype:`text/plain`. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_content_maintype() + + Return the message's main content type. This is the :mimetype:`maintype` part + of the string returned by :meth:`get_content_type`. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_content_subtype() + + Return the message's sub-content type. This is the :mimetype:`subtype` part of + the string returned by :meth:`get_content_type`. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_default_type() + + Return the default content type. Most messages have a default content type of + :mimetype:`text/plain`, except for messages that are subparts of + :mimetype:`multipart/digest` containers. Such subparts have a default content + type of :mimetype:`message/rfc822`. + + .. versionadded:: 2.2.2 + + +.. method:: Message.set_default_type(ctype) + + Set the default content type. *ctype* should either be :mimetype:`text/plain` + or :mimetype:`message/rfc822`, although this is not enforced. The default + content type is not stored in the :mailheader:`Content-Type` header. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_params([failobj[, header[, unquote]]]) + + Return the message's :mailheader:`Content-Type` parameters, as a list. The + elements of the returned list are 2-tuples of key/value pairs, as split on the + ``'='`` sign. 
The left hand side of the ``'='`` is the key, while the right + hand side is the value. If there is no ``'='`` sign in the parameter the value + is the empty string, otherwise the value is as described in :meth:`get_param` + and is unquoted if optional *unquote* is ``True`` (the default). + + Optional *failobj* is the object to return if there is no + :mailheader:`Content-Type` header. Optional *header* is the header to search + instead of :mailheader:`Content-Type`. + + .. versionchanged:: 2.2.2 + *unquote* argument added. + + +.. method:: Message.get_param(param[, failobj[, header[, unquote]]]) + + Return the value of the :mailheader:`Content-Type` header's parameter *param* as + a string. If the message has no :mailheader:`Content-Type` header or if there + is no such parameter, then *failobj* is returned (defaults to ``None``). + + Optional *header* if given, specifies the message header to use instead of + :mailheader:`Content-Type`. + + Parameter keys are always compared case insensitively. The return value can + either be a string, or a 3-tuple if the parameter was :rfc:`2231` encoded. When + it's a 3-tuple, the elements of the value are of the form ``(CHARSET, LANGUAGE, + VALUE)``. Note that both ``CHARSET`` and ``LANGUAGE`` can be ``None``, in which + case you should consider ``VALUE`` to be encoded in the ``us-ascii`` charset. + You can usually ignore ``LANGUAGE``. + + If your application doesn't care whether the parameter was encoded as in + :rfc:`2231`, you can collapse the parameter value by calling + :func:`email.Utils.collapse_rfc2231_value`, passing in the return value from + :meth:`get_param`. This will return a suitably decoded Unicode string when the + value is a tuple, or the original string unquoted if it isn't.
For example:: + + rawparam = msg.get_param('foo') + param = email.Utils.collapse_rfc2231_value(rawparam) + + In any case, the parameter value (either the returned string, or the ``VALUE`` + item in the 3-tuple) is always unquoted, unless *unquote* is set to ``False``. + + .. versionchanged:: 2.2.2 + *unquote* argument added, and 3-tuple return value possible. + + +.. method:: Message.set_param(param, value[, header[, requote[, charset[, language]]]]) + + Set a parameter in the :mailheader:`Content-Type` header. If the parameter + already exists in the header, its value will be replaced with *value*. If the + :mailheader:`Content-Type` header has not yet been defined for this message, it + will be set to :mimetype:`text/plain` and the new parameter value will be + appended as per :rfc:`2045`. + + Optional *header* specifies an alternative header to :mailheader:`Content-Type`, + and all parameters will be quoted as necessary unless optional *requote* is + ``False`` (the default is ``True``). + + If optional *charset* is specified, the parameter will be encoded according to + :rfc:`2231`. Optional *language* specifies the RFC 2231 language, defaulting to + the empty string. Both *charset* and *language* should be strings. + + .. versionadded:: 2.2.2 + + +.. method:: Message.del_param(param[, header[, requote]]) + + Remove the given parameter completely from the :mailheader:`Content-Type` + header. The header will be re-written in place without the parameter or its + value. All values will be quoted as necessary unless *requote* is ``False`` + (the default is ``True``). Optional *header* specifies an alternative to + :mailheader:`Content-Type`. + + .. versionadded:: 2.2.2 + + +.. method:: Message.set_type(type[, header][, requote]) + + Set the main type and subtype for the :mailheader:`Content-Type` header. *type* + must be a string in the form :mimetype:`maintype/subtype`, otherwise a + :exc:`ValueError` is raised.
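A small sketch of :meth:`set_param` and :meth:`set_type` working together may help. It assumes a current Python 3 interpreter; the ``charset`` value and the content types are arbitrary choices for the example.

```python
from email.message import Message

msg = Message()

# With no Content-Type header yet, set_param() starts from text/plain.
msg.set_param('charset', 'utf-8')
type_before = msg.get_content_type()

# set_type() swaps the type but leaves existing parameters alone.
msg.set_type('text/html')
type_after = msg.get_content_type()
charset = msg.get_param('charset')

# A value without exactly one '/' is rejected.
try:
    msg.set_type('html')
    rejected = False
except ValueError:
    rejected = True

print(type_before, type_after, charset, rejected)
```

Changing the type this way avoids rebuilding the whole header by hand when only the MIME type needs to change.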
+ + This method replaces the :mailheader:`Content-Type` header, keeping all the + parameters in place. If *requote* is ``False``, this leaves the existing + header's quoting as is, otherwise the parameters will be quoted (the default). + + An alternative header can be specified in the *header* argument. When the + :mailheader:`Content-Type` header is set a :mailheader:`MIME-Version` header is + also added. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_filename([failobj]) + + Return the value of the ``filename`` parameter of the + :mailheader:`Content-Disposition` header of the message. If the header does not + have a ``filename`` parameter, this method falls back to looking for the + ``name`` parameter. If neither is found, or the header is missing, then + *failobj* is returned. The returned string will always be unquoted as per + :meth:`Utils.unquote`. + + +.. method:: Message.get_boundary([failobj]) + + Return the value of the ``boundary`` parameter of the :mailheader:`Content-Type` + header of the message, or *failobj* if either the header is missing, or has no + ``boundary`` parameter. The returned string will always be unquoted as per + :meth:`Utils.unquote`. + + +.. method:: Message.set_boundary(boundary) + + Set the ``boundary`` parameter of the :mailheader:`Content-Type` header to + *boundary*. :meth:`set_boundary` will always quote *boundary* if necessary. A + :exc:`HeaderParseError` is raised if the message object has no + :mailheader:`Content-Type` header. + + Note that using this method is subtly different than deleting the old + :mailheader:`Content-Type` header and adding a new one with the new boundary via + :meth:`add_header`, because :meth:`set_boundary` preserves the order of the + :mailheader:`Content-Type` header in the list of headers. However, it does *not* + preserve any continuation lines which may have been present in the original + :mailheader:`Content-Type` header. + + +.. 
method:: Message.get_content_charset([failobj]) + + Return the ``charset`` parameter of the :mailheader:`Content-Type` header, + coerced to lower case. If there is no :mailheader:`Content-Type` header, or if + that header has no ``charset`` parameter, *failobj* is returned. + + Note that this method differs from :meth:`get_charset` which returns the + :class:`Charset` instance for the default encoding of the message body. + + .. versionadded:: 2.2.2 + + +.. method:: Message.get_charsets([failobj]) + + Return a list containing the character set names in the message. If the message + is a :mimetype:`multipart`, then the list will contain one element for each + subpart in the payload, otherwise, it will be a list of length 1. + + Each item in the list will be a string which is the value of the ``charset`` + parameter in the :mailheader:`Content-Type` header for the represented subpart. + However, if the subpart has no :mailheader:`Content-Type` header, no ``charset`` + parameter, or is not of the :mimetype:`text` main MIME type, then that item in + the returned list will be *failobj*. + + +.. method:: Message.walk() + + The :meth:`walk` method is an all-purpose generator which can be used to iterate + over all the parts and subparts of a message object tree, in depth-first + traversal order. You will typically use :meth:`walk` as the iterator in a + ``for`` loop; each iteration returns the next subpart. + + Here's an example that prints the MIME type of every part of a multipart message + structure:: + + >>> for part in msg.walk(): + ... print part.get_content_type() + multipart/report + text/plain + message/delivery-status + text/plain + text/plain + message/rfc822 + +.. versionchanged:: 2.5 + The previously deprecated methods :meth:`get_type`, :meth:`get_main_type`, and + :meth:`get_subtype` were removed. + +:class:`Message` objects can also optionally contain two instance attributes, +which can be used when generating the plain text of a MIME message. + + +.. 
data:: preamble + + The format of a MIME document allows for some text between the blank line + following the headers, and the first multipart boundary string. Normally, this + text is never visible in a MIME-aware mail reader because it falls outside the + standard MIME armor. However, when viewing the raw text of the message, or when + viewing the message in a non-MIME aware reader, this text can become visible. + + The *preamble* attribute contains this leading extra-armor text for MIME + documents. When the :class:`Parser` discovers some text after the headers but + before the first boundary string, it assigns this text to the message's + *preamble* attribute. When the :class:`Generator` is writing out the plain text + representation of a MIME message, and it finds the message has a *preamble* + attribute, it will write this text in the area between the headers and the first + boundary. See :mod:`email.parser` and :mod:`email.generator` for details. + + Note that if the message object has no preamble, the *preamble* attribute will + be ``None``. + + +.. data:: epilogue + + The *epilogue* attribute acts the same way as the *preamble* attribute, except + that it contains text that appears between the last boundary and the end of the + message. + + .. versionchanged:: 2.5 + You do not need to set the epilogue to the empty string in order for the + :class:`Generator` to print a newline at the end of the file. + + +.. data:: defects + + The *defects* attribute contains a list of all the problems found when parsing + this message. See :mod:`email.errors` for a detailed description of the + possible parsing defects. + + .. versionadded:: 2.4 + diff --git a/Doc/library/email.mime.rst b/Doc/library/email.mime.rst new file mode 100644 index 0000000..6f1b0ae --- /dev/null +++ b/Doc/library/email.mime.rst @@ -0,0 +1,175 @@ +:mod:`email`: Creating email and MIME objects from scratch +---------------------------------------------------------- + +.. 
module:: email.mime + :synopsis: Build MIME messages. + + +Ordinarily, you get a message object structure by passing a file or some text to +a parser, which parses the text and returns the root message object. However +you can also build a complete message structure from scratch, or even individual +:class:`Message` objects by hand. In fact, you can also take an existing +structure and add new :class:`Message` objects, move them around, etc. This +makes a very convenient interface for slicing-and-dicing MIME messages. + +You can create a new object structure by creating :class:`Message` instances, +adding attachments and all the appropriate headers manually. For MIME messages +though, the :mod:`email` package provides some convenient subclasses to make +things easier. + +Here are the classes: + + +.. class:: MIMEBase(_maintype, _subtype, **_params) + + Module: :mod:`email.mime.base` + + This is the base class for all the MIME-specific subclasses of :class:`Message`. + Ordinarily you won't create instances specifically of :class:`MIMEBase`, + although you could. :class:`MIMEBase` is provided primarily as a convenient + base class for more specific MIME-aware subclasses. + + *_maintype* is the :mailheader:`Content-Type` major type (e.g. :mimetype:`text` + or :mimetype:`image`), and *_subtype* is the :mailheader:`Content-Type` minor + type (e.g. :mimetype:`plain` or :mimetype:`gif`). *_params* is a parameter + key/value dictionary and is passed directly to :meth:`Message.add_header`. + + The :class:`MIMEBase` class always adds a :mailheader:`Content-Type` header + (based on *_maintype*, *_subtype*, and *_params*), and a + :mailheader:`MIME-Version` header (always set to ``1.0``). + + +.. class:: MIMENonMultipart() + + Module: :mod:`email.mime.nonmultipart` + + A subclass of :class:`MIMEBase`, this is an intermediate base class for MIME + messages that are not :mimetype:`multipart`. 
The primary purpose of this class + is to prevent the use of the :meth:`attach` method, which only makes sense for + :mimetype:`multipart` messages. If :meth:`attach` is called, a + :exc:`MultipartConversionError` exception is raised. + + .. versionadded:: 2.2.2 + + +.. class:: MIMEMultipart([_subtype[, boundary[, _subparts[, _params]]]]) + + Module: :mod:`email.mime.multipart` + + A subclass of :class:`MIMEBase`, this is an intermediate base class for MIME + messages that are :mimetype:`multipart`. Optional *_subtype* defaults to + :mimetype:`mixed`, but can be used to specify the subtype of the message. A + :mailheader:`Content-Type` header of :mimetype:`multipart/`*_subtype* will be + added to the message object. A :mailheader:`MIME-Version` header will also be + added. + + Optional *boundary* is the multipart boundary string. When ``None`` (the + default), the boundary is calculated when needed. + + *_subparts* is a sequence of initial subparts for the payload. It must be + possible to convert this sequence to a list. You can always attach new subparts + to the message by using the :meth:`Message.attach` method. + + Additional parameters for the :mailheader:`Content-Type` header are taken from + the keyword arguments, or passed into the *_params* argument, which is a keyword + dictionary. + + .. versionadded:: 2.2.2 + + +.. class:: MIMEApplication(_data[, _subtype[, _encoder[, **_params]]]) + + Module: :mod:`email.mime.application` + + A subclass of :class:`MIMENonMultipart`, the :class:`MIMEApplication` class is + used to represent MIME message objects of major type :mimetype:`application`. + *_data* is a string containing the raw byte data. Optional *_subtype* specifies + the MIME subtype and defaults to :mimetype:`octet-stream`. + + Optional *_encoder* is a callable (i.e. function) which will perform the actual + encoding of the data for transport. This callable takes one argument, which is + the :class:`MIMEApplication` instance.
It should use :meth:`get_payload` and + :meth:`set_payload` to change the payload to encoded form. It should also add + any :mailheader:`Content-Transfer-Encoding` or other headers to the message + object as necessary. The default encoding is base64. See the + :mod:`email.encoders` module for a list of the built-in encoders. + + *_params* are passed straight through to the base class constructor. + + .. versionadded:: 2.5 + + +.. class:: MIMEAudio(_audiodata[, _subtype[, _encoder[, **_params]]]) + + Module: :mod:`email.mime.audio` + + A subclass of :class:`MIMENonMultipart`, the :class:`MIMEAudio` class is used to + create MIME message objects of major type :mimetype:`audio`. *_audiodata* is a + string containing the raw audio data. If this data can be decoded by the + standard Python module :mod:`sndhdr`, then the subtype will be automatically + included in the :mailheader:`Content-Type` header. Otherwise you can explicitly + specify the audio subtype via the *_subtype* parameter. If the minor type could + not be guessed and *_subtype* was not given, then :exc:`TypeError` is raised. + + Optional *_encoder* is a callable (i.e. function) which will perform the actual + encoding of the audio data for transport. This callable takes one argument, + which is the :class:`MIMEAudio` instance. It should use :meth:`get_payload` and + :meth:`set_payload` to change the payload to encoded form. It should also add + any :mailheader:`Content-Transfer-Encoding` or other headers to the message + object as necessary. The default encoding is base64. See the + :mod:`email.encoders` module for a list of the built-in encoders. + + *_params* are passed straight through to the base class constructor. + + +.. class:: MIMEImage(_imagedata[, _subtype[, _encoder[, **_params]]]) + + Module: :mod:`email.mime.image` + + A subclass of :class:`MIMENonMultipart`, the :class:`MIMEImage` class is used to + create MIME message objects of major type :mimetype:`image`. 
*_imagedata* is a + string containing the raw image data. If this data can be decoded by the + standard Python module :mod:`imghdr`, then the subtype will be automatically + included in the :mailheader:`Content-Type` header. Otherwise you can explicitly + specify the image subtype via the *_subtype* parameter. If the minor type could + not be guessed and *_subtype* was not given, then :exc:`TypeError` is raised. + + Optional *_encoder* is a callable (i.e. function) which will perform the actual + encoding of the image data for transport. This callable takes one argument, + which is the :class:`MIMEImage` instance. It should use :meth:`get_payload` and + :meth:`set_payload` to change the payload to encoded form. It should also add + any :mailheader:`Content-Transfer-Encoding` or other headers to the message + object as necessary. The default encoding is base64. See the + :mod:`email.encoders` module for a list of the built-in encoders. + + *_params* are passed straight through to the :class:`MIMEBase` constructor. + + +.. class:: MIMEMessage(_msg[, _subtype]) + + Module: :mod:`email.mime.message` + + A subclass of :class:`MIMENonMultipart`, the :class:`MIMEMessage` class is used + to create MIME objects of main type :mimetype:`message`. *_msg* is used as the + payload, and must be an instance of class :class:`Message` (or a subclass + thereof), otherwise a :exc:`TypeError` is raised. + + Optional *_subtype* sets the subtype of the message; it defaults to + :mimetype:`rfc822`. + + +.. class:: MIMEText(_text[, _subtype[, _charset]]) + + Module: :mod:`email.mime.text` + + A subclass of :class:`MIMENonMultipart`, the :class:`MIMEText` class is used to + create MIME objects of major type :mimetype:`text`. *_text* is the string for + the payload. *_subtype* is the minor type and defaults to :mimetype:`plain`. + *_charset* is the character set of the text and is passed as a parameter to the + :class:`MIMENonMultipart` constructor; it defaults to ``us-ascii``. 
No guessing + or encoding is performed on the text data. + + .. versionchanged:: 2.4 + The previously deprecated *_encoding* argument has been removed. Encoding + happens implicitly based on the *_charset* argument. + diff --git a/Doc/library/email.parser.rst b/Doc/library/email.parser.rst new file mode 100644 index 0000000..048ed22 --- /dev/null +++ b/Doc/library/email.parser.rst @@ -0,0 +1,220 @@ +:mod:`email`: Parsing email messages +------------------------------------ + +.. module:: email.parser + :synopsis: Parse flat text email messages to produce a message object structure. + + +Message object structures can be created in one of two ways: they can be created +from whole cloth by instantiating :class:`Message` objects and stringing them +together via :meth:`attach` and :meth:`set_payload` calls, or they can be +created by parsing a flat text representation of the email message. + +The :mod:`email` package provides a standard parser that understands most email +document structures, including MIME documents. You can pass the parser a string +or a file object, and the parser will return to you the root :class:`Message` +instance of the object structure. For simple, non-MIME messages the payload of +this root object will likely be a string containing the text of the message. +For MIME messages, the root object will return ``True`` from its +:meth:`is_multipart` method, and the subparts can be accessed via the +:meth:`get_payload` and :meth:`walk` methods. + +There are actually two parser interfaces available for use, the classic +:class:`Parser` API and the incremental :class:`FeedParser` API. The classic +:class:`Parser` API is fine if you have the entire text of the message in memory +as a string, or if the entire message lives in a file on the file system. +:class:`FeedParser` is more appropriate for when you're reading the message from +a stream which might block waiting for more input (e.g. reading an email message +from a socket). 
The :class:`FeedParser` can consume and parse the message +incrementally, and only returns the root object when you close the parser [#]_. + +Note that the parser can be extended in limited ways, and of course you can +implement your own parser completely from scratch. There is no magical +connection between the :mod:`email` package's bundled parser and the +:class:`Message` class, so your custom parser can create message object trees +any way it finds necessary. + + +FeedParser API +^^^^^^^^^^^^^^ + +.. versionadded:: 2.4 + +The :class:`FeedParser`, imported from the :mod:`email.feedparser` module, +provides an API that is conducive to incremental parsing of email messages, such +as would be necessary when reading the text of an email message from a source +that can block (e.g. a socket). The :class:`FeedParser` can of course be used +to parse an email message fully contained in a string or a file, but the classic +:class:`Parser` API may be more convenient for such use cases. The semantics +and results of the two parser APIs are identical. + +The :class:`FeedParser`'s API is simple; you create an instance, feed it a bunch +of text until there's no more to feed it, then close the parser to retrieve the +root message object. The :class:`FeedParser` is extremely accurate when parsing +standards-compliant messages, and it does a very good job of parsing +non-compliant messages, providing information about how a message was deemed +broken. It will populate a message object's *defects* attribute with a list of +any problems it found in a message. See the :mod:`email.errors` module for the +list of defects that it can find. + +Here is the API for the :class:`FeedParser`: + + +.. class:: FeedParser([_factory]) + + Create a :class:`FeedParser` instance. Optional *_factory* is a no-argument + callable that will be called whenever a new message object is needed. It + defaults to the :class:`email.message.Message` class. + + +.. 
method:: FeedParser.feed(data) + + Feed the :class:`FeedParser` some more data. *data* should be a string + containing one or more lines. The lines can be partial and the + :class:`FeedParser` will stitch such partial lines together properly. The lines + in the string can have any of the common three line endings, carriage return, + newline, or carriage return and newline (they can even be mixed). + + +.. method:: FeedParser.close() + + Closing a :class:`FeedParser` completes the parsing of all previously fed data, + and returns the root message object. It is undefined what happens if you feed + more data to a closed :class:`FeedParser`. + + +Parser class API +^^^^^^^^^^^^^^^^ + +The :class:`Parser` class, imported from the :mod:`email.parser` module, +provides an API that can be used to parse a message when the complete contents +of the message are available in a string or file. The :mod:`email.parser` +module also provides a second class, called :class:`HeaderParser` which can be +used if you're only interested in the headers of the message. +:class:`HeaderParser` can be much faster in these situations, since it does not +attempt to parse the message body, instead setting the payload to the raw body +as a string. :class:`HeaderParser` has the same API as the :class:`Parser` +class. + + +.. class:: Parser([_class]) + + The constructor for the :class:`Parser` class takes an optional argument + *_class*. This must be a callable factory (such as a function or a class), and + it is used whenever a sub-message object needs to be created. It defaults to + :class:`Message` (see :mod:`email.message`). The factory will be called without + arguments. + + The optional *strict* flag is ignored. + + .. deprecated:: 2.4 + Because the :class:`Parser` class is a backward compatible API wrapper + around the new-in-Python 2.4 :class:`FeedParser`, *all* parsing is + effectively non-strict. You should simply stop passing a *strict* flag to + the :class:`Parser` constructor. + + .. 
versionchanged:: 2.2.2 + The *strict* flag was added. + + .. versionchanged:: 2.4 + The *strict* flag was deprecated. + +The other public :class:`Parser` methods are: + + +.. method:: Parser.parse(fp[, headersonly]) + + Read all the data from the file-like object *fp*, parse the resulting text, and + return the root message object. *fp* must support both the :meth:`readline` and + the :meth:`read` methods on file-like objects. + + The text contained in *fp* must be formatted as a block of :rfc:`2822` style + headers and header continuation lines, optionally preceded by an envelope + header. The header block is terminated either by the end of the data or by a + blank line. Following the header block is the body of the message (which may + contain MIME-encoded subparts). + + Optional *headersonly* is as with the :meth:`parsestr` method. + + .. versionchanged:: 2.2.2 + The *headersonly* flag was added. + + +.. method:: Parser.parsestr(text[, headersonly]) + + Similar to the :meth:`parse` method, except it takes a string object instead of + a file-like object. Calling this method on a string is exactly equivalent to + wrapping *text* in a :class:`StringIO` instance first and calling :meth:`parse`. + + Optional *headersonly* is a flag specifying whether to stop parsing after + reading the headers or not. The default is ``False``, meaning it parses the + entire contents of the file. + + .. versionchanged:: 2.2.2 + The *headersonly* flag was added. + +Since creating a message object structure from a string or a file object is such +a common task, two functions are provided as a convenience. They are available +in the top-level :mod:`email` package namespace. + + +.. function:: message_from_string(s[, _class[, strict]]) + + Return a message object structure from a string. This is exactly equivalent to + ``Parser().parsestr(s)``. Optional *_class* and *strict* are interpreted as + with the :class:`Parser` class constructor. + + ..
versionchanged:: 2.2.2 + The *strict* flag was added. + + +.. function:: message_from_file(fp[, _class[, strict]]) + + Return a message object structure tree from an open file object. This is + exactly equivalent to ``Parser().parse(fp)``. Optional *_class* and *strict* + are interpreted as with the :class:`Parser` class constructor. + + .. versionchanged:: 2.2.2 + The *strict* flag was added. + +Here's an example of how you might use this at an interactive Python prompt:: + + >>> import email + >>> msg = email.message_from_string(myString) + + +Additional notes +^^^^^^^^^^^^^^^^ + +Here are some notes on the parsing semantics: + +* Most non-\ :mimetype:`multipart` type messages are parsed as a single message + object with a string payload. These objects will return ``False`` for + :meth:`is_multipart`. Their :meth:`get_payload` method will return a string + object. + +* All :mimetype:`multipart` type messages will be parsed as a container message + object with a list of sub-message objects for their payload. The outer + container message will return ``True`` for :meth:`is_multipart` and its + :meth:`get_payload` method will return the list of :class:`Message` subparts. + +* Most messages with a content type of :mimetype:`message/\*` (e.g. + :mimetype:`message/delivery-status` and :mimetype:`message/rfc822`) will also be + parsed as a container object containing a list payload of length 1. Their + :meth:`is_multipart` method will return ``True``. The single element in the + list payload will be a sub-message object. + +* Some non-standards compliant messages may not be internally consistent about + their :mimetype:`multipart`\ -edness. Such messages may have a + :mailheader:`Content-Type` header of type :mimetype:`multipart`, but their + :meth:`is_multipart` method may return ``False``.
If such messages were parsed + with the :class:`FeedParser`, they will have an instance of the + :class:`MultipartInvariantViolationDefect` class in their *defects* attribute + list. See :mod:`email.errors` for details. + +.. rubric:: Footnotes + +.. [#] As of email package version 3.0, introduced in Python 2.4, the classic + :class:`Parser` was re-implemented in terms of the :class:`FeedParser`, so the + semantics and results are identical between the two parsers. + diff --git a/Doc/library/email.rst b/Doc/library/email.rst new file mode 100644 index 0000000..212c321 --- /dev/null +++ b/Doc/library/email.rst @@ -0,0 +1,324 @@ +.. % Copyright (C) 2001-2007 Python Software Foundation +.. % Author: barry@python.org (Barry Warsaw) + + +:mod:`email` --- An email and MIME handling package +=================================================== + +.. module:: email + :synopsis: Package supporting the parsing, manipulating, and generating email messages, + including MIME documents. +.. moduleauthor:: Barry A. Warsaw <barry@python.org> +.. sectionauthor:: Barry A. Warsaw <barry@python.org> + + +.. versionadded:: 2.2 + +The :mod:`email` package is a library for managing email messages, including +MIME and other :rfc:`2822`\ -based message documents. It subsumes most of the +functionality in several older standard modules such as :mod:`rfc822`, +:mod:`mimetools`, :mod:`multifile`, and other non-standard packages such as +:mod:`mimecntl`. It is specifically *not* designed to do any sending of email +messages to SMTP (:rfc:`2821`), NNTP, or other servers; those are functions of +modules such as :mod:`smtplib` and :mod:`nntplib`. The :mod:`email` package +attempts to be as RFC-compliant as possible, supporting in addition to +:rfc:`2822`, such MIME-related RFCs as :rfc:`2045`, :rfc:`2046`, :rfc:`2047`, +and :rfc:`2231`. 
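As a rough orientation for the scope described above, here is a minimal sketch of reading a message from flat text, changing it through the object model, and turning it back into text. It assumes a current Python 3 interpreter; the addresses, the ``X-Reviewed`` header and the variable names are invented for illustration. Sending the result is deliberately left out, since that is the job of modules such as :mod:`smtplib`.

```python
import email

# A small RFC 2822 style message as flat text (illustrative content only).
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: hello\n"
    "\n"
    "Body text.\n"
)

# Parse the flat text into a Message object.
msg = email.message_from_string(raw)

# Manipulate the object model: replace one header, add another.
msg.replace_header('Subject', 'hello again')
msg['X-Reviewed'] = 'yes'

# Render the object back into flat text.
flat = msg.as_string()
```

Note that assigning to ``msg['Subject']`` would add a second :mailheader:`Subject` header rather than replace the first, which is why the sketch uses ``replace_header()``.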
+ +The primary distinguishing feature of the :mod:`email` package is that it splits +the parsing and generating of email messages from the internal *object model* +representation of email. Applications using the :mod:`email` package deal +primarily with objects; you can add sub-objects to messages, remove sub-objects +from messages, completely re-arrange the contents, etc. There is a separate +parser and a separate generator which handles the transformation from flat text +to the object model, and then back to flat text again. There are also handy +subclasses for some common MIME object types, and a few miscellaneous utilities +that help with such common tasks as extracting and parsing message field values, +creating RFC-compliant dates, etc. + +The following sections describe the functionality of the :mod:`email` package. +The ordering follows a progression that should be common in applications: an +email message is read as flat text from a file or other source, the text is +parsed to produce the object structure of the email message, this structure is +manipulated, and finally, the object tree is rendered back into flat text. + +It is perfectly feasible to create the object structure out of whole cloth --- +i.e. completely from scratch. From there, a similar progression can be taken as +above. + +Also included are detailed specifications of all the classes and modules that +the :mod:`email` package provides, the exception classes you might encounter +while using the :mod:`email` package, some auxiliary utilities, and a few +examples. For users of the older :mod:`mimelib` package, or previous versions +of the :mod:`email` package, a section on differences and porting is provided. + +Contents of the :mod:`email` package documentation: + +.. 
toctree:: + + email.message.rst + email.parser.rst + email.generator.rst + email.mime.rst + email.header.rst + email.charset.rst + email.encoders.rst + email.errors.rst + email.util.rst + email.iterators.rst + email-examples.rst + + +.. seealso:: + + Module :mod:`smtplib` + SMTP protocol client + + Module :mod:`nntplib` + NNTP protocol client + + +.. _email-pkg-history: + +Package History +--------------- + +This table describes the release history of the email package, corresponding to +the version of Python that the package was released with. For purposes of this +document, when you see a note about change or added versions, these refer to the +Python version the change was made in, *not* the email package version. This +table also describes the Python compatibility of each version of the package. + ++---------------+------------------------------+-----------------------+ +| email version | distributed with | compatible with | ++===============+==============================+=======================+ +| :const:`1.x` | Python 2.2.0 to Python 2.2.1 | *no longer supported* | ++---------------+------------------------------+-----------------------+ +| :const:`2.5` | Python 2.2.2+ and Python 2.3 | Python 2.1 to 2.5 | ++---------------+------------------------------+-----------------------+ +| :const:`3.0` | Python 2.4 | Python 2.3 to 2.5 | ++---------------+------------------------------+-----------------------+ +| :const:`4.0` | Python 2.5 | Python 2.3 to 2.5 | ++---------------+------------------------------+-----------------------+ + +Here are the major differences between :mod:`email` version 4 and version 3: + +* All modules have been renamed according to :pep:`8` standards. For example, + the version 3 module :mod:`email.Message` was renamed to :mod:`email.message` in + version 4. + +* A new subpackage :mod:`email.mime` was added and all the version 3 + :mod:`email.MIME\*` modules were renamed and situated into the :mod:`email.mime` + subpackage. 
For example, the version 3 module :mod:`email.MIMEText` was renamed + to :mod:`email.mime.text`. + + *Note that the version 3 names will continue to work until Python 2.6*. + +* The :mod:`email.mime.application` module was added, which contains the + :class:`MIMEApplication` class. + +* Methods that were deprecated in version 3 have been removed. These include + :meth:`Generator.__call__`, :meth:`Message.get_type`, + :meth:`Message.get_main_type`, and :meth:`Message.get_subtype`. + +* Fixes have been added for :rfc:`2231` support which can change some of the + return types for :func:`Message.get_param` and friends. Under some + circumstances, values which used to return a 3-tuple now return simple strings + (specifically, if all extended parameter segments were unencoded, there is no + language and charset designation expected, so the return type is now a simple + string). Also, %-decoding used to be done for both encoded and unencoded + segments; this decoding is now done only for encoded segments. + +Here are the major differences between :mod:`email` version 3 and version 2: + +* The :class:`FeedParser` class was introduced, and the :class:`Parser` class + was implemented in terms of the :class:`FeedParser`. All parsing therefore is + non-strict, and parsing will make a best effort never to raise an exception. + Problems found while parsing messages are stored in the message's *defects* + attribute. + +* All aspects of the API which raised :exc:`DeprecationWarning`\ s in version 2 + have been removed. These include the *_encoder* argument to the + :class:`MIMEText` constructor, the :meth:`Message.add_payload` method, the + :func:`Utils.dump_address_pair` function, and the functions :func:`Utils.decode` + and :func:`Utils.encode`. + +* New :exc:`DeprecationWarning`\ s have been added to: + :meth:`Generator.__call__`, :meth:`Message.get_type`, + :meth:`Message.get_main_type`, :meth:`Message.get_subtype`, and the *strict* + argument to the :class:`Parser` class.
These are expected to be removed in + future versions. + +* Support for Pythons earlier than 2.3 has been removed. + +Here are the differences between :mod:`email` version 2 and version 1: + +* The :mod:`email.Header` and :mod:`email.Charset` modules have been added. + +* The pickle format for :class:`Message` instances has changed. Since this was + never (and still isn't) formally defined, this isn't considered a backward + incompatibility. However if your application pickles and unpickles + :class:`Message` instances, be aware that in :mod:`email` version 2, + :class:`Message` instances now have private variables *_charset* and + *_default_type*. + +* Several methods in the :class:`Message` class have been deprecated, or their + signatures changed. Also, many new methods have been added. See the + documentation for the :class:`Message` class for details. The changes should be + completely backward compatible. + +* The object structure has changed in the face of :mimetype:`message/rfc822` + content types. In :mod:`email` version 1, such a type would be represented by a + scalar payload, i.e. the container message's :meth:`is_multipart` returned + false, :meth:`get_payload` was not a list object, but a single :class:`Message` + instance. + + This structure was inconsistent with the rest of the package, so the object + representation for :mimetype:`message/rfc822` content types was changed. In + :mod:`email` version 2, the container *does* return ``True`` from + :meth:`is_multipart`, and :meth:`get_payload` returns a list containing a single + :class:`Message` item. + + Note that this is one place that backward compatibility could not be completely + maintained. However, if you're already testing the return type of + :meth:`get_payload`, you should be fine. You just need to make sure your code + doesn't do a :meth:`set_payload` with a :class:`Message` instance on a container + with a content type of :mimetype:`message/rfc822`. 
+ +* The :class:`Parser` constructor's *strict* argument was added, and its + :meth:`parse` and :meth:`parsestr` methods grew a *headersonly* argument. The + *strict* flag was also added to functions :func:`email.message_from_file` and + :func:`email.message_from_string`. + +* :meth:`Generator.__call__` is deprecated; use :meth:`Generator.flatten` + instead. The :class:`Generator` class has also grown the :meth:`clone` method. + +* The :class:`DecodedGenerator` class in the :mod:`email.Generator` module was + added. + +* The intermediate base classes :class:`MIMENonMultipart` and + :class:`MIMEMultipart` have been added, and interposed in the class hierarchy + for most of the other MIME-related derived classes. + +* The *_encoder* argument to the :class:`MIMEText` constructor has been + deprecated. Encoding now happens implicitly based on the *_charset* argument. + +* The following functions in the :mod:`email.Utils` module have been deprecated: + :func:`dump_address_pair`, :func:`decode`, and :func:`encode`. The following + functions have been added to the module: :func:`make_msgid`, + :func:`decode_rfc2231`, :func:`encode_rfc2231`, and :func:`decode_params`. + +* The non-public function :func:`email.Iterators._structure` was added. + + +Differences from :mod:`mimelib` +------------------------------- + +The :mod:`email` package was originally prototyped as a separate library called +`mimelib <http://mimelib.sf.net/>`_. Changes have been made so that method names +are more consistent, and some methods or modules have either been added or +removed. The semantics of some of the methods have also changed. For the most +part, any functionality available in :mod:`mimelib` is still available in the +:mod:`email` package, albeit often in a different way. Backward compatibility +between the :mod:`mimelib` package and the :mod:`email` package was not a +priority.
+ +Here is a brief description of the differences between the :mod:`mimelib` and +the :mod:`email` packages, along with hints on how to port your applications. + +Of course, the most visible difference between the two packages is that the +package name has been changed to :mod:`email`. In addition, the top-level +package has the following differences: + +* :func:`messageFromString` has been renamed to :func:`message_from_string`. + +* :func:`messageFromFile` has been renamed to :func:`message_from_file`. + +The :class:`Message` class has the following differences: + +* The method :meth:`asString` was renamed to :meth:`as_string`. + +* The method :meth:`ismultipart` was renamed to :meth:`is_multipart`. + +* The :meth:`get_payload` method has grown a *decode* optional argument. + +* The method :meth:`getall` was renamed to :meth:`get_all`. + +* The method :meth:`addheader` was renamed to :meth:`add_header`. + +* The method :meth:`gettype` was renamed to :meth:`get_type`. + +* The method :meth:`getmaintype` was renamed to :meth:`get_main_type`. + +* The method :meth:`getsubtype` was renamed to :meth:`get_subtype`. + +* The method :meth:`getparams` was renamed to :meth:`get_params`. Also, whereas + :meth:`getparams` returned a list of strings, :meth:`get_params` returns a list + of 2-tuples, effectively the key/value pairs of the parameters, split on the + ``'='`` sign. + +* The method :meth:`getparam` was renamed to :meth:`get_param`. + +* The method :meth:`getcharsets` was renamed to :meth:`get_charsets`. + +* The method :meth:`getfilename` was renamed to :meth:`get_filename`. + +* The method :meth:`getboundary` was renamed to :meth:`get_boundary`. + +* The method :meth:`setboundary` was renamed to :meth:`set_boundary`. + +* The method :meth:`getdecodedpayload` was removed. To get similar + functionality, pass the value 1 to the *decode* flag of the get_payload() + method. + +* The method :meth:`getpayloadastext` was removed. 
Similar functionality is + supported by the :class:`DecodedGenerator` class in the :mod:`email.generator` + module. + +* The method :meth:`getbodyastext` was removed. You can get similar + functionality by creating an iterator with :func:`typed_subpart_iterator` in the + :mod:`email.iterators` module. + +The :class:`Parser` class has no differences in its public interface. It does +have some additional smarts to recognize :mimetype:`message/delivery-status` +type messages, which it represents as a :class:`Message` instance containing +separate :class:`Message` subparts for each header block in the delivery status +notification [#]_. + +The :class:`Generator` class has no differences in its public interface. There +is a new class in the :mod:`email.generator` module though, called +:class:`DecodedGenerator` which provides most of the functionality previously +available in the :meth:`Message.getpayloadastext` method. + +The following modules and classes have been changed: + +* The :class:`MIMEBase` class constructor arguments *_major* and *_minor* have + changed to *_maintype* and *_subtype* respectively. + +* The ``Image`` class/module has been renamed to ``MIMEImage``. The *_minor* + argument has been renamed to *_subtype*. + +* The ``Text`` class/module has been renamed to ``MIMEText``. The *_minor* + argument has been renamed to *_subtype*. + +* The ``MessageRFC822`` class/module has been renamed to ``MIMEMessage``. Note + that an earlier version of :mod:`mimelib` called this class/module ``RFC822``, + but that clashed with the Python standard library module :mod:`rfc822` on some + case-insensitive file systems. + + Also, the :class:`MIMEMessage` class now represents any kind of MIME message + with main type :mimetype:`message`. It takes an optional argument *_subtype* + which is used to set the MIME subtype. *_subtype* defaults to + :mimetype:`rfc822`. + +:mod:`mimelib` provided some utility functions in its :mod:`address` and +:mod:`date` modules. 
All of these functions have been moved to the +:mod:`email.utils` module. + +The ``MsgReader`` class/module has been removed. Its functionality is most +closely supported in the :func:`body_line_iterator` function in the +:mod:`email.iterators` module. + +.. rubric:: Footnotes + +.. [#] Delivery Status Notifications (DSN) are defined in :rfc:`1894`. diff --git a/Doc/library/email.util.rst b/Doc/library/email.util.rst new file mode 100644 index 0000000..aa67885 --- /dev/null +++ b/Doc/library/email.util.rst @@ -0,0 +1,166 @@ +:mod:`email`: Miscellaneous utilities +------------------------------------- + +.. module:: email.utils + :synopsis: Miscellaneous email package utilities. + + +There are several useful utilities provided in the :mod:`email.utils` module: + + +.. function:: quote(str) + + Return a new string with backslashes in *str* replaced by two backslashes, and + double quotes replaced by backslash-double quote. + + +.. function:: unquote(str) + + Return a new string which is an *unquoted* version of *str*. If *str* ends and + begins with double quotes, they are stripped off. Likewise if *str* ends and + begins with angle brackets, they are stripped off. + + +.. function:: parseaddr(address) + + Parse address -- which should be the value of some address-containing field such + as :mailheader:`To` or :mailheader:`Cc` -- into its constituent *realname* and + *email address* parts. Returns a tuple of that information, unless the parse + fails, in which case a 2-tuple of ``('', '')`` is returned. + + +.. function:: formataddr(pair) + + The inverse of :meth:`parseaddr`, this takes a 2-tuple of the form ``(realname, + email_address)`` and returns the string value suitable for a :mailheader:`To` or + :mailheader:`Cc` header. If the first element of *pair* is false, then the + second element is returned unmodified. + + +.. function:: getaddresses(fieldvalues) + + This method returns a list of 2-tuples of the form returned by ``parseaddr()``. 
+ *fieldvalues* is a sequence of header field values as might be returned by + :meth:`Message.get_all`. Here's a simple example that gets all the recipients + of a message:: + + from email.utils import getaddresses + + tos = msg.get_all('to', []) + ccs = msg.get_all('cc', []) + resent_tos = msg.get_all('resent-to', []) + resent_ccs = msg.get_all('resent-cc', []) + all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs) + + +.. function:: parsedate(date) + + Attempts to parse a date according to the rules in :rfc:`2822`. However, some + mailers don't follow that format as specified, so :func:`parsedate` tries to + guess correctly in such cases. *date* is a string containing an :rfc:`2822` + date, such as ``"Mon, 20 Nov 1995 19:12:08 -0500"``. If it succeeds in parsing + the date, :func:`parsedate` returns a 9-tuple that can be passed directly to + :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6, + 7, and 8 of the result tuple are not usable. + + +.. function:: parsedate_tz(date) + + Performs the same function as :func:`parsedate`, but returns either ``None`` or + a 10-tuple; the first 9 elements make up a tuple that can be passed directly to + :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC + (which is the official term for Greenwich Mean Time) [#]_. If the input string + has no timezone, the last element of the tuple returned is ``None``. Note that + indexes 6, 7, and 8 of the result tuple are not usable. + + +.. function:: mktime_tz(tuple) + + Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If + the timezone item in the tuple is ``None``, assume local time. Minor + deficiency: :func:`mktime_tz` interprets the first 8 elements of *tuple* as a + local time and then compensates for the timezone difference. This may yield a + slight error around changes in daylight saving time, though not one worth worrying + about for common use. + + +.. 
function:: formatdate([timeval[, localtime][, usegmt]]) + + Returns a date string as per :rfc:`2822`, e.g.:: + + Fri, 09 Nov 2001 01:08:47 -0000 + + Optional *timeval*, if given, is a floating point time value as accepted by + :func:`time.gmtime` and :func:`time.localtime`; otherwise the current time is + used. + + Optional *localtime* is a flag that, when ``True``, interprets *timeval* and + returns a date relative to the local timezone instead of UTC, properly taking + daylight saving time into account. The default is ``False``, meaning UTC is + used. + + Optional *usegmt* is a flag that, when ``True``, outputs a date string with the + timezone as the ASCII string ``GMT``, rather than a numeric ``-0000``. This is + needed for some protocols (such as HTTP). This only applies when *localtime* is + ``False``. + + .. versionadded:: 2.4 + + +.. function:: make_msgid([idstring]) + + Returns a string suitable for an :rfc:`2822`\ -compliant + :mailheader:`Message-ID` header. Optional *idstring*, if given, is a string used + to strengthen the uniqueness of the message id. + + +.. function:: decode_rfc2231(s) + + Decode the string *s* according to :rfc:`2231`. + + +.. function:: encode_rfc2231(s[, charset[, language]]) + + Encode the string *s* according to :rfc:`2231`. Optional *charset* and + *language*, if given, are the character set name and language name to use. If + neither is given, *s* is returned as-is. If *charset* is given but *language* + is not, the string is encoded using the empty string for *language*. + + +.. function:: collapse_rfc2231_value(value[, errors[, fallback_charset]]) + + When a header parameter is encoded in :rfc:`2231` format, + :meth:`Message.get_param` may return a 3-tuple containing the character set, + language, and value. :func:`collapse_rfc2231_value` turns this into a unicode + string. Optional *errors* is passed to the *errors* argument of the built-in + :func:`unicode` function; it defaults to ``replace``. 
Optional + *fallback_charset* specifies the character set to use if the one in the + :rfc:`2231` header is not known by Python; it defaults to ``us-ascii``. + + For convenience, if the *value* passed to :func:`collapse_rfc2231_value` is not + a tuple, it should be a string and it is returned unquoted. + + +.. function:: decode_params(params) + + Decode parameters list according to :rfc:`2231`. *params* is a sequence of + 2-tuples containing elements of the form ``(content-type, string-value)``. + +.. versionchanged:: 2.4 + The :func:`dump_address_pair` function has been removed; use :func:`formataddr` + instead. + +.. versionchanged:: 2.4 + The :func:`decode` function has been removed; use the + :meth:`Header.decode_header` method instead. + +.. versionchanged:: 2.4 + The :func:`encode` function has been removed; use the :meth:`Header.encode` + method instead. + +.. rubric:: Footnotes + +.. [#] Note that the sign of the timezone offset is the opposite of the sign of the + ``time.timezone`` variable for the same timezone; the latter variable follows + the POSIX standard while this module follows :rfc:`2822`. + diff --git a/Doc/library/errno.rst b/Doc/library/errno.rst new file mode 100644 index 0000000..daf9ff0 --- /dev/null +++ b/Doc/library/errno.rst @@ -0,0 +1,636 @@ + +:mod:`errno` --- Standard errno system symbols +============================================== + +.. module:: errno + :synopsis: Standard errno system symbols. + + +This module makes available standard ``errno`` system symbols. The value of each +symbol is the corresponding integer value. The names and descriptions are +borrowed from :file:`linux/include/errno.h`, which should be pretty +all-inclusive. + + +.. data:: errorcode + + Dictionary providing a mapping from the errno value to the string name in the + underlying system. For instance, ``errno.errorcode[errno.EPERM]`` maps to + ``'EPERM'``. + +To translate a numeric error code to an error message, use :func:`os.strerror`. 
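As a small illustration of the two lookups described above, the following sketch maps an error number to its symbolic name through ``errno.errorcode`` and to a message through :func:`os.strerror`. It assumes a modern Python 3 interpreter, and the helper name ``describe_errno`` is hypothetical, not part of the :mod:`errno` module.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value.

    describe_errno is a hypothetical helper used only for illustration.
    """
    # errorcode maps the integer value to its symbolic name; symbols that
    # are not defined on the current platform are absent from the dict.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror translates the integer into a platform-specific message.
    return name, os.strerror(code)


name, message = describe_errno(errno.ENOENT)
print(name, "-", message)
```

The message text comes from the underlying platform, so it may differ between systems; only the symbolic name is stable.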
+ +Of the following list, symbols that are not used on the current platform are not +defined by the module. The specific list of defined symbols is available as +``errno.errorcode.keys()``. Symbols available can include: + + +.. data:: EPERM + + Operation not permitted + + +.. data:: ENOENT + + No such file or directory + + +.. data:: ESRCH + + No such process + + +.. data:: EINTR + + Interrupted system call + + +.. data:: EIO + + I/O error + + +.. data:: ENXIO + + No such device or address + + +.. data:: E2BIG + + Arg list too long + + +.. data:: ENOEXEC + + Exec format error + + +.. data:: EBADF + + Bad file number + + +.. data:: ECHILD + + No child processes + + +.. data:: EAGAIN + + Try again + + +.. data:: ENOMEM + + Out of memory + + +.. data:: EACCES + + Permission denied + + +.. data:: EFAULT + + Bad address + + +.. data:: ENOTBLK + + Block device required + + +.. data:: EBUSY + + Device or resource busy + + +.. data:: EEXIST + + File exists + + +.. data:: EXDEV + + Cross-device link + + +.. data:: ENODEV + + No such device + + +.. data:: ENOTDIR + + Not a directory + + +.. data:: EISDIR + + Is a directory + + +.. data:: EINVAL + + Invalid argument + + +.. data:: ENFILE + + File table overflow + + +.. data:: EMFILE + + Too many open files + + +.. data:: ENOTTY + + Not a typewriter + + +.. data:: ETXTBSY + + Text file busy + + +.. data:: EFBIG + + File too large + + +.. data:: ENOSPC + + No space left on device + + +.. data:: ESPIPE + + Illegal seek + + +.. data:: EROFS + + Read-only file system + + +.. data:: EMLINK + + Too many links + + +.. data:: EPIPE + + Broken pipe + + +.. data:: EDOM + + Math argument out of domain of func + + +.. data:: ERANGE + + Math result not representable + + +.. data:: EDEADLK + + Resource deadlock would occur + + +.. data:: ENAMETOOLONG + + File name too long + + +.. data:: ENOLCK + + No record locks available + + +.. data:: ENOSYS + + Function not implemented + + +.. data:: ENOTEMPTY + + Directory not empty + + +.. 
data:: ELOOP + + Too many symbolic links encountered + + +.. data:: EWOULDBLOCK + + Operation would block + + +.. data:: ENOMSG + + No message of desired type + + +.. data:: EIDRM + + Identifier removed + + +.. data:: ECHRNG + + Channel number out of range + + +.. data:: EL2NSYNC + + Level 2 not synchronized + + +.. data:: EL3HLT + + Level 3 halted + + +.. data:: EL3RST + + Level 3 reset + + +.. data:: ELNRNG + + Link number out of range + + +.. data:: EUNATCH + + Protocol driver not attached + + +.. data:: ENOCSI + + No CSI structure available + + +.. data:: EL2HLT + + Level 2 halted + + +.. data:: EBADE + + Invalid exchange + + +.. data:: EBADR + + Invalid request descriptor + + +.. data:: EXFULL + + Exchange full + + +.. data:: ENOANO + + No anode + + +.. data:: EBADRQC + + Invalid request code + + +.. data:: EBADSLT + + Invalid slot + + +.. data:: EDEADLOCK + + File locking deadlock error + + +.. data:: EBFONT + + Bad font file format + + +.. data:: ENOSTR + + Device not a stream + + +.. data:: ENODATA + + No data available + + +.. data:: ETIME + + Timer expired + + +.. data:: ENOSR + + Out of streams resources + + +.. data:: ENONET + + Machine is not on the network + + +.. data:: ENOPKG + + Package not installed + + +.. data:: EREMOTE + + Object is remote + + +.. data:: ENOLINK + + Link has been severed + + +.. data:: EADV + + Advertise error + + +.. data:: ESRMNT + + Srmount error + + +.. data:: ECOMM + + Communication error on send + + +.. data:: EPROTO + + Protocol error + + +.. data:: EMULTIHOP + + Multihop attempted + + +.. data:: EDOTDOT + + RFS specific error + + +.. data:: EBADMSG + + Not a data message + + +.. data:: EOVERFLOW + + Value too large for defined data type + + +.. data:: ENOTUNIQ + + Name not unique on network + + +.. data:: EBADFD + + File descriptor in bad state + + +.. data:: EREMCHG + + Remote address changed + + +.. data:: ELIBACC + + Can not access a needed shared library + + +.. 
data:: ELIBBAD + + Accessing a corrupted shared library + + +.. data:: ELIBSCN + + .lib section in a.out corrupted + + +.. data:: ELIBMAX + + Attempting to link in too many shared libraries + + +.. data:: ELIBEXEC + + Cannot exec a shared library directly + + +.. data:: EILSEQ + + Illegal byte sequence + + +.. data:: ERESTART + + Interrupted system call should be restarted + + +.. data:: ESTRPIPE + + Streams pipe error + + +.. data:: EUSERS + + Too many users + + +.. data:: ENOTSOCK + + Socket operation on non-socket + + +.. data:: EDESTADDRREQ + + Destination address required + + +.. data:: EMSGSIZE + + Message too long + + +.. data:: EPROTOTYPE + + Protocol wrong type for socket + + +.. data:: ENOPROTOOPT + + Protocol not available + + +.. data:: EPROTONOSUPPORT + + Protocol not supported + + +.. data:: ESOCKTNOSUPPORT + + Socket type not supported + + +.. data:: EOPNOTSUPP + + Operation not supported on transport endpoint + + +.. data:: EPFNOSUPPORT + + Protocol family not supported + + +.. data:: EAFNOSUPPORT + + Address family not supported by protocol + + +.. data:: EADDRINUSE + + Address already in use + + +.. data:: EADDRNOTAVAIL + + Cannot assign requested address + + +.. data:: ENETDOWN + + Network is down + + +.. data:: ENETUNREACH + + Network is unreachable + + +.. data:: ENETRESET + + Network dropped connection because of reset + + +.. data:: ECONNABORTED + + Software caused connection abort + + +.. data:: ECONNRESET + + Connection reset by peer + + +.. data:: ENOBUFS + + No buffer space available + + +.. data:: EISCONN + + Transport endpoint is already connected + + +.. data:: ENOTCONN + + Transport endpoint is not connected + + +.. data:: ESHUTDOWN + + Cannot send after transport endpoint shutdown + + +.. data:: ETOOMANYREFS + + Too many references: cannot splice + + +.. data:: ETIMEDOUT + + Connection timed out + + +.. data:: ECONNREFUSED + + Connection refused + + +.. data:: EHOSTDOWN + + Host is down + + +.. 
data:: EHOSTUNREACH + + No route to host + + +.. data:: EALREADY + + Operation already in progress + + +.. data:: EINPROGRESS + + Operation now in progress + + +.. data:: ESTALE + + Stale NFS file handle + + +.. data:: EUCLEAN + + Structure needs cleaning + + +.. data:: ENOTNAM + + Not a XENIX named type file + + +.. data:: ENAVAIL + + No XENIX semaphores available + + +.. data:: EISNAM + + Is a named type file + + +.. data:: EREMOTEIO + + Remote I/O error + + +.. data:: EDQUOT + + Quota exceeded + diff --git a/Doc/library/exceptions.rst b/Doc/library/exceptions.rst new file mode 100644 index 0000000..d6a64fc --- /dev/null +++ b/Doc/library/exceptions.rst @@ -0,0 +1,475 @@ +.. _bltin-exceptions: + +Built-in Exceptions +=================== + +.. module:: exceptions + :synopsis: Standard exception classes. + + +Exceptions should be class objects. The exceptions are defined in the module +:mod:`exceptions`. This module never needs to be imported explicitly: the +exceptions are provided in the built-in namespace as well as the +:mod:`exceptions` module. + +.. index:: + statement: try + statement: except + +For class exceptions, in a :keyword:`try` statement with an :keyword:`except` +clause that mentions a particular class, that clause also handles any exception +classes derived from that class (but not exception classes from which *it* is +derived). Two exception classes that are not related via subclassing are never +equivalent, even if they have the same name. + +.. index:: statement: raise + +The built-in exceptions listed below can be generated by the interpreter or +built-in functions. Except where mentioned, they have an "associated value" +indicating the detailed cause of the error. This may be a string or a tuple +containing several items of information (e.g., an error code and a string +explaining the code). The associated value is the second argument to the +:keyword:`raise` statement. 
If the exception class is derived from the standard +root class :exc:`BaseException`, the associated value is present as the +exception instance's :attr:`args` attribute. + +User code can raise built-in exceptions. This can be used to test an exception +handler or to report an error condition "just like" the situation in which the +interpreter raises the same exception; but beware that there is nothing to +prevent user code from raising an inappropriate error. + +The built-in exception classes can be sub-classed to define new exceptions; +programmers are encouraged to at least derive new exceptions from the +:exc:`Exception` class and not :exc:`BaseException`. More information on +defining exceptions is available in the Python Tutorial under +:ref:`tut-userexceptions`. + +The following exceptions are only used as base classes for other exceptions. + + +.. exception:: BaseException + + The base class for all built-in exceptions. It is not meant to be directly + inherited by user-defined classes (for that use :exc:`Exception`). If + :func:`str` or :func:`unicode` is called on an instance of this class, the + representation of the argument(s) to the instance is returned, or the empty + string when there were no arguments. All arguments are stored in :attr:`args` + as a tuple. + + .. versionadded:: 2.5 + + +.. exception:: Exception + + All built-in, non-system-exiting exceptions are derived from this class. All + user-defined exceptions should also be derived from this class. + + .. versionchanged:: 2.5 + Changed to inherit from :exc:`BaseException`. + + +.. exception:: ArithmeticError + + The base class for those built-in exceptions that are raised for various + arithmetic errors: :exc:`OverflowError`, :exc:`ZeroDivisionError`, + :exc:`FloatingPointError`. + + +.. exception:: LookupError + + The base class for the exceptions that are raised when a key or index used on a + mapping or sequence is invalid: :exc:`IndexError`, :exc:`KeyError`. This can be 
This can be + raised directly by :func:`sys.setdefaultencoding`. + + +.. exception:: EnvironmentError + + The base class for exceptions that can occur outside the Python system: + :exc:`IOError`, :exc:`OSError`. When exceptions of this type are created with a + 2-tuple, the first item is available on the instance's :attr:`errno` attribute + (it is assumed to be an error number), and the second item is available on the + :attr:`strerror` attribute (it is usually the associated error message). The + tuple itself is also available on the :attr:`args` attribute. + + .. versionadded:: 1.5.2 + + When an :exc:`EnvironmentError` exception is instantiated with a 3-tuple, the + first two items are available as above, while the third item is available on the + :attr:`filename` attribute. However, for backwards compatibility, the + :attr:`args` attribute contains only a 2-tuple of the first two constructor + arguments. + + The :attr:`filename` attribute is ``None`` when this exception is created with + other than 3 arguments. The :attr:`errno` and :attr:`strerror` attributes are + also ``None`` when the instance was created with other than 2 or 3 arguments. + In this last case, :attr:`args` contains the verbatim constructor arguments as a + tuple. + +The following exceptions are the exceptions that are actually raised. + + +.. exception:: AssertionError + + .. index:: statement: assert + + Raised when an :keyword:`assert` statement fails. + + +.. exception:: AttributeError + + Raised when an attribute reference or assignment fails. (When an object does + not support attribute references or attribute assignments at all, + :exc:`TypeError` is raised.) + + .. % xref to attribute reference? + + +.. exception:: EOFError + + Raised when attempting to read beyond the end of a file. (N.B.: the :meth:`read` + and :meth:`readline` methods of file objects return an empty string when they + hit EOF.) + + .. % XXXJH xrefs here + .. % XXXJH xrefs here + + +.. 
exception:: FloatingPointError + + Raised when a floating point operation fails. This exception is always defined, + but can only be raised when Python is configured with the + :option:`--with-fpectl` option, or the :const:`WANT_SIGFPE_HANDLER` symbol is + defined in the :file:`pyconfig.h` file. + + +.. exception:: GeneratorExit + + Raised when a generator's :meth:`close` method is called. + + .. versionadded:: 2.5 + + .. versionchanged:: 3.0 + Changed to inherit from Exception instead of StandardError. + + +.. exception:: IOError + + Raised when an I/O operation (such as a :keyword:`print` statement, the built-in + :func:`open` function or a method of a file object) fails for an I/O-related + reason, e.g., "file not found" or "disk full". + + .. % XXXJH xrefs here + + This class is derived from :exc:`EnvironmentError`. See the discussion above + for more information on exception instance attributes. + + +.. exception:: ImportError + + Raised when an :keyword:`import` statement fails to find the module definition + or when a ``from ... import`` fails to find a name that is to be imported. + + .. % XXXJH xref to import statement? + + +.. exception:: IndexError + + Raised when a sequence subscript is out of range. (Slice indices are silently + truncated to fall in the allowed range; if an index is not a plain integer, + :exc:`TypeError` is raised.) + + .. % XXXJH xref to sequences + + +.. exception:: KeyError + + Raised when a mapping (dictionary) key is not found in the set of existing keys. + + .. % XXXJH xref to mapping objects? + + +.. exception:: KeyboardInterrupt + + Raised when the user hits the interrupt key (normally :kbd:`Control-C` or + :kbd:`Delete`). During execution, a check for interrupts is made regularly. The + exception inherits from :exc:`BaseException` so as to not be accidentally caught + by code that catches :exc:`Exception` and thus prevent the interpreter from + exiting. + + .. % XXX(hylton) xrefs here + + .. 
versionchanged:: 2.5 + Changed to inherit from :exc:`BaseException`. + + +.. exception:: MemoryError + + Raised when an operation runs out of memory but the situation may still be + rescued (by deleting some objects). The associated value is a string indicating + what kind of (internal) operation ran out of memory. Note that because of the + underlying memory management architecture (C's :cfunc:`malloc` function), the + interpreter may not always be able to completely recover from this situation; it + nevertheless raises an exception so that a stack traceback can be printed, in + case a run-away program was the cause. + + +.. exception:: NameError + + Raised when a local or global name is not found. This applies only to + unqualified names. The associated value is an error message that includes the + name that could not be found. + + +.. exception:: NotImplementedError + + This exception is derived from :exc:`RuntimeError`. In user defined base + classes, abstract methods should raise this exception when they require derived + classes to override the method. + + .. versionadded:: 1.5.2 + + +.. exception:: OSError + + This class is derived from :exc:`EnvironmentError` and is used primarily as the + :mod:`os` module's ``os.error`` exception. See :exc:`EnvironmentError` above for + a description of the possible associated values. + + .. % xref for os module + + .. versionadded:: 1.5.2 + + +.. exception:: OverflowError + + Raised when the result of an arithmetic operation is too large to be + represented. This cannot occur for long integers (which would rather raise + :exc:`MemoryError` than give up). Because of the lack of standardization of + floating point exception handling in C, most floating point operations also + aren't checked. For plain integers, all operations that can overflow are + checked except left shift, where typical applications prefer to drop bits than + raise an exception. + + .. % XXXJH reference to long's and/or int's? + + +.. 
exception:: ReferenceError + + This exception is raised when a weak reference proxy, created by the + :func:`weakref.proxy` function, is used to access an attribute of the referent + after it has been garbage collected. For more information on weak references, + see the :mod:`weakref` module. + + .. versionadded:: 2.2 + Previously known as the :exc:`weakref.ReferenceError` exception. + + +.. exception:: RuntimeError + + Raised when an error is detected that doesn't fall in any of the other + categories. The associated value is a string indicating what precisely went + wrong. (This exception is mostly a relic from a previous version of the + interpreter; it is not used very much any more.) + + +.. exception:: StopIteration + + Raised by the built-in :func:`next` function and an iterator's :meth:`__next__` method to + signal that there are no further values. + + .. versionadded:: 2.2 + + .. versionchanged:: 3.0 + Changed to inherit from Exception instead of StandardError. + + +.. exception:: SyntaxError + + Raised when the parser encounters a syntax error. This may occur in an + :keyword:`import` statement, in a call to the built-in functions :func:`exec` + or :func:`eval`, or when reading the initial script or standard input + (also interactively). + + .. % XXXJH xref to these functions? + + Instances of this class have attributes :attr:`filename`, :attr:`lineno`, + :attr:`offset` and :attr:`text` for easier access to the details. :func:`str` + of the exception instance returns only the message. + + +.. exception:: SystemError + + Raised when the interpreter finds an internal error, but the situation does not + look so serious as to cause it to abandon all hope. The associated value is a + string indicating what went wrong (in low-level terms). + + You should report this to the author or maintainer of your Python interpreter. 
+ Be sure to report the version of the Python interpreter (``sys.version``; it is + also printed at the start of an interactive Python session), the exact error + message (the exception's associated value) and if possible the source of the + program that triggered the error. + + +.. exception:: SystemExit + + This exception is raised by the :func:`sys.exit` function. When it is not + handled, the Python interpreter exits; no stack traceback is printed. If the + associated value is a plain integer, it specifies the system exit status (passed + to C's :cfunc:`exit` function); if it is ``None``, the exit status is zero; if + it has another type (such as a string), the object's value is printed and the + exit status is one. + + .. % XXX(hylton) xref to module sys? + + Instances have an attribute :attr:`code` which is set to the proposed exit + status or error message (defaulting to ``None``). Also, this exception derives + directly from :exc:`BaseException` and not :exc:`Exception`, since it is not + technically an error. + + A call to :func:`sys.exit` is translated into an exception so that clean-up + handlers (:keyword:`finally` clauses of :keyword:`try` statements) can be + executed, and so that a debugger can execute a script without running the risk + of losing control. The :func:`os._exit` function can be used if it is + absolutely positively necessary to exit immediately (for example, in the child + process after a call to :func:`fork`). + + The exception inherits from :exc:`BaseException` instead of :exc:`Exception` so + that it is not accidentally caught by code that catches :exc:`Exception`. This + allows the exception to properly propagate up and cause the interpreter to exit. + + .. versionchanged:: 2.5 + Changed to inherit from :exc:`BaseException`. + + +.. exception:: TypeError + + Raised when an operation or function is applied to an object of inappropriate + type. The associated value is a string giving details about the type mismatch. + + +.. 
exception:: UnboundLocalError + + Raised when a reference is made to a local variable in a function or method, but + no value has been bound to that variable. This is a subclass of + :exc:`NameError`. + + .. versionadded:: 2.0 + + +.. exception:: UnicodeError + + Raised when a Unicode-related encoding or decoding error occurs. It is a + subclass of :exc:`ValueError`. + + .. versionadded:: 2.0 + + +.. exception:: UnicodeEncodeError + + Raised when a Unicode-related error occurs during encoding. It is a subclass of + :exc:`UnicodeError`. + + .. versionadded:: 2.3 + + +.. exception:: UnicodeDecodeError + + Raised when a Unicode-related error occurs during decoding. It is a subclass of + :exc:`UnicodeError`. + + .. versionadded:: 2.3 + + +.. exception:: UnicodeTranslateError + + Raised when a Unicode-related error occurs during translating. It is a subclass + of :exc:`UnicodeError`. + + .. versionadded:: 2.3 + + +.. exception:: ValueError + + Raised when a built-in operation or function receives an argument that has the + right type but an inappropriate value, and the situation is not described by a + more precise exception such as :exc:`IndexError`. + + +.. exception:: WindowsError + + Raised when a Windows-specific error occurs or when the error number does not + correspond to an :cdata:`errno` value. The :attr:`winerror` and + :attr:`strerror` values are created from the return values of the + :cfunc:`GetLastError` and :cfunc:`FormatMessage` functions from the Windows + Platform API. The :attr:`errno` value maps the :attr:`winerror` value to + corresponding ``errno.h`` values. This is a subclass of :exc:`OSError`. + + .. versionadded:: 2.0 + + .. versionchanged:: 2.5 + Previous versions put the :cfunc:`GetLastError` codes into :attr:`errno`. + + +.. exception:: ZeroDivisionError + + Raised when the second argument of a division or modulo operation is zero. The + associated value is a string indicating the type of the operands and the + operation. 
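The subclass-matching rule for :keyword:`except` clauses, and the split between :exc:`Exception` and :exc:`BaseException` described in the entries above, can be sketched as follows. This is only an illustration assuming a modern Python 3 interpreter; the function name ``classify`` is hypothetical.

```python
def classify(exc):
    """Raise *exc* and report which except clause handled it.

    classify is a hypothetical helper used only for illustration.
    """
    try:
        raise exc
    except LookupError:
        # Also handles KeyError and IndexError, which derive from LookupError.
        return "LookupError"
    except ArithmeticError:
        # Also handles ZeroDivisionError, OverflowError, FloatingPointError.
        return "ArithmeticError"
    except Exception:
        # Any other non-system-exiting exception.
        return "Exception"
    except BaseException:
        # SystemExit and KeyboardInterrupt do not derive from Exception,
        # so they are only caught here.
        return "BaseException"


print(classify(KeyError("missing")))
print(classify(ZeroDivisionError()))
print(classify(ValueError("bad value")))
print(classify(SystemExit(0)))
```

Clause order matters here: a clause naming a base class also handles its subclasses, so the more specific classes are listed first.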
+ +The following exceptions are used as warning categories; see the :mod:`warnings` +module for more information. + + +.. exception:: Warning + + Base class for warning categories. + + +.. exception:: UserWarning + + Base class for warnings generated by user code. + + +.. exception:: DeprecationWarning + + Base class for warnings about deprecated features. + + +.. exception:: PendingDeprecationWarning + + Base class for warnings about features which will be deprecated in the future. + + +.. exception:: SyntaxWarning + + Base class for warnings about dubious syntax. + + +.. exception:: RuntimeWarning + + Base class for warnings about dubious runtime behavior. + + +.. exception:: FutureWarning + + Base class for warnings about constructs that will change semantically in the + future. + + +.. exception:: ImportWarning + + Base class for warnings about probable mistakes in module imports. + + .. versionadded:: 2.5 + + +.. exception:: UnicodeWarning + + Base class for warnings related to Unicode. + + .. versionadded:: 2.5 + +The class hierarchy for built-in exceptions is: + + +.. literalinclude:: ../../Lib/test/exception_hierarchy.txt diff --git a/Doc/library/fcntl.rst b/Doc/library/fcntl.rst new file mode 100644 index 0000000..2d7bb9c --- /dev/null +++ b/Doc/library/fcntl.rst @@ -0,0 +1,155 @@ + +:mod:`fcntl` --- The :func:`fcntl` and :func:`ioctl` system calls +================================================================= + +.. module:: fcntl + :platform: Unix + :synopsis: The fcntl() and ioctl() system calls. +.. sectionauthor:: Jaap Vermeulen + + +.. index:: + pair: UNIX@Unix; file control + pair: UNIX@Unix; I/O control + +This module performs file control and I/O control on file descriptors. It is an +interface to the :cfunc:`fcntl` and :cfunc:`ioctl` Unix routines. + +All functions in this module take a file descriptor *fd* as their first +argument. 
This can be an integer file descriptor, such as returned by +``sys.stdin.fileno()``, or a file object, such as ``sys.stdin`` itself, which +provides a :meth:`fileno` which returns a genuine file descriptor. + +The module defines the following functions: + + +.. function:: fcntl(fd, op[, arg]) + + Perform the requested operation on file descriptor *fd* (file objects providing + a :meth:`fileno` method are accepted as well). The operation is defined by *op* + and is operating system dependent. These codes are also found in the + :mod:`fcntl` module. The argument *arg* is optional, and defaults to the integer + value ``0``. When present, it can either be an integer value, or a string. + With the argument missing or an integer value, the return value of this function + is the integer return value of the C :cfunc:`fcntl` call. When the argument is + a string it represents a binary structure, e.g. created by :func:`struct.pack`. + The binary data is copied to a buffer whose address is passed to the C + :cfunc:`fcntl` call. The return value after a successful call is the contents + of the buffer, converted to a string object. The length of the returned string + will be the same as the length of the *arg* argument. This is limited to 1024 + bytes. If the information returned in the buffer by the operating system is + larger than 1024 bytes, this is most likely to result in a segmentation + violation or a more subtle data corruption. + + If the :cfunc:`fcntl` fails, an :exc:`IOError` is raised. + + +.. function:: ioctl(fd, op[, arg[, mutate_flag]]) + + This function is identical to the :func:`fcntl` function, except that the + operations are typically defined in the library module :mod:`termios` and the + argument handling is even more complicated. 
+ + The parameter *arg* can be one of an integer, absent (treated identically to the + integer ``0``), an object supporting the read-only buffer interface (most likely + a plain Python string) or an object supporting the read-write buffer interface. + + In all but the last case, behaviour is as for the :func:`fcntl` function. + + If a mutable buffer is passed, then the behaviour is determined by the value of + the *mutate_flag* parameter. + + If it is false, the buffer's mutability is ignored and behaviour is as for a + read-only buffer, except that the 1024 byte limit mentioned above is avoided -- + so long as the buffer you pass is at least as long as what the operating system + wants to put there, things should work. + + If *mutate_flag* is true, then the buffer is (in effect) passed to the + underlying :func:`ioctl` system call, the latter's return code is passed back to + the calling Python code, and the buffer's new contents reflect the action of the + :func:`ioctl`. This is a slight simplification, because if the supplied buffer + is less than 1024 bytes long it is first copied into a static buffer 1024 bytes + long which is then passed to :func:`ioctl` and copied back into the supplied + buffer. + + If *mutate_flag* is not supplied, then from Python 2.5 it defaults to true, + which is a change from versions 2.3 and 2.4. Supply the argument explicitly if + version portability is a priority. + + An example:: + + >>> import array, fcntl, struct, termios, os + >>> os.getpgrp() + 13341 + >>> struct.unpack('h', fcntl.ioctl(0, termios.TIOCGPGRP, " "))[0] + 13341 + >>> buf = array.array('h', [0]) + >>> fcntl.ioctl(0, termios.TIOCGPGRP, buf, 1) + 0 + >>> buf + array('h', [13341]) + + +.. function:: flock(fd, op) + + Perform the lock operation *op* on file descriptor *fd* (file objects providing + a :meth:`fileno` method are accepted as well). See the Unix manual + :manpage:`flock(3)` for details. (On some systems, this function is emulated + using :cfunc:`fcntl`.) 
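A minimal sketch of how :func:`flock` might be used to serialize appends to a file, assuming a Unix system and a current Python 3 interpreter. The helper ``append_locked`` and the temporary demo file are hypothetical; ``LOCK_EX`` and ``LOCK_UN`` are the module's lock constants.

```python
import fcntl
import os
import tempfile


def append_locked(path, text):
    """Append text to path while holding an exclusive flock on the file."""
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # blocks until the lock is granted
        try:
            f.write(text)
            f.flush()                      # push data out before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)  # always release the lock


# Demonstration on a throwaway file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
append_locked(demo_path, "first\n")
append_locked(demo_path, "second\n")
with open(demo_path) as f:
    demo_contents = f.read()
os.remove(demo_path)
```

The file object is passed directly because, as noted above, objects providing a :meth:`fileno` method are accepted in place of an integer descriptor.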
+ + +.. function:: lockf(fd, operation, [length, [start, [whence]]]) + + This is essentially a wrapper around the :func:`fcntl` locking calls. *fd* is + the file descriptor of the file to lock or unlock, and *operation* is one of the + following values: + + * :const:`LOCK_UN` -- unlock + * :const:`LOCK_SH` -- acquire a shared lock + * :const:`LOCK_EX` -- acquire an exclusive lock + + When *operation* is :const:`LOCK_SH` or :const:`LOCK_EX`, it can also be + bit-wise OR'd with :const:`LOCK_NB` to avoid blocking on lock acquisition. + If :const:`LOCK_NB` is used and the lock cannot be acquired, an + :exc:`IOError` will be raised and the exception will have an *errno* + attribute set to :const:`EACCES` or :const:`EAGAIN` (depending on the + operating system; for portability, check for both values). On at least some + systems, :const:`LOCK_EX` can only be used if the file descriptor refers to a + file opened for writing. + + *length* is the number of bytes to lock, *start* is the byte offset at which the + lock starts, relative to *whence*, and *whence* is as with :func:`fileobj.seek`, + specifically: + + * :const:`0` -- relative to the start of the file (:const:`SEEK_SET`) + * :const:`1` -- relative to the current buffer position (:const:`SEEK_CUR`) + * :const:`2` -- relative to the end of the file (:const:`SEEK_END`) + + The default for *start* is 0, which means to start at the beginning of the file. + The default for *length* is 0 which means to lock to the end of the file. The + default for *whence* is also 0. + +Examples (all on a SVR4 compliant system):: + + import struct, fcntl, os + + f = open(...) + rv = fcntl.fcntl(f, fcntl.F_SETFL, os.O_NDELAY) + + lockdata = struct.pack('hhllhh', fcntl.F_WRLCK, 0, 0, 0, 0, 0) + rv = fcntl.fcntl(f, fcntl.F_SETLKW, lockdata) + +Note that in the first example the return value variable *rv* will hold an +integer value; in the second example it will hold a string value. 
The structure +lay-out for the *lockdata* variable is system dependent --- therefore using the +:func:`flock` call may be better. + + +.. seealso:: + + Module :mod:`os` + If the locking flags :const:`O_SHLOCK` and :const:`O_EXLOCK` are present + in the :mod:`os` module, the :func:`os.open` function provides a more + platform-independent alternative to the :func:`lockf` and :func:`flock` + functions. + diff --git a/Doc/library/filecmp.rst b/Doc/library/filecmp.rst new file mode 100644 index 0000000..6004214 --- /dev/null +++ b/Doc/library/filecmp.rst @@ -0,0 +1,152 @@ + +:mod:`filecmp` --- File and Directory Comparisons +================================================= + +.. module:: filecmp + :synopsis: Compare files efficiently. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`filecmp` module defines functions to compare files and directories, +with various optional time/correctness trade-offs. + +The :mod:`filecmp` module defines the following functions: + + +.. function:: cmp(f1, f2[, shallow]) + + Compare the files named *f1* and *f2*, returning ``True`` if they seem equal, + ``False`` otherwise. + + Unless *shallow* is given and is false, files with identical :func:`os.stat` + signatures are taken to be equal. + + Files that were compared using this function will not be compared again unless + their :func:`os.stat` signature changes. + + Note that no external programs are called from this function, giving it + portability and efficiency. + + +.. function:: cmpfiles(dir1, dir2, common[, shallow]) + + Returns three lists of file names: *match*, *mismatch*, *errors*. *match* + contains the list of files that match in both directories, *mismatch* includes the + names of those that don't, and *errors* lists the names of files which could not + be compared. Files may be listed in *errors* because the user lacks + permission to read them or for many other reasons; in every case, the comparison + could not be done. 
+ + The *common* parameter is a list of file names found in both directories. The + *shallow* parameter has the same meaning and default value as for + :func:`filecmp.cmp`. + +Example:: + + >>> import filecmp + >>> filecmp.cmp('undoc.rst', 'undoc.rst') + True + >>> filecmp.cmp('undoc.rst', 'index.rst') + False + + +.. _dircmp-objects: + +The :class:`dircmp` class +------------------------- + +:class:`dircmp` instances are built using this constructor: + + +.. class:: dircmp(a, b[, ignore[, hide]]) + + Construct a new directory comparison object, to compare the directories *a* and + *b*. *ignore* is a list of names to ignore, and defaults to ``['RCS', 'CVS', + 'tags']``. *hide* is a list of names to hide, and defaults to ``[os.curdir, + os.pardir]``. + +The :class:`dircmp` class provides the following methods: + + +.. method:: dircmp.report() + + Print (to ``sys.stdout``) a comparison between *a* and *b*. + + +.. method:: dircmp.report_partial_closure() + + Print a comparison between *a* and *b* and common immediate subdirectories. + + +.. method:: dircmp.report_full_closure() + + Print a comparison between *a* and *b* and common subdirectories (recursively). + +The :class:`dircmp` offers a number of interesting attributes that may be used +to get various bits of information about the directory trees being compared. + +Note that via :meth:`__getattr__` hooks, all attributes are computed lazily, so +there is no speed penalty if only those attributes which are lightweight to +compute are used. + + +.. attribute:: dircmp.left_list + + Files and subdirectories in *a*, filtered by *hide* and *ignore*. + + +.. attribute:: dircmp.right_list + + Files and subdirectories in *b*, filtered by *hide* and *ignore*. + + +.. attribute:: dircmp.common + + Files and subdirectories in both *a* and *b*. + + +.. attribute:: dircmp.left_only + + Files and subdirectories only in *a*. + + +.. attribute:: dircmp.right_only + + Files and subdirectories only in *b*. + + +.. 
attribute:: dircmp.common_dirs + + Subdirectories in both *a* and *b*. + + +.. attribute:: dircmp.common_files + + Files in both *a* and *b*. + + +.. attribute:: dircmp.common_funny + + Names in both *a* and *b*, such that the type differs between the directories, + or names for which :func:`os.stat` reports an error. + + +.. attribute:: dircmp.same_files + + Files which are identical in both *a* and *b*. + + +.. attribute:: dircmp.diff_files + + Files which are in both *a* and *b*, whose contents differ. + + +.. attribute:: dircmp.funny_files + + Files which are in both *a* and *b*, but could not be compared. + + +.. attribute:: dircmp.subdirs + + A dictionary mapping names in :attr:`common_dirs` to :class:`dircmp` objects. + diff --git a/Doc/library/fileformats.rst b/Doc/library/fileformats.rst new file mode 100644 index 0000000..c0c2eed --- /dev/null +++ b/Doc/library/fileformats.rst @@ -0,0 +1,18 @@ + +.. _fileformats: + +************ +File Formats +************ + +The modules described in this chapter parse various miscellaneous file formats +that are neither markup languages nor related to e-mail. + + +.. toctree:: + + csv.rst + configparser.rst + robotparser.rst + netrc.rst + xdrlib.rst diff --git a/Doc/library/fileinput.rst b/Doc/library/fileinput.rst new file mode 100644 index 0000000..d0a3ed9 --- /dev/null +++ b/Doc/library/fileinput.rst @@ -0,0 +1,183 @@ +:mod:`fileinput` --- Iterate over lines from multiple input streams +=================================================================== + +.. module:: fileinput + :synopsis: Loop over standard input or a list of files. +.. moduleauthor:: Guido van Rossum <guido@python.org> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +This module implements a helper class and functions to quickly write a loop over +standard input or a list of files. 
+ +The typical use is:: + + import fileinput + for line in fileinput.input(): + process(line) + +This iterates over the lines of all files listed in ``sys.argv[1:]``, defaulting +to ``sys.stdin`` if the list is empty. If a filename is ``'-'``, it is also +replaced by ``sys.stdin``. To specify an alternative list of filenames, pass it +as the first argument to :func:`input`. A single file name is also allowed. + +All files are opened in text mode by default, but you can override this by +specifying the *mode* parameter in the call to :func:`input` or +:class:`FileInput()`. If an I/O error occurs during opening or reading a file, +:exc:`IOError` is raised. + +If ``sys.stdin`` is used more than once, the second and further use will return +no lines, except perhaps for interactive use, or if it has been explicitly reset +(e.g. using ``sys.stdin.seek(0)``). + +Empty files are opened and immediately closed; the only time their presence in +the list of filenames is noticeable at all is when the last file opened is +empty. + +Lines are returned with any newlines intact, which means that the last line in +a file may not have one. + +You can control how files are opened by providing an opening hook via the +*openhook* parameter to :func:`fileinput.input` or :class:`FileInput()`. The +hook must be a function that takes two arguments, *filename* and *mode*, and +returns an accordingly opened file-like object. Two useful hooks are already +provided by this module. + +The following function is the primary interface of this module: + + +.. function:: input([files[, inplace[, backup[, mode[, openhook]]]]]) + + Create an instance of the :class:`FileInput` class. The instance will be used + as global state for the functions of this module, and is also returned to use + during iteration. The parameters to this function will be passed along to the + constructor of the :class:`FileInput` class. + + .. versionchanged:: 2.5 + Added the *mode* and *openhook* parameters. 
+ +The following functions use the global state created by :func:`fileinput.input`; +if there is no active state, :exc:`RuntimeError` is raised. + + +.. function:: filename() + + Return the name of the file currently being read. Before the first line has + been read, returns ``None``. + + +.. function:: fileno() + + Return the integer "file descriptor" for the current file. When no file is + opened (before the first line and between files), returns ``-1``. + + .. versionadded:: 2.5 + + +.. function:: lineno() + + Return the cumulative line number of the line that has just been read. Before + the first line has been read, returns ``0``. After the last line of the last + file has been read, returns the line number of that line. + + +.. function:: filelineno() + + Return the line number in the current file. Before the first line has been + read, returns ``0``. After the last line of the last file has been read, + returns the line number of that line within the file. + + +.. function:: isfirstline() + + Returns true if the line just read is the first line of its file, otherwise + returns false. + + +.. function:: isstdin() + + Returns true if the last line was read from ``sys.stdin``, otherwise returns + false. + + +.. function:: nextfile() + + Close the current file so that the next iteration will read the first line from + the next file (if any); lines not read from the file will not count towards the + cumulative line count. The filename is not changed until after the first line + of the next file has been read. Before the first line has been read, this + function has no effect; it cannot be used to skip the first file. After the + last line of the last file has been read, this function has no effect. + + +.. function:: close() + + Close the sequence. + +The class which implements the sequence behavior provided by the module is +available for subclassing as well: + + +.. 
class:: FileInput([files[, inplace[, backup[, mode[, openhook]]]]]) + + Class :class:`FileInput` is the implementation; its methods :meth:`filename`, + :meth:`fileno`, :meth:`lineno`, :meth:`filelineno`, :meth:`isfirstline`, + :meth:`isstdin`, :meth:`nextfile` and :meth:`close` correspond to the functions + of the same name in the module. In addition it has a :meth:`readline` method + which returns the next input line, and a :meth:`__getitem__` method which + implements the sequence behavior. The sequence must be accessed in strictly + sequential order; random access and :meth:`readline` cannot be mixed. + + With *mode* you can specify which file mode will be passed to :func:`open`. It + must be one of ``'r'``, ``'rU'``, ``'U'`` and ``'rb'``. + + The *openhook*, when given, must be a function that takes two arguments, + *filename* and *mode*, and returns an accordingly opened file-like object. You + cannot use *inplace* and *openhook* together. + + .. versionchanged:: 2.5 + Added the *mode* and *openhook* parameters. + +**Optional in-place filtering:** if the keyword argument ``inplace=1`` is passed +to :func:`fileinput.input` or to the :class:`FileInput` constructor, the file is +moved to a backup file and standard output is directed to the input file (if a +file of the same name as the backup file already exists, it will be replaced +silently). This makes it possible to write a filter that rewrites its input +file in place. If the *backup* parameter is given (typically as +``backup='.<some extension>'``), it specifies the extension for the backup file, +and the backup file remains around; by default, the extension is ``'.bak'`` and +it is deleted when the output file is closed. In-place filtering is disabled +when standard input is read. + +**Caveat:** The current implementation does not work for MS-DOS 8+3 filesystems. + +The two following opening hooks are provided by this module: + + +.. 
function:: hook_compressed(filename, mode) + + Transparently opens files compressed with gzip and bzip2 (recognized by the + extensions ``'.gz'`` and ``'.bz2'``) using the :mod:`gzip` and :mod:`bz2` + modules. If the filename extension is not ``'.gz'`` or ``'.bz2'``, the file is + opened normally (i.e., using :func:`open` without any decompression). + + Usage example: ``fi = fileinput.FileInput(openhook=fileinput.hook_compressed)`` + + .. versionadded:: 2.5 + + +.. function:: hook_encoded(encoding) + + Returns a hook which opens each file with :func:`codecs.open`, using the given + *encoding* to read the file. + + Usage example: ``fi = + fileinput.FileInput(openhook=fileinput.hook_encoded("iso-8859-1"))`` + + .. note:: + + With this hook, :class:`FileInput` might return Unicode strings depending on the + specified *encoding*. + + .. versionadded:: 2.5 + diff --git a/Doc/library/filesys.rst b/Doc/library/filesys.rst new file mode 100644 index 0000000..e5b5e44 --- /dev/null +++ b/Doc/library/filesys.rst @@ -0,0 +1,38 @@ + +.. _filesys: + +************************* +File and Directory Access +************************* + +The modules described in this chapter deal with disk files and directories. For +example, there are modules for reading the properties of files, manipulating +paths in a portable way, and creating temporary files. The full list of modules +in this chapter is: + + +.. toctree:: + + os.path.rst + fileinput.rst + stat.rst + statvfs.rst + filecmp.rst + tempfile.rst + glob.rst + fnmatch.rst + linecache.rst + shutil.rst + dircache.rst + macpath.rst + + +.. seealso:: + + Section :ref:`bltin-file-objects` + A description of Python's built-in file objects. + + Module :mod:`os` + Operating system interfaces, including functions to work with files at a lower + level than the built-in file object. 
+ diff --git a/Doc/library/fnmatch.rst b/Doc/library/fnmatch.rst new file mode 100644 index 0000000..244bad9 --- /dev/null +++ b/Doc/library/fnmatch.rst @@ -0,0 +1,91 @@ + +:mod:`fnmatch` --- Unix filename pattern matching +================================================= + +.. module:: fnmatch + :synopsis: Unix shell style filename pattern matching. + + +.. index:: single: filenames; wildcard expansion + +.. index:: module: re + +This module provides support for Unix shell-style wildcards, which are *not* the +same as regular expressions (which are documented in the :mod:`re` module). The +special characters used in shell-style wildcards are: + ++------------+------------------------------------+ +| Pattern | Meaning | ++============+====================================+ +| ``*`` | matches everything | ++------------+------------------------------------+ +| ``?`` | matches any single character | ++------------+------------------------------------+ +| ``[seq]`` | matches any character in *seq* | ++------------+------------------------------------+ +| ``[!seq]`` | matches any character not in *seq* | ++------------+------------------------------------+ + +.. index:: module: glob + +Note that the filename separator (``'/'`` on Unix) is *not* special to this +module. See module :mod:`glob` for pathname expansion (:mod:`glob` uses +:func:`fnmatch` to match pathname segments). Similarly, filenames starting with +a period are not special for this module, and are matched by the ``*`` and ``?`` +patterns. + + +.. function:: fnmatch(filename, pattern) + + Test whether the *filename* string matches the *pattern* string, returning true + or false. If the operating system is case-insensitive, then both parameters + will be normalized to all lower- or upper-case before the comparison is + performed. If you require a case-sensitive comparison regardless of whether + that's standard for your operating system, use :func:`fnmatchcase` instead. 
+ + This example will print all file names in the current directory with the + extension ``.txt``:: + + import fnmatch + import os + + for file in os.listdir('.'): + if fnmatch.fnmatch(file, '*.txt'): + print file + + +.. function:: fnmatchcase(filename, pattern) + + Test whether *filename* matches *pattern*, returning true or false; the + comparison is case-sensitive. + + +.. function:: filter(names, pattern) + + Return the subset of the list of *names* that match *pattern*. It is the same as + ``[n for n in names if fnmatch(n, pattern)]``, but implemented more efficiently. + + .. versionadded:: 2.2 + + +.. function:: translate(pattern) + + Return the shell-style *pattern* converted to a regular expression. + + Example:: + + >>> import fnmatch, re + >>> + >>> regex = fnmatch.translate('*.txt') + >>> regex + '.*\\.txt$' + >>> reobj = re.compile(regex) + >>> print reobj.match('foobar.txt') + <_sre.SRE_Match object at 0x...> + + +.. seealso:: + + Module :mod:`glob` + Unix shell-style path expansion. + diff --git a/Doc/library/formatter.rst b/Doc/library/formatter.rst new file mode 100644 index 0000000..2774a2b --- /dev/null +++ b/Doc/library/formatter.rst @@ -0,0 +1,350 @@ + +:mod:`formatter` --- Generic output formatting +============================================== + +.. module:: formatter + :synopsis: Generic output formatter and device interface. + + +.. index:: single: HTMLParser (class in htmllib) + +This module supports two interface definitions, each with multiple +implementations. The *formatter* interface is used by the :class:`HTMLParser` +class of the :mod:`htmllib` module, and the *writer* interface is required by +the formatter interface. + +Formatter objects transform an abstract flow of formatting events into specific +output events on writer objects. 
Formatters manage several stack structures to +allow various properties of a writer object to be changed and restored; writers +need not be able to handle relative changes nor any sort of "change back" +operation. Specific writer properties which may be controlled via formatter +objects are horizontal alignment, font, and left margin indentations. A +mechanism is provided which supports providing arbitrary, non-exclusive style +settings to a writer as well. Additional interfaces facilitate formatting +events which are not reversible, such as paragraph separation. + +Writer objects encapsulate device interfaces. Abstract devices, such as file +formats, are supported as well as physical devices. The provided +implementations all work with abstract devices. The interface makes available +mechanisms for setting the properties which formatter objects manage and +inserting data into the output. + + +.. _formatter-interface: + +The Formatter Interface +----------------------- + +Interfaces to create formatters are dependent on the specific formatter class +being instantiated. The interfaces described below are the required interfaces +which all formatters must support once initialized. + +One data element is defined at the module level: + + +.. data:: AS_IS + + Value which can be used in the font specification passed to the ``push_font()`` + method described below, or as the new value to any other ``push_property()`` + method. Pushing the ``AS_IS`` value allows the corresponding ``pop_property()`` + method to be called without having to track whether the property was changed. + +The following attributes are defined for formatter instance objects: + + +.. attribute:: formatter.writer + + The writer instance with which the formatter interacts. + + +.. method:: formatter.end_paragraph(blanklines) + + Close any open paragraphs and insert at least *blanklines* before the next + paragraph. + + +.. 
method:: formatter.add_line_break() + + Add a hard line break if one does not already exist. This does not break the + logical paragraph. + + +.. method:: formatter.add_hor_rule(*args, **kw) + + Insert a horizontal rule in the output. A hard break is inserted if there is + data in the current paragraph, but the logical paragraph is not broken. The + arguments and keywords are passed on to the writer's :meth:`send_hor_rule` + method. + + +.. method:: formatter.add_flowing_data(data) + + Provide data which should be formatted with collapsed whitespace. Whitespace + from preceding and successive calls to :meth:`add_flowing_data` is considered as + well when the whitespace collapse is performed. The data which is passed to + this method is expected to be word-wrapped by the output device. Note that any + word-wrapping still must be performed by the writer object due to the need to + rely on device and font information. + + +.. method:: formatter.add_literal_data(data) + + Provide data which should be passed to the writer unchanged. Whitespace, + including newline and tab characters, is considered legal in the value of + *data*. + + +.. method:: formatter.add_label_data(format, counter) + + Insert a label which should be placed to the left of the current left margin. + This should be used for constructing bulleted or numbered lists. If the + *format* value is a string, it is interpreted as a format specification for + *counter*, which should be an integer. The result of this formatting becomes the + value of the label; if *format* is not a string it is used as the label value + directly. The label value is passed as the only argument to the writer's + :meth:`send_label_data` method. Interpretation of non-string label values is + dependent on the associated writer. + + Format specifications are strings which, in combination with a counter value, + are used to compute label values. 
Each character in the format string is copied + to the label value, with some characters recognized to indicate a transform on + the counter value. Specifically, the character ``'1'`` represents the counter + value formatted as an Arabic number, the characters ``'A'`` and ``'a'`` + represent alphabetic representations of the counter value in upper and lower + case, respectively, and ``'I'`` and ``'i'`` represent the counter value in Roman + numerals, in upper and lower case. Note that the alphabetic and Roman + transforms require that the counter value be greater than zero. + + +.. method:: formatter.flush_softspace() + + Send any pending whitespace buffered from a previous call to + :meth:`add_flowing_data` to the associated writer object. This should be called + before any direct manipulation of the writer object. + + +.. method:: formatter.push_alignment(align) + + Push a new alignment setting onto the alignment stack. This may be + :const:`AS_IS` if no change is desired. If the alignment value is changed from + the previous setting, the writer's :meth:`new_alignment` method is called with + the *align* value. + + +.. method:: formatter.pop_alignment() + + Restore the previous alignment. + + +.. method:: formatter.push_font((size, italic, bold, teletype)) + + Change some or all font properties of the writer object. Properties which are + not set to :const:`AS_IS` are set to the values passed in while others are + maintained at their current settings. The writer's :meth:`new_font` method is + called with the fully resolved font specification. + + +.. method:: formatter.push_margin(margin) + + Increase the number of left margin indentations by one, associating the logical + tag *margin* with the new indentation. The initial margin level is ``0``. + Changed values of the logical tag must be true values; false values other than + :const:`AS_IS` are not sufficient to change the margin. + + +.. 
method:: formatter.pop_margin() + + Restore the previous margin. + + +.. method:: formatter.push_style(*styles) + + Push any number of arbitrary style specifications. All styles are pushed onto + the styles stack in order. A tuple representing the entire stack, including + :const:`AS_IS` values, is passed to the writer's :meth:`new_styles` method. + + +.. method:: formatter.pop_style([n=1]) + + Pop the last *n* style specifications passed to :meth:`push_style`. A tuple + representing the revised stack, including :const:`AS_IS` values, is passed to + the writer's :meth:`new_styles` method. + + +.. method:: formatter.set_spacing(spacing) + + Set the spacing style for the writer. + + +.. method:: formatter.assert_line_data([flag=1]) + + Inform the formatter that data has been added to the current paragraph + out-of-band. This should be used when the writer has been manipulated + directly. The optional *flag* argument can be set to false if the writer + manipulations produced a hard line break at the end of the output. + + +.. _formatter-impls: + +Formatter Implementations +------------------------- + +Two implementations of formatter objects are provided by this module. Most +applications may use one of these classes without modification or subclassing. + + +.. class:: NullFormatter([writer]) + + A formatter which does nothing. If *writer* is omitted, a :class:`NullWriter` + instance is created. No methods of the writer are called by + :class:`NullFormatter` instances. Implementations should inherit from this + class if implementing a writer interface but don't need to inherit any + implementation. + + +.. class:: AbstractFormatter(writer) + + The standard formatter. This implementation has demonstrated wide applicability + to many writers, and may be used directly in most circumstances. It has been + used to implement a full-featured World Wide Web browser. + + +.. 
_writer-interface: + +The Writer Interface +-------------------- + +Interfaces to create writers are dependent on the specific writer class being +instantiated. The interfaces described below are the required interfaces which +all writers must support once initialized. Note that while most applications can +use the :class:`AbstractFormatter` class as a formatter, the writer must +typically be provided by the application. + + +.. method:: writer.flush() + + Flush any buffered output or device control events. + + +.. method:: writer.new_alignment(align) + + Set the alignment style. The *align* value can be any object, but by convention + is a string or ``None``, where ``None`` indicates that the writer's "preferred" + alignment should be used. Conventional *align* values are ``'left'``, + ``'center'``, ``'right'``, and ``'justify'``. + + +.. method:: writer.new_font(font) + + Set the font style. The value of *font* will be ``None``, indicating that the + device's default font should be used, or a tuple of the form ``(``*size*, + *italic*, *bold*, *teletype*``)``. Size will be a string indicating the size of + font that should be used; specific strings and their interpretation must be + defined by the application. The *italic*, *bold*, and *teletype* values are + Boolean values specifying which of those font attributes should be used. + + +.. method:: writer.new_margin(margin, level) + + Set the margin level to the integer *level* and the logical tag to *margin*. + Interpretation of the logical tag is at the writer's discretion; the only + restriction on the value of the logical tag is that it not be a false value for + non-zero values of *level*. + + +.. method:: writer.new_spacing(spacing) + + Set the spacing style to *spacing*. + + +.. method:: writer.new_styles(styles) + + Set additional styles. The *styles* value is a tuple of arbitrary values; the + value :const:`AS_IS` should be ignored. 
The *styles* tuple may be interpreted + either as a set or as a stack depending on the requirements of the application + and writer implementation. + + +.. method:: writer.send_line_break() + + Break the current line. + + +.. method:: writer.send_paragraph(blankline) + + Produce a paragraph separation of at least *blankline* blank lines, or the + equivalent. The *blankline* value will be an integer. Note that the + implementation will receive a call to :meth:`send_line_break` before this call + if a line break is needed; this method should not include ending the last line + of the paragraph. It is only responsible for vertical spacing between + paragraphs. + + +.. method:: writer.send_hor_rule(*args, **kw) + + Display a horizontal rule on the output device. The arguments to this method + are entirely application- and writer-specific, and should be interpreted with + care. The method implementation may assume that a line break has already been + issued via :meth:`send_line_break`. + + +.. method:: writer.send_flowing_data(data) + + Output character data which may be word-wrapped and re-flowed as needed. Within + any sequence of calls to this method, the writer may assume that spans of + multiple whitespace characters have been collapsed to single space characters. + + +.. method:: writer.send_literal_data(data) + + Output character data which has already been formatted for display. Generally, + this should be interpreted to mean that line breaks indicated by newline + characters should be preserved and no new line breaks should be introduced. The + data may contain embedded newline and tab characters, unlike data provided to + the :meth:`send_formatted_data` interface. + + +.. method:: writer.send_label_data(data) + + Set *data* to the left of the current left margin, if possible. The value of + *data* is not restricted; treatment of non-string values is entirely + application- and writer-dependent. This method will only be called at the + beginning of a line. 
+ + +.. _writer-impls: + +Writer Implementations +---------------------- + +Three implementations of the writer object interface are provided as examples by +this module. Most applications will need to derive new writer classes from the +:class:`NullWriter` class. + + +.. class:: NullWriter() + + A writer which only provides the interface definition; no actions are taken on + any methods. This should be the base class for all writers which do not need to + inherit any implementation methods. + + +.. class:: AbstractWriter() + + A writer which can be used in debugging formatters, but not much else. Each + method simply announces itself by printing its name and arguments on standard + output. + + +.. class:: DumbWriter([file[, maxcol=72]]) + + Simple writer class which writes output on the file object passed in as *file* + or, if *file* is omitted, on standard output. The output is simply word-wrapped + to the number of columns specified by *maxcol*. This class is suitable for + reflowing a sequence of paragraphs. + diff --git a/Doc/library/fpectl.rst b/Doc/library/fpectl.rst new file mode 100644 index 0000000..ef030f0 --- /dev/null +++ b/Doc/library/fpectl.rst @@ -0,0 +1,120 @@ + +:mod:`fpectl` --- Floating point exception control +================================================== + +.. module:: fpectl + :platform: Unix + :synopsis: Provide control for floating point exception handling. +.. moduleauthor:: Lee Busby <busby1@llnl.gov> +.. sectionauthor:: Lee Busby <busby1@llnl.gov> + + +.. note:: + + The :mod:`fpectl` module is not built by default, and its usage is discouraged + and may be dangerous except in the hands of experts. See also the section + :ref:`fpectl-limitations` on limitations for more details. + +.. index:: single: IEEE-754 + +Most computers carry out floating point operations in conformance with the +so-called IEEE-754 standard. 
On any real computer, some floating point +operations produce results that cannot be expressed as a normal floating point +value. For example, try :: + + >>> import math + >>> math.exp(1000) + inf + >>> math.exp(1000) / math.exp(1000) + nan + +(The example above will work on many platforms. DEC Alpha may be one exception.) +"Inf" is a special, non-numeric value in IEEE-754 that stands for "infinity", +and "nan" means "not a number." Note that, other than the non-numeric results, +nothing special happened when you asked Python to carry out those calculations. +That is in fact the default behaviour prescribed in the IEEE-754 standard, and +if it works for you, stop reading now. + +In some circumstances, it would be better to raise an exception and stop +processing at the point where the faulty operation was attempted. The +:mod:`fpectl` module is for use in that situation. It provides control over +floating point units from several hardware manufacturers, allowing the user to +turn on the generation of :const:`SIGFPE` whenever any of the IEEE-754 +exceptions Division by Zero, Overflow, or Invalid Operation occurs. In tandem +with a pair of wrapper macros that are inserted into the C code comprising your +python system, :const:`SIGFPE` is trapped and converted into the Python +:exc:`FloatingPointError` exception. + +The :mod:`fpectl` module defines the following functions and may raise the given +exception: + + +.. function:: turnon_sigfpe() + + Turn on the generation of :const:`SIGFPE`, and set up an appropriate signal + handler. + + +.. function:: turnoff_sigfpe() + + Reset default handling of floating point exceptions. + + +.. exception:: FloatingPointError + + After :func:`turnon_sigfpe` has been executed, a floating point operation that + raises one of the IEEE-754 exceptions Division by Zero, Overflow, or Invalid + operation will in turn raise this standard Python exception. + + +.. 
_fpectl-example: + +Example +------- + +The following example demonstrates how to start up and test operation of the +:mod:`fpectl` module. :: + + >>> import fpectl + >>> import fpetest + >>> fpectl.turnon_sigfpe() + >>> fpetest.test() + overflow PASS + FloatingPointError: Overflow + + div by 0 PASS + FloatingPointError: Division by zero + [ more output from test elided ] + >>> import math + >>> math.exp(1000) + Traceback (most recent call last): + File "<stdin>", line 1, in ? + FloatingPointError: in math_1 + + +.. _fpectl-limitations: + +Limitations and other considerations +------------------------------------ + +Setting up a given processor to trap IEEE-754 floating point errors currently +requires custom code on a per-architecture basis. You may have to modify +:mod:`fpectl` to control your particular hardware. + +Conversion of an IEEE-754 exception to a Python exception requires that the +wrapper macros ``PyFPE_START_PROTECT`` and ``PyFPE_END_PROTECT`` be inserted +into your code in an appropriate fashion. Python itself has been modified to +support the :mod:`fpectl` module, but many other codes of interest to numerical +analysts have not. + +The :mod:`fpectl` module is not thread-safe. + + +.. seealso:: + + Some files in the source distribution may be interesting in learning more about + how this module operates. The include file :file:`Include/pyfpe.h` discusses the + implementation of this module at some length. :file:`Modules/fpetestmodule.c` + gives several examples of use. Many additional examples can be found in + :file:`Objects/floatobject.c`. + diff --git a/Doc/library/fpformat.rst b/Doc/library/fpformat.rst new file mode 100644 index 0000000..33655fb --- /dev/null +++ b/Doc/library/fpformat.rst @@ -0,0 +1,56 @@ + +:mod:`fpformat` --- Floating point conversions +============================================== + +.. module:: fpformat + :synopsis: General floating point formatting functions. +.. 
sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`fpformat` module defines functions for dealing with floating point +numbers representations in 100% pure Python. + +.. note:: + + This module is unneeded: everything here could be done via the ``%`` string + interpolation operator. + +The :mod:`fpformat` module defines the following functions and an exception: + + +.. function:: fix(x, digs) + + Format *x* as ``[-]ddd.ddd`` with *digs* digits after the point and at least one + digit before. If ``digs <= 0``, the decimal point is suppressed. + + *x* can be either a number or a string that looks like one. *digs* is an + integer. + + Return value is a string. + + +.. function:: sci(x, digs) + + Format *x* as ``[-]d.dddE[+-]ddd`` with *digs* digits after the point and + exactly one digit before. If ``digs <= 0``, one digit is kept and the point is + suppressed. + + *x* can be either a real number, or a string that looks like one. *digs* is an + integer. + + Return value is a string. + + +.. exception:: NotANumber + + Exception raised when a string passed to :func:`fix` or :func:`sci` as the *x* + parameter does not look like a number. This is a subclass of :exc:`ValueError` + when the standard exceptions are strings. The exception value is the improperly + formatted string that caused the exception to be raised. + +Example:: + + >>> import fpformat + >>> fpformat.fix(1.23, 1) + '1.2' + diff --git a/Doc/library/framework.rst b/Doc/library/framework.rst new file mode 100644 index 0000000..c665fb7 --- /dev/null +++ b/Doc/library/framework.rst @@ -0,0 +1,335 @@ + +:mod:`FrameWork` --- Interactive application framework +====================================================== + +.. module:: FrameWork + :platform: Mac + :synopsis: Interactive application framework. + + +The :mod:`FrameWork` module contains classes that together provide a framework +for an interactive Macintosh application. 
The programmer builds an application +by creating subclasses that override various methods of the base classes, +thereby implementing the functionality wanted. Overriding functionality can +often be done on various different levels, i.e. to handle clicks in a single +dialog window in a non-standard way it is not necessary to override the complete +event handling. + +Work on the :mod:`FrameWork` has pretty much stopped, now that :mod:`PyObjC` is +available for full Cocoa access from Python, and the documentation describes +only the most important functionality, and not in the most logical manner at +that. Examine the source or the examples for more details. The following are +some comments posted on the MacPython newsgroup about the strengths and +limitations of :mod:`FrameWork`: + + +.. epigraph:: + + The strong point of :mod:`FrameWork` is that it allows you to break into the + control-flow at many different places. :mod:`W`, for instance, uses a different + way to enable/disable menus and that plugs right in leaving the rest intact. + The weak points of :mod:`FrameWork` are that it has no abstract command + interface (but that shouldn't be difficult), that its dialog support is minimal + and that its control/toolbar support is non-existent. + +The :mod:`FrameWork` module defines the following functions: + + +.. function:: Application() + + An object representing the complete application. See below for a description of + the methods. The default :meth:`__init__` routine creates an empty window + dictionary and a menu bar with an apple menu. + + +.. function:: MenuBar() + + An object representing the menubar. This object is usually not created by the + user. + + +.. function:: Menu(bar, title[, after]) + + An object representing a menu. Upon creation you pass the ``MenuBar`` the menu + appears in, the *title* string and a position (1-based) *after* where the menu + should appear (default: at the end). + + +..
function:: MenuItem(menu, title[, shortcut, callback]) + + Create a menu item object. The arguments are the menu to create, the item title + string and optionally the keyboard shortcut and a callback routine. The callback + is called with the arguments menu-id, item number within menu (1-based), current + front window and the event record. + + Instead of a callable object the callback can also be a string. In this case + menu selection causes the lookup of a method in the topmost window and the + application. The method name is the callback string with ``'domenu_'`` + prepended. + + Calling the ``MenuBar`` :meth:`fixmenudimstate` method sets the correct dimming + for all menu items based on the current front window. + + +.. function:: Separator(menu) + + Add a separator to the end of a menu. + + +.. function:: SubMenu(menu, label) + + Create a submenu named *label* under menu *menu*. The menu object is returned. + + +.. function:: Window(parent) + + Creates a (modeless) window. *Parent* is the application object to which the + window belongs. The window is not displayed until later. + + +.. function:: DialogWindow(parent) + + Creates a modeless dialog window. + + +.. function:: windowbounds(width, height) + + Return a ``(left, top, right, bottom)`` tuple suitable for creation of a window + of given width and height. The window will be staggered with respect to previous + windows, and an attempt is made to keep the whole window on-screen. However, the + window will always be the exact size given, so parts may be offscreen. + + +.. function:: setwatchcursor() + + Set the mouse cursor to a watch. + + +.. function:: setarrowcursor() + + Set the mouse cursor to an arrow. + + +.. _application-objects: + +Application Objects +------------------- + +Application objects have the following methods, among others: + + +.. method:: Application.makeusermenus() + + Override this method if you need menus in your application.
Append the menus to + the attribute :attr:`menubar`. + + +.. method:: Application.getabouttext() + + Override this method to return a text string describing your application. + Alternatively, override the :meth:`do_about` method for more elaborate "about" + messages. + + +.. method:: Application.mainloop([mask[, wait]]) + + This routine is the main event loop; call it to set your application rolling. + *Mask* is the mask of events you want to handle, *wait* is the number of ticks + you want to leave to other concurrent applications (default 0, which is probably + not a good idea). While raising *self* to exit the mainloop is still supported, + it is not recommended: call ``self._quit()`` instead. + + The event loop is split into many small parts, each of which can be overridden. + The default methods take care of dispatching events to windows and dialogs, + handling drags and resizes, Apple Events, events for non-FrameWork windows, etc. + + In general, all event handlers should return ``1`` if the event is fully handled + and ``0`` otherwise (because the front window was not a FrameWork window, for + instance). This is needed so that update events and such can be passed on to + other windows like the Sioux console window. Calling :func:`MacOS.HandleEvent` + is not allowed within *our_dispatch* or its callees, since this may result in an + infinite loop if the code is called through the Python inner-loop event handler. + + +.. method:: Application.asyncevents(onoff) + + Call this method with a nonzero parameter to enable asynchronous event handling. + This will tell the inner interpreter loop to call the application event handler + *async_dispatch* whenever events are available. This will cause FrameWork window + updates and the user interface to remain working during long computations, but + will slow the interpreter down and may cause surprising results in non-reentrant + code (such as FrameWork itself).
By default *async_dispatch* will immediately + call *our_dispatch* but you may override this to handle only certain events + asynchronously. Events you do not handle will be passed to Sioux and such. + + The old on/off value is returned. + + +.. method:: Application._quit() + + Terminate the running :meth:`mainloop` call at the next convenient moment. + + +.. method:: Application.do_char(c, event) + + The user typed character *c*. The complete details of the event can be found in + the *event* structure. This method can also be provided in a ``Window`` object, + which overrides the application-wide handler if the window is frontmost. + + +.. method:: Application.do_dialogevent(event) + + Called early in the event loop to handle modeless dialog events. The default + method simply dispatches the event to the relevant dialog (not through the + ``DialogWindow`` object involved). Override if you need special handling of + dialog events (keyboard shortcuts, etc). + + +.. method:: Application.idle(event) + + Called by the main event loop when no events are available. The null-event is + passed (so you can look at mouse position, etc). + + +.. _window-objects: + +Window Objects +-------------- + +Window objects have the following methods, among others: + + +.. method:: Window.open() + + Override this method to open a window. Store the MacOS window-id in + :attr:`self.wid` and call the :meth:`do_postopen` method to register the window + with the parent application. + + +.. method:: Window.close() + + Override this method to do any special processing on window close. Call the + :meth:`do_postclose` method to cleanup the parent state. + + +.. method:: Window.do_postresize(width, height, macoswindowid) + + Called after the window is resized. Override if more needs to be done than + calling ``InvalRect``. + + +.. method:: Window.do_contentclick(local, modifiers, event) + + The user clicked in the content part of a window. 
The arguments are the + coordinates (window-relative), the key modifiers and the raw event. + + +.. method:: Window.do_update(macoswindowid, event) + + An update event for the window was received. Redraw the window. + + +.. method:: Window.do_activate(activate, event) + + The window was activated (``activate == 1``) or deactivated (``activate == 0``). + Handle things like focus highlighting, etc. + + +.. _controlswindow-object: + +ControlsWindow Object +--------------------- + +ControlsWindow objects have the following methods besides those of ``Window`` +objects: + + +.. method:: ControlsWindow.do_controlhit(window, control, pcode, event) + + Part *pcode* of control *control* was hit by the user. Tracking and such has + already been taken care of. + + +.. _scrolledwindow-object: + +ScrolledWindow Object +--------------------- + +ScrolledWindow objects are ControlsWindow objects with the following extra +methods: + + +.. method:: ScrolledWindow.scrollbars([wantx[, wanty]]) + + Create (or destroy) horizontal and vertical scrollbars. The arguments specify + which you want (default: both). The scrollbars always have minimum ``0`` and + maximum ``32767``. + + +.. method:: ScrolledWindow.getscrollbarvalues() + + You must supply this method. It should return a tuple ``(x, y)`` giving the + current position of the scrollbars (between ``0`` and ``32767``). You can return + ``None`` for either to indicate the whole document is visible in that direction. + + +.. method:: ScrolledWindow.updatescrollbars() + + Call this method when the document has changed. It will call + :meth:`getscrollbarvalues` and update the scrollbars. + + +.. method:: ScrolledWindow.scrollbar_callback(which, what, value) + + Supplied by you and called after user interaction. *which* will be ``'x'`` or + ``'y'``, *what* will be ``'-'``, ``'--'``, ``'set'``, ``'++'`` or ``'+'``. For + ``'set'``, *value* will contain the new scrollbar position. + + +.. 
method:: ScrolledWindow.scalebarvalues(absmin, absmax, curmin, curmax) + + Auxiliary method to help you calculate values to return from + :meth:`getscrollbarvalues`. You pass document minimum and maximum value and + topmost (leftmost) and bottommost (rightmost) visible values and it returns the + correct number or ``None``. + + +.. method:: ScrolledWindow.do_activate(onoff, event) + + Takes care of dimming/highlighting scrollbars when a window becomes frontmost. + If you override this method, call this one at the end of your method. + + +.. method:: ScrolledWindow.do_postresize(width, height, window) + + Moves scrollbars to the correct position. Call this method initially if you + override it. + + +.. method:: ScrolledWindow.do_controlhit(window, control, pcode, event) + + Handles scrollbar interaction. If you override it call this method first, a + nonzero return value indicates the hit was in the scrollbars and has been + handled. + + +.. _dialogwindow-objects: + +DialogWindow Objects +-------------------- + +DialogWindow objects have the following methods besides those of ``Window`` +objects: + + +.. method:: DialogWindow.open(resid) + + Create the dialog window, from the DLOG resource with id *resid*. The dialog + object is stored in :attr:`self.wid`. + + +.. method:: DialogWindow.do_itemhit(item, event) + + Item number *item* was hit. You are responsible for redrawing toggle buttons, + etc. + diff --git a/Doc/library/frameworks.rst b/Doc/library/frameworks.rst new file mode 100644 index 0000000..5d8dad5 --- /dev/null +++ b/Doc/library/frameworks.rst @@ -0,0 +1,18 @@ + +.. _frameworks: + +****************** +Program Frameworks +****************** + +The modules described in this chapter are frameworks that will largely dictate +the structure of your program. Currently the modules described here are all +oriented toward writing command-line interfaces. + +The full list of modules described in this chapter is: + + +.. 
toctree:: + + cmd.rst + shlex.rst diff --git a/Doc/library/ftplib.rst b/Doc/library/ftplib.rst new file mode 100644 index 0000000..60e88cf --- /dev/null +++ b/Doc/library/ftplib.rst @@ -0,0 +1,320 @@ + +:mod:`ftplib` --- FTP protocol client +===================================== + +.. module:: ftplib + :synopsis: FTP protocol client (requires sockets). + + +.. index:: + pair: FTP; protocol + single: FTP; ftplib (standard module) + +This module defines the class :class:`FTP` and a few related items. The +:class:`FTP` class implements the client side of the FTP protocol. You can use +this to write Python programs that perform a variety of automated FTP jobs, such +as mirroring other FTP servers. It is also used by the module :mod:`urllib` to +handle URLs that use FTP. For more information on FTP (File Transfer Protocol), +see Internet :rfc:`959`. + +Here's a sample session using the :mod:`ftplib` module:: + + >>> from ftplib import FTP + >>> ftp = FTP('ftp.cwi.nl') # connect to host, default port + >>> ftp.login() # user anonymous, passwd anonymous@ + >>> ftp.retrlines('LIST') # list directory contents + total 24418 + drwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 . + dr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 .. + -rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX + . + . + . + >>> ftp.retrbinary('RETR README', open('README', 'wb').write) + '226 Transfer complete.' + >>> ftp.quit() + +The module defines the following items: + + +.. class:: FTP([host[, user[, passwd[, acct[, timeout]]]]]) + + Return a new instance of the :class:`FTP` class. When *host* is given, the + method call ``connect(host)`` is made. When *user* is given, additionally the + method call ``login(user, passwd, acct)`` is made (where *passwd* and *acct* + default to the empty string when not given). The optional *timeout* parameter + specifies a timeout in seconds for the connection attempt (if it is not specified, + or passed as None, the global default timeout setting will be used). + + ..
versionchanged:: 2.6 + *timeout* was added. + + +.. data:: all_errors + + The set of all exceptions (as a tuple) that methods of :class:`FTP` instances + may raise as a result of problems with the FTP connection (as opposed to + programming errors made by the caller). This set includes the four exceptions + listed below as well as :exc:`socket.error` and :exc:`IOError`. + + +.. exception:: error_reply + + Exception raised when an unexpected reply is received from the server. + + +.. exception:: error_temp + + Exception raised when an error code in the range 400--499 is received. + + +.. exception:: error_perm + + Exception raised when an error code in the range 500--599 is received. + + +.. exception:: error_proto + + Exception raised when a reply is received from the server that does not begin + with a digit in the range 1--5. + + +.. seealso:: + + Module :mod:`netrc` + Parser for the :file:`.netrc` file format. The file :file:`.netrc` is typically + used by FTP clients to load user authentication information before prompting the + user. + + .. index:: single: ftpmirror.py + + The file :file:`Tools/scripts/ftpmirror.py` in the Python source distribution is + a script that can mirror FTP sites, or portions thereof, using the :mod:`ftplib` + module. It can be used as an extended example that applies this module. + + +.. _ftp-objects: + +FTP Objects +----------- + +Several methods are available in two flavors: one for handling text files and +another for binary files. These are named for the command which is used +followed by ``lines`` for the text version or ``binary`` for the binary version. + +:class:`FTP` instances have the following methods: + + +.. method:: FTP.set_debuglevel(level) + + Set the instance's debugging level. This controls the amount of debugging + output printed. The default, ``0``, produces no debugging output. A value of + ``1`` produces a moderate amount of debugging output, generally a single line + per request. 
A value of ``2`` or higher produces the maximum amount of + debugging output, logging each line sent and received on the control connection. + + +.. method:: FTP.connect(host[, port[, timeout]]) + + Connect to the given host and port. The default port number is ``21``, as + specified by the FTP protocol specification. It is rarely needed to specify a + different port number. This function should be called only once for each + instance; it should not be called at all if a host was given when the instance + was created. All other methods can only be used after a connection has been + made. + + The optional *timeout* parameter specifies a timeout in seconds for the + connection attempt. If it is not specified, or passed as None, the object timeout + is used (the timeout that you passed when instantiating the class); if the + object timeout is also None, the global default timeout setting will be used. + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. method:: FTP.getwelcome() + + Return the welcome message sent by the server in reply to the initial + connection. (This message sometimes contains disclaimers or help information + that may be relevant to the user.) + + +.. method:: FTP.login([user[, passwd[, acct]]]) + + Log in as the given *user*. The *passwd* and *acct* parameters are optional and + default to the empty string. If no *user* is specified, it defaults to + ``'anonymous'``. If *user* is ``'anonymous'``, the default *passwd* is + ``'anonymous@'``. This function should be called only once for each instance, + after a connection has been established; it should not be called at all if a + host and user were given when the instance was created. Most FTP commands are + only allowed after the client has logged in. + + +.. method:: FTP.abort() + + Abort a file transfer that is in progress. Using this does not always work, but + it's worth a try. + + +..
method:: FTP.sendcmd(command) + + Send a simple command string to the server and return the response string. + + +.. method:: FTP.voidcmd(command) + + Send a simple command string to the server and handle the response. Return + nothing if a response code in the range 200--299 is received. Raise an exception + otherwise. + + +.. method:: FTP.retrbinary(command, callback[, maxblocksize[, rest]]) + + Retrieve a file in binary transfer mode. *command* should be an appropriate + ``RETR`` command: ``'RETR filename'``. The *callback* function is called for + each block of data received, with a single string argument giving the data + block. The optional *maxblocksize* argument specifies the maximum chunk size to + read on the low-level socket object created to do the actual transfer (which + will also be the largest size of the data blocks passed to *callback*). A + reasonable default is chosen. *rest* means the same thing as in the + :meth:`transfercmd` method. + + +.. method:: FTP.retrlines(command[, callback]) + + Retrieve a file or directory listing in ASCII transfer mode. *command* should be + an appropriate ``RETR`` command (see :meth:`retrbinary`) or a ``LIST`` command + (usually just the string ``'LIST'``). The *callback* function is called for + each line, with the trailing CRLF stripped. The default *callback* prints the + line to ``sys.stdout``. + + +.. method:: FTP.set_pasv(boolean) + + Enable "passive" mode if *boolean* is true, otherwise disable passive mode. (In + Python 2.0 and before, passive mode was off by default; in Python 2.1 and later, + it is on by default.) + + +.. method:: FTP.storbinary(command, file[, blocksize]) + + Store a file in binary transfer mode. *command* should be an appropriate + ``STOR`` command: ``"STOR filename"``. *file* is an open file object which is + read until EOF using its :meth:`read` method in blocks of size *blocksize* to + provide the data to be stored. The *blocksize* argument defaults to 8192. + + ..
versionchanged:: 2.1 + default for *blocksize* added. + + +.. method:: FTP.storlines(command, file) + + Store a file in ASCII transfer mode. *command* should be an appropriate + ``STOR`` command (see :meth:`storbinary`). Lines are read until EOF from the + open file object *file* using its :meth:`readline` method to provide the data to + be stored. + + +.. method:: FTP.transfercmd(cmd[, rest]) + + Initiate a transfer over the data connection. If the transfer is active, send an + ``EPRT`` or ``PORT`` command and the transfer command specified by *cmd*, and + accept the connection. If the server is passive, send an ``EPSV`` or ``PASV`` + command, connect to it, and start the transfer command. Either way, return the + socket for the connection. + + If optional *rest* is given, a ``REST`` command is sent to the server, passing + *rest* as an argument. *rest* is usually a byte offset into the requested file, + telling the server to restart sending the file's bytes at the requested offset, + skipping over the initial bytes. Note however that RFC 959 requires only that + *rest* be a string containing characters in the printable range from ASCII code + 33 to ASCII code 126. The :meth:`transfercmd` method, therefore, converts + *rest* to a string, but no check is performed on the string's contents. If the + server does not recognize the ``REST`` command, an :exc:`error_reply` exception + will be raised. If this happens, simply call :meth:`transfercmd` without a + *rest* argument. + + +.. method:: FTP.ntransfercmd(cmd[, rest]) + + Like :meth:`transfercmd`, but returns a tuple of the data connection and the + expected size of the data. If the expected size could not be computed, ``None`` + will be returned as the expected size. *cmd* and *rest* mean the same thing as + in :meth:`transfercmd`. + + +.. method:: FTP.nlst(argument[, ...]) + + Return a list of files as returned by the ``NLST`` command.
The optional + *argument* is a directory to list (default is the current server directory). + Multiple arguments can be used to pass non-standard options to the ``NLST`` + command. + + +.. method:: FTP.dir(argument[, ...]) + + Produce a directory listing as returned by the ``LIST`` command, printing it to + standard output. The optional *argument* is a directory to list (default is the + current server directory). Multiple arguments can be used to pass non-standard + options to the ``LIST`` command. If the last argument is a function, it is used + as a *callback* function as for :meth:`retrlines`; the default prints to + ``sys.stdout``. This method returns ``None``. + + +.. method:: FTP.rename(fromname, toname) + + Rename file *fromname* on the server to *toname*. + + +.. method:: FTP.delete(filename) + + Remove the file named *filename* from the server. If successful, returns the + text of the response, otherwise raises :exc:`error_perm` on permission errors or + :exc:`error_reply` on other errors. + + +.. method:: FTP.cwd(pathname) + + Set the current directory on the server. + + +.. method:: FTP.mkd(pathname) + + Create a new directory on the server. + + +.. method:: FTP.pwd() + + Return the pathname of the current directory on the server. + + +.. method:: FTP.rmd(dirname) + + Remove the directory named *dirname* on the server. + + +.. method:: FTP.size(filename) + + Request the size of the file named *filename* on the server. On success, the + size of the file is returned as an integer, otherwise ``None`` is returned. + Note that the ``SIZE`` command is not standardized, but is supported by many + common server implementations. + + +.. method:: FTP.quit() + + Send a ``QUIT`` command to the server and close the connection. This is the + "polite" way to close a connection, but it may raise an exception if the server + responds with an error to the ``QUIT`` command. 
This implies a call to the + :meth:`close` method which renders the :class:`FTP` instance useless for + subsequent calls (see below). + + +.. method:: FTP.close() + + Close the connection unilaterally. This should not be applied to an already + closed connection such as after a successful call to :meth:`quit`. After this + call the :class:`FTP` instance should not be used any more (after a call to + :meth:`close` or :meth:`quit` you cannot reopen the connection by issuing + another :meth:`login` method). + diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst new file mode 100644 index 0000000..b0a5577c --- /dev/null +++ b/Doc/library/functions.rst @@ -0,0 +1,1138 @@ + +.. _built-in-funcs: + +Built-in Functions +================== + +The Python interpreter has a number of functions built into it that are always +available. They are listed here in alphabetical order. + + +.. function:: __import__(name[, globals[, locals[, fromlist[, level]]]]) + + .. index:: + statement: import + module: ihooks + module: rexec + module: imp + + .. note:: + + This is an advanced function that is not needed in everyday Python + programming. + + The function is invoked by the :keyword:`import` statement. It mainly exists + so that you can replace it with another function that has a compatible + interface, in order to change the semantics of the :keyword:`import` statement. + For examples of why and how you would do this, see the standard library modules + :mod:`ihooks` and :mod:`rexec`. See also the built-in module :mod:`imp`, which + defines some useful operations out of which you can build your own + :func:`__import__` function. + + For example, the statement ``import spam`` results in the following call: + ``__import__('spam',`` ``globals(),`` ``locals(), [], -1)``; the statement + ``from spam.ham import eggs`` results in ``__import__('spam.ham', globals(), + locals(), ['eggs'], -1)``. 
Note that even though ``locals()`` and ``['eggs']`` + are passed in as arguments, the :func:`__import__` function does not set the + local variable named ``eggs``; this is done by subsequent code that is generated + for the import statement. (In fact, the standard implementation does not use + its *locals* argument at all, and uses its *globals* only to determine the + package context of the :keyword:`import` statement.) + + When the *name* variable is of the form ``package.module``, normally, the + top-level package (the name up till the first dot) is returned, *not* the + module named by *name*. However, when a non-empty *fromlist* argument is + given, the module named by *name* is returned. This is done for + compatibility with the bytecode generated for the different kinds of import + statement; when using ``import spam.ham.eggs``, the top-level package + :mod:`spam` must be placed in the importing namespace, but when using ``from + spam.ham import eggs``, the ``spam.ham`` subpackage must be used to find the + ``eggs`` variable. As a workaround for this behavior, use :func:`getattr` to + extract the desired components. For example, you could define the following + helper:: + + def my_import(name): + mod = __import__(name) + components = name.split('.') + for comp in components[1:]: + mod = getattr(mod, comp) + return mod + + *level* specifies whether to use absolute or relative imports. The default is + ``-1`` which indicates both absolute and relative imports will be attempted. + ``0`` means only perform absolute imports. Positive values for *level* indicate + the number of parent directories to search relative to the directory of the + module calling :func:`__import__`. + + .. versionchanged:: 2.5 + The level parameter was added. + + .. versionchanged:: 2.5 + Keyword support for parameters was added. + + +.. function:: abs(x) + + Return the absolute value of a number. The argument may be a plain or long + integer or a floating point number. 
If the argument is a complex number, its + magnitude is returned. + + +.. function:: all(iterable) + + Return True if all elements of the *iterable* are true. Equivalent to:: + + def all(iterable): + for element in iterable: + if not element: + return False + return True + + .. versionadded:: 2.5 + + +.. function:: any(iterable) + + Return True if any element of the *iterable* is true. Equivalent to:: + + def any(iterable): + for element in iterable: + if element: + return True + return False + + .. versionadded:: 2.5 + + +.. function:: basestring() + + This abstract type is the superclass for :class:`str`. It + cannot be called or instantiated, but it can be used to test whether an object + is an instance of :class:`str` (or a user-defined type inherited from + :class:`basestring`). + + .. versionadded:: 2.3 + + +.. function:: bin(x) + + Convert an integer number to a binary string. The result is a valid Python + expression. If *x* is not a Python :class:`int` object, it has to define an + :meth:`__index__` method that returns an integer. + + .. versionadded:: 3.0 + + +.. function:: bool([x]) + + Convert a value to a Boolean, using the standard truth testing procedure. If + *x* is false or omitted, this returns :const:`False`; otherwise it returns + :const:`True`. :class:`bool` is also a class, which is a subclass of + :class:`int`. Class :class:`bool` cannot be subclassed further. Its only + instances are :const:`False` and :const:`True`. + + .. index:: pair: Boolean; type + + .. versionadded:: 2.2.1 + + .. versionchanged:: 2.3 + If no argument is given, this function returns :const:`False`. + + +.. function:: chr(i) + + Return the string of one character whose Unicode codepoint is the integer *i*. For + example, ``chr(97)`` returns the string ``'a'``. This is the inverse of + :func:`ord`. The valid range for the argument depends on how Python was + configured -- it may be either UCS2 [0..0xFFFF] or UCS4 [0..0x10FFFF]. 
+ :exc:`ValueError` will be raised if *i* is outside that range. + + +.. function:: classmethod(function) + + Return a class method for *function*. + + A class method receives the class as implicit first argument, just like an + instance method receives the instance. To declare a class method, use this + idiom:: + + class C: + @classmethod + def f(cls, arg1, arg2, ...): ... + + The ``@classmethod`` form is a function decorator -- see the description of + function definitions in :ref:`function` for details. + + It can be called either on the class (such as ``C.f()``) or on an instance (such + as ``C().f()``). The instance is ignored except for its class. If a class + method is called for a derived class, the derived class object is passed as the + implied first argument. + + Class methods are different than C++ or Java static methods. If you want those, + see :func:`staticmethod` in this section. + + For more information on class methods, consult the documentation on the standard + type hierarchy in :ref:`types`. + + .. versionadded:: 2.2 + + .. versionchanged:: 2.4 + Function decorator syntax added. + + +.. function:: cmp(x, y) + + Compare the two objects *x* and *y* and return an integer according to the + outcome. The return value is negative if ``x < y``, zero if ``x == y`` and + strictly positive if ``x > y``. + + +.. function:: compile(source, filename, mode[, flags[, dont_inherit]]) + + Compile the *source* into a code object. Code objects can be executed by a call + to :func:`exec` or evaluated by a call to :func:`eval`. The *filename* argument + should give the file from which the code was read; pass some recognizable value + if it wasn't read from a file (``'<string>'`` is commonly used). 
The *mode* + argument specifies what kind of code must be compiled; it can be ``'exec'`` if + *source* consists of a sequence of statements, ``'eval'`` if it consists of a + single expression, or ``'single'`` if it consists of a single interactive + statement (in the latter case, expression statements that evaluate to something + other than ``None`` will be printed). + + When compiling multi-line statements, two caveats apply: line endings must be + represented by a single newline character (``'\n'``), and the input must be + terminated by at least one newline character. If line endings are represented + by ``'\r\n'``, use the string :meth:`replace` method to change them into + ``'\n'``. + + The optional arguments *flags* and *dont_inherit* (which are new in Python 2.2) + control which future statements (see :pep:`236`) affect the compilation of + *source*. If neither is present (or both are zero) the code is compiled with + those future statements that are in effect in the code that is calling compile. + If the *flags* argument is given and *dont_inherit* is not (or is zero) then the + future statements specified by the *flags* argument are used in addition to + those that would be used anyway. If *dont_inherit* is a non-zero integer then + only the future statements specified by *flags* are used -- the future statements in effect around the call to + compile are ignored. + + Future statements are specified by bits which can be bitwise or-ed together to + specify multiple statements. The bitfield required to specify a given feature + can be found as the :attr:`compiler_flag` attribute on the :class:`_Feature` + instance in the :mod:`__future__` module. + + +.. function:: complex([real[, imag]]) + + Create a complex number with the value *real* + *imag*\*j or convert a string or + number to a complex number. If the first parameter is a string, it will be + interpreted as a complex number and the function must be called without a second + parameter. The second parameter can never be a string. 
Each argument may be any + numeric type (including complex). If *imag* is omitted, it defaults to zero and + the function serves as a numeric conversion function like :func:`int`, + :func:`long` and :func:`float`. If both arguments are omitted, returns ``0j``. + + The complex type is described in :ref:`typesnumeric`. + + +.. function:: delattr(object, name) + + This is a relative of :func:`setattr`. The arguments are an object and a + string. The string must be the name of one of the object's attributes. The + function deletes the named attribute, provided the object allows it. For + example, ``delattr(x, 'foobar')`` is equivalent to ``del x.foobar``. + + +.. function:: dict([arg]) + :noindex: + + Create a new data dictionary, optionally with items taken from *arg*. + The dictionary type is described in :ref:`typesmapping`. + + For other containers see the built in :class:`list`, :class:`set`, and + :class:`tuple` classes, and the :mod:`collections` module. + + +.. function:: dir([object]) + + Without arguments, return the list of names in the current local scope. With an + argument, attempt to return a list of valid attributes for that object. + + If the object has a method named :meth:`__dir__`, this method will be called and + must return the list of attributes. This allows objects that implement a custom + :func:`__getattr__` or :func:`__getattribute__` function to customize the way + :func:`dir` reports their attributes. + + If the object does not provide :meth:`__dir__`, the function tries its best to + gather information from the object's :attr:`__dict__` attribute, if defined, and + from its type object. The resulting list is not necessarily complete, and may + be inaccurate when the object has a custom :func:`__getattr__`. 
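As a rough illustration of why :meth:`__dir__` matters for objects with a custom :func:`__getattr__`, the following sketch defines a hypothetical ``Proxy`` class (the class and its ``_target`` attribute are assumptions made for this example, not part of the standard library). Without :meth:`__dir__`, :func:`dir` would not report the names that the proxy forwards to the wrapped object:

```python
class Proxy:
    """Hypothetical wrapper that forwards attribute access to a target object."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Called only when normal lookup fails; delegate to the wrapped object.
        return getattr(self._target, name)

    def __dir__(self):
        # Report the wrapped object's names as well, so that dir() lists
        # the attributes that can actually be reached through the proxy.
        own = dir(type(self)) + list(self.__dict__)
        return sorted(set(own + dir(self._target)))


p = Proxy([1, 2, 3])
names = dir(p)
print('append' in names)   # a list method reachable through the proxy
print('_target' in names)  # the proxy's own instance attribute
```

Whether to merge the wrapped object's names, as done here, or to list only a curated subset is a design choice left to the class author.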
+ + The default :func:`dir` mechanism behaves differently with different types of + objects, as it attempts to produce the most relevant, rather than complete, + information: + + * If the object is a module object, the list contains the names of the module's + attributes. + + * If the object is a type or class object, the list contains the names of its + attributes, and recursively of the attributes of its bases. + + * Otherwise, the list contains the object's attributes' names, the names of its + class's attributes, and recursively of the attributes of its class's base + classes. + + The resulting list is sorted alphabetically. For example:: + + >>> import struct + >>> dir() + ['__builtins__', '__doc__', '__name__', 'struct'] + >>> dir(struct) + ['__doc__', '__name__', 'calcsize', 'error', 'pack', 'unpack'] + >>> class Foo(object): + ... def __dir__(self): + ... return ["kan", "ga", "roo"] + ... + >>> f = Foo() + >>> dir(f) + ['ga', 'kan', 'roo'] + + .. note:: + + Because :func:`dir` is supplied primarily as a convenience for use at an + interactive prompt, it tries to supply an interesting set of names more than it + tries to supply a rigorously or consistently defined set of names, and its + detailed behavior may change across releases. + + +.. function:: divmod(a, b) + + Take two (non complex) numbers as arguments and return a pair of numbers + consisting of their quotient and remainder when using long division. With mixed + operand types, the rules for binary arithmetic operators apply. For plain and + long integers, the result is the same as ``(a // b, a % b)``. For floating point + numbers the result is ``(q, a % b)``, where *q* is usually ``math.floor(a / b)`` + but may be 1 less than that. In any case ``q * b + a % b`` is very close to + *a*, if ``a % b`` is non-zero it has the same sign as *b*, and ``0 <= abs(a % b) + < abs(b)``. + + .. versionchanged:: 2.3 + Using :func:`divmod` with complex numbers is deprecated. + + +.. 
function:: enumerate(iterable) + + Return an enumerate object. *iterable* must be a sequence, an iterator, or some + other object which supports iteration. The :meth:`__next__` method of the + iterator returned by :func:`enumerate` returns a tuple containing a count (from + zero) and the corresponding value obtained from iterating over *iterable*. + :func:`enumerate` is useful for obtaining an indexed series: ``(0, seq[0])``, + ``(1, seq[1])``, ``(2, seq[2])``, .... For example:: + + >>> for i, season in enumerate(['Spring', 'Summer', 'Fall', 'Winter']): + ... print(i, season) + 0 Spring + 1 Summer + 2 Fall + 3 Winter + + .. versionadded:: 2.3 + + +.. function:: eval(expression[, globals[, locals]]) + + The arguments are a string and optional globals and locals. If provided, + *globals* must be a dictionary. If provided, *locals* can be any mapping + object. + + .. versionchanged:: 2.4 + Formerly *locals* was required to be a dictionary. + + The *expression* argument is parsed and evaluated as a Python expression + (technically speaking, a condition list) using the *globals* and *locals* + dictionaries as global and local name space. If the *globals* dictionary is + present and lacks '__builtins__', the current globals are copied into *globals* + before *expression* is parsed. This means that *expression* normally has full + access to the standard :mod:`__builtin__` module and restricted environments are + propagated. If the *locals* dictionary is omitted it defaults to the *globals* + dictionary. If both dictionaries are omitted, the expression is executed in the + environment where :keyword:`eval` is called. The return value is the result of + the evaluated expression. Syntax errors are reported as exceptions. Example:: + + >>> x = 1 + >>> print(eval('x+1')) + 2 + + This function can also be used to execute arbitrary code objects (such as those + created by :func:`compile`). In this case pass a code object instead of a + string. 
The code object must have been compiled passing ``'eval'`` as the + *mode* argument. + + Hints: dynamic execution of statements is supported by the :func:`exec` + function. The :func:`globals` and :func:`locals` functions + return the current global and local dictionary, respectively, which may be + useful to pass around for use by :func:`eval` or :func:`exec`. + + +.. function:: exec(object[, globals[, locals]]) + + This function supports dynamic execution of Python code. *object* must be either + a string, an open file object, or a code object. If it is a string, the string + is parsed as a suite of Python statements which is then executed (unless a + syntax error occurs). If it is an open file, the file is parsed until EOF and + executed. If it is a code object, it is simply executed. In all cases, the + code that's executed is expected to be valid as file input (see the section + "File input" in the Reference Manual). Be aware that the :keyword:`return` and + :keyword:`yield` statements may not be used outside of function definitions even + within the context of code passed to the :func:`exec` function. The return value + is ``None``. + + In all cases, if the optional parts are omitted, the code is executed in the + current scope. If only *globals* is provided, it must be a dictionary, which + will be used for both the global and the local variables. If *globals* and + *locals* are given, they are used for the global and local variables, + respectively. If provided, *locals* can be any mapping object. + + If the *globals* dictionary does not contain a value for the key + ``__builtins__``, a reference to the dictionary of the built-in module + :mod:`__builtin__` is inserted under that key. That way you can control what + builtins are available to the executed code by inserting your own + ``__builtins__`` dictionary into *globals* before passing it to :func:`exec`. + + .. 
note:: + + The built-in functions :func:`globals` and :func:`locals` return the current + global and local dictionary, respectively, which may be useful to pass around + for use as the second and third argument to :func:`exec`. + + .. warning:: + + The default *locals* act as described for function :func:`locals` below: + modifications to the default *locals* dictionary should not be attempted. Pass + an explicit *locals* dictionary if you need to see effects of the code on + *locals* after function :func:`exec` returns. :func:`exec` cannot be + used reliably to modify a function's locals. + + +.. function:: filter(function, iterable) + + Construct a list from those elements of *iterable* for which *function* returns + true. *iterable* may be either a sequence, a container which supports + iteration, or an iterator. If *iterable* is a string or a tuple, the result + also has that type; otherwise it is always a list. If *function* is ``None``, + the identity function is assumed, that is, all elements of *iterable* that are + false are removed. + + Note that ``filter(function, iterable)`` is equivalent to ``[item for item in + iterable if function(item)]`` if function is not ``None`` and ``[item for item + in iterable if item]`` if function is ``None``. + + +.. function:: float([x]) + + Convert a string or a number to floating point. If the argument is a string, it + must contain a possibly signed decimal or floating point number, possibly + embedded in whitespace. Otherwise, the argument may be a plain or long integer + or a floating point number, and a floating point number with the same value + (within Python's floating point precision) is returned. If no argument is + given, returns ``0.0``. + + .. note:: + + .. index:: + single: NaN + single: Infinity + + When passing in a string, values for NaN and Infinity may be returned, depending + on the underlying C library. 
The specific set of strings accepted which cause + these values to be returned depends entirely on the C library and is known to + vary. + + The float type is described in :ref:`typesnumeric`. + +.. function:: frozenset([iterable]) + :noindex: + + Return a frozenset object, optionally with elements taken from *iterable*. + The frozenset type is described in :ref:`types-set`. + + For other containers see the built in :class:`dict`, :class:`list`, and + :class:`tuple` classes, and the :mod:`collections` module. + + .. versionadded:: 2.4 + + +.. function:: getattr(object, name[, default]) + + Return the value of the named attribute of *object*. *name* must be a string. + If the string is the name of one of the object's attributes, the result is the + value of that attribute. For example, ``getattr(x, 'foobar')`` is equivalent to + ``x.foobar``. If the named attribute does not exist, *default* is returned if + provided, otherwise :exc:`AttributeError` is raised. + + +.. function:: globals() + + Return a dictionary representing the current global symbol table. This is always + the dictionary of the current module (inside a function or method, this is the + module where it is defined, not the module from which it is called). + + +.. function:: hasattr(object, name) + + The arguments are an object and a string. The result is ``True`` if the string + is the name of one of the object's attributes, ``False`` if not. (This is + implemented by calling ``getattr(object, name)`` and seeing whether it raises an + exception or not.) + + +.. function:: hash(object) + + Return the hash value of the object (if it has one). Hash values are integers. + They are used to quickly compare dictionary keys during a dictionary lookup. + Numeric values that compare equal have the same hash value (even if they are of + different types, as is the case for 1 and 1.0). + + +.. function:: help([object]) + + Invoke the built-in help system. (This function is intended for interactive + use.) 
If no argument is given, the interactive help system starts on the + interpreter console. If the argument is a string, then the string is looked up + as the name of a module, function, class, method, keyword, or documentation + topic, and a help page is printed on the console. If the argument is any other + kind of object, a help page on the object is generated. + + .. versionadded:: 2.2 + + +.. function:: hex(x) + + Convert an integer number to a hexadecimal string. The result is a valid Python + expression. If *x* is not a Python :class:`int` object, it has to define an + :meth:`__index__` method that returns an integer. + + .. versionchanged:: 2.4 + Formerly only returned an unsigned literal. + + +.. function:: id(object) + + Return the "identity" of an object. This is an integer (or long integer) which + is guaranteed to be unique and constant for this object during its lifetime. + Two objects with non-overlapping lifetimes may have the same :func:`id` value. + (Implementation note: this is the address of the object.) + + +.. function:: int([x[, radix]]) + + Convert a string or number to an integer. If the argument is a string, it + must contain a possibly signed number of arbitrary size, + possibly embedded in whitespace. The *radix* parameter gives the base for the + conversion and may be any integer in the range [2, 36], or zero. If *radix* is + zero, the interpretation is the same as for integer literals. If *radix* is + specified and *x* is not a string, :exc:`TypeError` is raised. Otherwise, the + argument may be another integer, a floating point number or any other object + that has an :meth:`__int__` method. Conversion + of floating point numbers to integers truncates (towards zero). If no + arguments are given, returns ``0``. + + The integer type is described in :ref:`typesnumeric`. + + +.. 
function:: isinstance(object, classinfo) + + Return true if the *object* argument is an instance of the *classinfo* argument, + or of a (direct or indirect) subclass thereof. Also return true if *classinfo* + is a type object (new-style class) and *object* is an object of that type or of + a (direct or indirect) subclass thereof. If *object* is not a class instance or + an object of the given type, the function always returns false. If *classinfo* + is neither a class object nor a type object, it may be a tuple of class or type + objects, or may recursively contain other such tuples (other sequence types are + not accepted). If *classinfo* is not a class, type, or tuple of classes, types, + and such tuples, a :exc:`TypeError` exception is raised. + + .. versionchanged:: 2.2 + Support for a tuple of type information was added. + + +.. function:: issubclass(class, classinfo) + + Return true if *class* is a subclass (direct or indirect) of *classinfo*. A + class is considered a subclass of itself. *classinfo* may be a tuple of class + objects, in which case every entry in *classinfo* will be checked. In any other + case, a :exc:`TypeError` exception is raised. + + .. versionchanged:: 2.3 + Support for a tuple of type information was added. + + +.. function:: iter(o[, sentinel]) + + Return an iterator object. The first argument is interpreted very differently + depending on the presence of the second argument. Without a second argument, *o* + must be a collection object which supports the iteration protocol (the + :meth:`__iter__` method), or it must support the sequence protocol (the + :meth:`__getitem__` method with integer arguments starting at ``0``). If it + does not support either of those protocols, :exc:`TypeError` is raised. If the + second argument, *sentinel*, is given, then *o* must be a callable object. 
The + iterator created in this case will call *o* with no arguments for each call to + its :meth:`__next__` method; if the value returned is equal to *sentinel*, + :exc:`StopIteration` will be raised, otherwise the value will be returned. + + .. versionadded:: 2.2 + + +.. function:: len(s) + + Return the length (the number of items) of an object. The argument may be a + sequence (string, tuple or list) or a mapping (dictionary). + + +.. function:: list([iterable]) + + Return a list whose items are the same and in the same order as *iterable*'s + items. *iterable* may be either a sequence, a container that supports + iteration, or an iterator object. If *iterable* is already a list, a copy is + made and returned, similar to ``iterable[:]``. For instance, ``list('abc')`` + returns ``['a', 'b', 'c']`` and ``list( (1, 2, 3) )`` returns ``[1, 2, 3]``. If + no argument is given, returns a new empty list, ``[]``. + + :class:`list` is a mutable sequence type, as documented in + :ref:`typesseq`. For other containers see the built in :class:`dict`, + :class:`set`, and :class:`tuple` classes, and the :mod:`collections` module. + + +.. function:: locals() + + Update and return a dictionary representing the current local symbol table. + + .. warning:: + + The contents of this dictionary should not be modified; changes may not affect + the values of local variables used by the interpreter. + + Free variables are returned by *locals* when it is called in a function block. + Modifications of free variables may not affect the values used by the + interpreter. Free variables are not returned in class blocks. + + +.. function:: map(function, iterable, ...) + + Apply *function* to every item of *iterable* and return a list of the results. + If additional *iterable* arguments are passed, *function* must take that many + arguments and is applied to the items from all iterables in parallel. If one + iterable is shorter than another it is assumed to be extended with ``None`` + items. 
If *function* is ``None``, the identity function is assumed; if there + are multiple arguments, :func:`map` returns a list consisting of tuples + containing the corresponding items from all iterables (a kind of transpose + operation). The *iterable* arguments may be a sequence or any iterable object; + the result is always a list. + + +.. function:: max(iterable[, args...][key]) + + With a single argument *iterable*, return the largest item of a non-empty + iterable (such as a string, tuple or list). With more than one argument, return + the largest of the arguments. + + The optional *key* argument specifies a one-argument ordering function like that + used for :meth:`list.sort`. The *key* argument, if supplied, must be in keyword + form (for example, ``max(a,b,c,key=func)``). + + .. versionchanged:: 2.5 + Added support for the optional *key* argument. + + +.. function:: min(iterable[, args...][key]) + + With a single argument *iterable*, return the smallest item of a non-empty + iterable (such as a string, tuple or list). With more than one argument, return + the smallest of the arguments. + + The optional *key* argument specifies a one-argument ordering function like that + used for :meth:`list.sort`. The *key* argument, if supplied, must be in keyword + form (for example, ``min(a,b,c,key=func)``). + + .. versionchanged:: 2.5 + Added support for the optional *key* argument. + + +.. function:: next(iterator[, default]) + + Retrieve the next item from the *iterator* by calling its :meth:`__next__` + method. If *default* is given, it is returned if the iterator is exhausted, + otherwise :exc:`StopIteration` is raised. + + +.. function:: object() + + Return a new featureless object. :class:`object` is a base for all new style + classes. It has the methods that are common to all instances of new style + classes. + + .. versionadded:: 2.2 + + .. versionchanged:: 2.3 + This function does not accept any arguments. Formerly, it accepted arguments but + ignored them. 
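The *key* argument of :func:`max` and :func:`min` and the *default* argument of :func:`next`, described in the entries above, can be sketched as follows. The ``words`` list and the other variable names are hypothetical sample data chosen for this illustration:

```python
words = ['pear', 'fig', 'banana']   # hypothetical sample data

# key must be passed in keyword form; items are compared by their length.
longest = max(words, key=len)
shortest = min(words, key=len)

it = iter(words)
first = next(it)            # first item of the iterator
next(it)
next(it)                    # the iterator is now exhausted

# With a default, an exhausted iterator returns it instead of
# raising StopIteration.
after_end = next(it, None)

print(longest, shortest, first, after_end)
```

Passing a default is convenient when exhaustion is an expected condition rather than an error; without it, the caller has to catch :exc:`StopIteration` explicitly.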
+ + +.. function:: oct(x) + + Convert an integer number to an octal string. The result is a valid Python + expression. If *x* is not a Python :class:`int` object, it has to define an + :meth:`__index__` method that returns an integer. + + .. versionchanged:: 2.4 + Formerly only returned an unsigned literal. + + +.. function:: open(filename[, mode[, bufsize]]) + + Open a file, returning an object of the :class:`file` type described in + section :ref:`bltin-file-objects`. If the file cannot be opened, + :exc:`IOError` is raised. When opening a file, it's preferable to use + :func:`open` instead of invoking the :class:`file` constructor directly. + + The first two arguments are the same as for ``stdio``'s :cfunc:`fopen`: + *filename* is the file name to be opened, and *mode* is a string indicating how + the file is to be opened. + + The most commonly-used values of *mode* are ``'r'`` for reading, ``'w'`` for + writing (truncating the file if it already exists), and ``'a'`` for appending + (which on *some* Unix systems means that *all* writes append to the end of the + file regardless of the current seek position). If *mode* is omitted, it + defaults to ``'r'``. When opening a binary file, you should append ``'b'`` to + the *mode* value to open the file in binary mode, which will improve + portability. (Appending ``'b'`` is useful even on systems that don't treat + binary and text files differently, where it serves as documentation.) See below + for more possible values of *mode*. + + .. index:: + single: line-buffered I/O + single: unbuffered I/O + single: buffer size, I/O + single: I/O control; buffering + + The optional *bufsize* argument specifies the file's desired buffer size: 0 + means unbuffered, 1 means line buffered, any other positive value means use a + buffer of (approximately) that size. A negative *bufsize* means to use the + system default, which is usually line buffered for tty devices and fully + buffered for other files. 
If omitted, the system default is used. [#]_ + + Modes ``'r+'``, ``'w+'`` and ``'a+'`` open the file for updating (note that + ``'w+'`` truncates the file). Append ``'b'`` to the mode to open the file in + binary mode, on systems that differentiate between binary and text files; on + systems that don't have this distinction, adding the ``'b'`` has no effect. + + In addition to the standard :cfunc:`fopen` values *mode* may be ``'U'`` or + ``'rU'``. Python is usually built with universal newline support; supplying + ``'U'`` opens the file as a text file, but lines may be terminated by any of the + following: the Unix end-of-line convention ``'\n'``, the Macintosh convention + ``'\r'``, or the Windows convention ``'\r\n'``. All of these external + representations are seen as ``'\n'`` by the Python program. If Python is built + without universal newline support a *mode* with ``'U'`` is the same as normal + text mode. Note that file objects so opened also have an attribute called + :attr:`newlines` which has a value of ``None`` (if no newlines have yet been + seen), ``'\n'``, ``'\r'``, ``'\r\n'``, or a tuple containing all the newline + types seen. + + Python enforces that the mode, after stripping ``'U'``, begins with ``'r'``, + ``'w'`` or ``'a'``. + + See also the :mod:`fileinput` module. + + .. versionchanged:: 2.5 + Restriction on first letter of mode string introduced. + + +.. function:: ord(c) + + Given a string of length one, return an integer representing the Unicode code + point of the character when the argument is a unicode object, or the value of + the byte when the argument is an 8-bit string. For example, ``ord('a')`` returns + the integer ``97``, ``ord(u'\u2020')`` returns ``8224``. This is the inverse of + :func:`chr` for 8-bit strings and of :func:`unichr` for unicode objects. 
If a + unicode argument is given and Python was built with UCS2 Unicode, then the + character's code point must be in the range [0..65535] inclusive; otherwise the + string length is two, and a :exc:`TypeError` will be raised. + + +.. function:: pow(x, y[, z]) + + Return *x* to the power *y*; if *z* is present, return *x* to the power *y*, + modulo *z* (computed more efficiently than ``pow(x, y) % z``). The two-argument + form ``pow(x, y)`` is equivalent to using the power operator: ``x**y``. + + The arguments must have numeric types. With mixed operand types, the coercion + rules for binary arithmetic operators apply. For int and long int operands, the + result has the same type as the operands (after coercion) unless the second + argument is negative; in that case, all arguments are converted to float and a + float result is delivered. For example, ``10**2`` returns ``100``, but + ``10**-2`` returns ``0.01``. (This last feature was added in Python 2.2. In + Python 2.1 and before, if both arguments were of integer types and the second + argument was negative, an exception was raised.) If the second argument is + negative, the third argument must be omitted. If *z* is present, *x* and *y* + must be of integer types, and *y* must be non-negative. (This restriction was + added in Python 2.2. In Python 2.1 and before, floating 3-argument ``pow()`` + returned platform-dependent results depending on floating-point rounding + accidents.) + + +.. function:: property([fget[, fset[, fdel[, doc]]]]) + + Return a property attribute for new-style classes (classes that derive from + :class:`object`). + + *fget* is a function for getting an attribute value, likewise *fset* is a + function for setting, and *fdel* a function for del'ing, an attribute. 
Typical + use is to define a managed attribute x:: + + class C(object): + def __init__(self): self._x = None + def getx(self): return self._x + def setx(self, value): self._x = value + def delx(self): del self._x + x = property(getx, setx, delx, "I'm the 'x' property.") + + If given, *doc* will be the docstring of the property attribute. Otherwise, the + property will copy *fget*'s docstring (if it exists). This makes it possible to + create read-only properties easily using :func:`property` as a decorator:: + + class Parrot(object): + def __init__(self): + self._voltage = 100000 + + @property + def voltage(self): + """Get the current voltage.""" + return self._voltage + + turns the :meth:`voltage` method into a "getter" for a read-only attribute with + the same name. + + .. versionadded:: 2.2 + + .. versionchanged:: 2.5 + Use *fget*'s docstring if no *doc* given. + + +.. function:: range([start,] stop[, step]) + + This is a versatile function to create sequences containing arithmetic + progressions. It is most often used in :keyword:`for` loops. The arguments + must be plain integers. If the *step* argument is omitted, it defaults to + ``1``. If the *start* argument is omitted, it defaults to ``0``. The full form + returns a list of plain integers ``[start, start + step, start + 2 * step, + ...]``. If *step* is positive, the last element is the largest ``start + i * + step`` less than *stop*; if *step* is negative, the last element is the smallest + ``start + i * step`` greater than *stop*. *step* must not be zero (or else + :exc:`ValueError` is raised). Example:: + + >>> list(range(10)) + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + >>> list(range(1, 11)) + [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] + >>> list(range(0, 30, 5)) + [0, 5, 10, 15, 20, 25] + >>> list(range(0, 10, 3)) + [0, 3, 6, 9] + >>> list(range(0, -10, -1)) + [0, -1, -2, -3, -4, -5, -6, -7, -8, -9] + >>> list(range(0)) + [] + >>> list(range(1, 0)) + [] + + +.. 
function:: repr(object) + + Return a string containing a printable representation of an object. This is the + same value yielded by conversions (reverse quotes). It is sometimes useful to be + able to access this operation as an ordinary function. For many types, this + function makes an attempt to return a string that would yield an object with the + same value when passed to :func:`eval`. + + +.. function:: reversed(seq) + + Return a reverse iterator. *seq* must be an object which supports the sequence + protocol (the :meth:`__len__` method and the :meth:`__getitem__` method with + integer arguments starting at ``0``). + + .. versionadded:: 2.4 + + +.. function:: round(x[, n]) + + Return the floating point value *x* rounded to *n* digits after the decimal + point. If *n* is omitted, it defaults to zero. The result is a floating point + number. Values are rounded to the closest multiple of 10 to the power minus + *n*; if two multiples are equally close, rounding is done away from 0 (so, for + example, ``round(0.5)`` is ``1.0`` and ``round(-0.5)`` is ``-1.0``). + + +.. function:: set([iterable]) + :noindex: + + Return a new set, optionally with elements taken from *iterable*. + The set type is described in :ref:`types-set`. + + For other containers see the built-in :class:`dict`, :class:`list`, and + :class:`tuple` classes, and the :mod:`collections` module. + + .. versionadded:: 2.4 + + +.. function:: setattr(object, name, value) + + This is the counterpart of :func:`getattr`. The arguments are an object, a + string and an arbitrary value. The string may name an existing attribute or a + new attribute. The function assigns the value to the attribute, provided the + object allows it. For example, ``setattr(x, 'foobar', 123)`` is equivalent to + ``x.foobar = 123``. + + +.. function:: slice([start,] stop[, step]) + + .. index:: single: Numerical Python + + Return a slice object representing the set of indices specified by + ``range(start, stop, step)``. 
The *start* and *step* arguments default to + ``None``. Slice objects have read-only data attributes :attr:`start`, + :attr:`stop` and :attr:`step` which merely return the argument values (or their + default). They have no other explicit functionality; however they are used by + Numerical Python and other third party extensions. Slice objects are also + generated when extended indexing syntax is used. For example: + ``a[start:stop:step]`` or ``a[start:stop, i]``. + + +.. function:: sorted(iterable[, cmp[, key[, reverse]]]) + + Return a new sorted list from the items in *iterable*. + + The optional arguments *cmp*, *key*, and *reverse* have the same meaning as + those for the :meth:`list.sort` method (described in section + :ref:`typesseq-mutable`). + + *cmp* specifies a custom comparison function of two arguments (iterable + elements) which should return a negative, zero or positive number depending on + whether the first argument is considered smaller than, equal to, or larger than + the second argument: ``cmp=lambda x,y: cmp(x.lower(), y.lower())`` + + *key* specifies a function of one argument that is used to extract a comparison + key from each list element: ``key=str.lower`` + + *reverse* is a boolean value. If set to ``True``, then the list elements are + sorted as if each comparison were reversed. + + In general, the *key* and *reverse* conversion processes are much faster than + specifying an equivalent *cmp* function. This is because *cmp* is called + multiple times for each list element while *key* and *reverse* touch each + element only once. + + .. versionadded:: 2.4 + + +.. function:: staticmethod(function) + + Return a static method for *function*. + + A static method does not receive an implicit first argument. To declare a static + method, use this idiom:: + + class C: + @staticmethod + def f(arg1, arg2, ...): ... + + The ``@staticmethod`` form is a function decorator -- see the description of + function definitions in :ref:`function` for details. 
+ + It can be called either on the class (such as ``C.f()``) or on an instance (such + as ``C().f()``). The instance is ignored except for its class. + + Static methods in Python are similar to those found in Java or C++. For a more + advanced concept, see :func:`classmethod` in this section. + + For more information on static methods, consult the documentation on the + standard type hierarchy in :ref:`types`. + + .. versionadded:: 2.2 + + .. versionchanged:: 2.4 + Function decorator syntax added. + + +.. function:: str([object[, encoding[, errors]]]) + + Return a string version of an object, using one of the following modes: + + If *encoding* and/or *errors* are given, :func:`str` will decode the + *object* which can either be a byte string or a character buffer using + the codec for *encoding*. The *encoding* parameter is a string giving + the name of an encoding; if the encoding is not known, :exc:`LookupError` + is raised. Error handling is done according to *errors*; this specifies the + treatment of characters which are invalid in the input encoding. If + *errors* is ``'strict'`` (the default), a :exc:`ValueError` is raised on + errors, while a value of ``'ignore'`` causes errors to be silently ignored, + and a value of ``'replace'`` causes the official Unicode replacement character, + U+FFFD, to be used to replace input characters which cannot be decoded. + See also the :mod:`codecs` module. + + When only *object* is given, this returns its nicely printable representation. + For strings, this is the string itself. The difference with ``repr(object)`` + is that ``str(object)`` does not always attempt to return a string that is + acceptable to :func:`eval`; its goal is to return a printable string. + With no arguments, this returns the empty string. + + Objects can specify what ``str(object)`` returns by defining a :meth:`__str__` + special method. 
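
The difference between ``str(object)`` and ``repr(object)``, and the role of the :meth:`__str__` special method, can be sketched as follows. The ``Point`` class is hypothetical and exists only for this illustration.

```python
class Point(object):
    """Hypothetical class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Aims to be an expression that eval() could turn back into a Point.
        return "Point(%r, %r)" % (self.x, self.y)

    def __str__(self):
        # Aims to be readable rather than evaluable.
        return "(%s, %s)" % (self.x, self.y)


p = Point(1, 2)
as_text = str(p)    # uses __str__
as_repr = repr(p)   # uses __repr__
empty = str()       # no arguments: the empty string
```

Here ``as_repr`` can be passed to :func:`eval` to rebuild an equivalent object, while ``as_text`` is meant only for display.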
+ + For more information on strings see :ref:`typesseq` which describes sequence + functionality (strings are sequences), and also the string-specific methods + described in the :ref:`string-methods` section. To output formatted strings + use template strings or the ``%`` operator described in the + :ref:`string-formatting` section. In addition see the :ref:`stringservices` + section. See also :func:`unicode`. + + +.. function:: sum(iterable[, start]) + + Sums *start* and the items of an *iterable* from left to right and returns the + total. *start* defaults to ``0``. The *iterable*'s items are normally numbers, + and are not allowed to be strings. The fast, correct way to concatenate a + sequence of strings is by calling ``''.join(sequence)``. + + .. versionadded:: 2.3 + + +.. function:: super(type[, object-or-type]) + + Return the superclass of *type*. If the second argument is omitted the super + object returned is unbound. If the second argument is an object, + ``isinstance(obj, type)`` must be true. If the second argument is a type, + ``issubclass(type2, type)`` must be true. :func:`super` only works for new-style + classes. + + A typical use for calling a cooperative superclass method is:: + + class C(B): + def meth(self, arg): + super(C, self).meth(arg) + + Note that :func:`super` is implemented as part of the binding process for + explicit dotted attribute lookups such as ``super(C, self).__getitem__(name)``. + Accordingly, :func:`super` is undefined for implicit lookups using statements or + operators such as ``super(C, self)[name]``. + + .. versionadded:: 2.2 + + +.. function:: tuple([iterable]) + + Return a tuple whose items are the same and in the same order as *iterable*'s + items. *iterable* may be a sequence, a container that supports iteration, or an + iterator object. If *iterable* is already a tuple, it is returned unchanged. + For instance, ``tuple('abc')`` returns ``('a', 'b', 'c')`` and ``tuple([1, 2, + 3])`` returns ``(1, 2, 3)``. 
If no argument is given, returns a new empty + tuple, ``()``. + + :class:`tuple` is an immutable sequence type, as documented in + :ref:`typesseq`. For other containers see the built in :class:`dict`, + :class:`list`, and :class:`set` classes, and the :mod:`collections` module. + + +.. function:: type(object) + + .. index:: object: type + + Return the type of an *object*. The return value is a type object. The + :func:`isinstance` built-in function is recommended for testing the type of an + object. + + With three arguments, :func:`type` functions as a constructor as detailed below. + + +.. function:: type(name, bases, dict) + :noindex: + + Return a new type object. This is essentially a dynamic form of the + :keyword:`class` statement. The *name* string is the class name and becomes the + :attr:`__name__` attribute; the *bases* tuple itemizes the base classes and + becomes the :attr:`__bases__` attribute; and the *dict* dictionary is the + namespace containing definitions for class body and becomes the :attr:`__dict__` + attribute. For example, the following two statements create identical + :class:`type` objects:: + + >>> class X(object): + ... a = 1 + ... + >>> X = type('X', (object,), dict(a=1)) + + .. versionadded:: 2.2 + + +.. function:: vars([object]) + + Without arguments, return a dictionary corresponding to the current local symbol + table. With a module, class or class instance object as argument (or anything + else that has a :attr:`__dict__` attribute), returns a dictionary corresponding + to the object's symbol table. The returned dictionary should not be modified: + the effects on the corresponding symbol table are undefined. [#]_ + + +.. function:: zip([iterable, ...]) + + This function returns a list of tuples, where the *i*-th tuple contains the + *i*-th element from each of the argument sequences or iterables. The returned + list is truncated in length to the length of the shortest argument sequence. 
+ When there are multiple arguments which are all of the same length, :func:`zip` + is similar to :func:`map` with an initial argument of ``None``. With a single + sequence argument, it returns a list of 1-tuples. With no arguments, it returns + an empty list. + + .. versionadded:: 2.0 + + .. versionchanged:: 2.4 + Formerly, :func:`zip` required at least one argument and ``zip()`` raised a + :exc:`TypeError` instead of returning an empty list. + +.. % --------------------------------------------------------------------------- + + +.. _non-essential-built-in-funcs: + +Non-essential Built-in Functions +================================ + +There are several built-in functions that are no longer essential to learn, know +or use in modern Python programming. They have been kept here to maintain +backwards compatibility with programs written for older versions of Python. + +Python programmers, trainers, students and bookwriters should feel free to +bypass these functions without concerns about missing something important. + + +.. function:: buffer(object[, offset[, size]]) + + The *object* argument must be an object that supports the buffer call interface + (such as strings, arrays, and buffers). A new buffer object will be created + which references the *object* argument. The buffer object will be a slice from + the beginning of *object* (or from the specified *offset*). The slice will + extend to the end of *object* (or will have a length given by the *size* + argument). + + + +.. rubric:: Footnotes + +.. [#] Specifying a buffer size currently has no effect on systems that don't have + :cfunc:`setvbuf`. The interface to specify the buffer size is not done using a + method that calls :cfunc:`setvbuf`, because that may dump core when called after + any I/O has been performed, and there's no reliable way to determine whether + this is the case. + +.. 
[#] In the current implementation, local variable bindings cannot normally be + affected this way, but variables retrieved from other scopes (such as modules) + can be. This may change. + diff --git a/Doc/library/functools.rst b/Doc/library/functools.rst new file mode 100644 index 0000000..4874b55 --- /dev/null +++ b/Doc/library/functools.rst @@ -0,0 +1,145 @@ +:mod:`functools` --- Higher order functions and operations on callable objects +============================================================================== + +.. module:: functools + :synopsis: Higher order functions and operations on callable objects. +.. moduleauthor:: Peter Harris <scav@blueyonder.co.uk> +.. moduleauthor:: Raymond Hettinger <python@rcn.com> +.. moduleauthor:: Nick Coghlan <ncoghlan@gmail.com> +.. sectionauthor:: Peter Harris <scav@blueyonder.co.uk> + + +.. versionadded:: 2.5 + +The :mod:`functools` module is for higher-order functions: functions that act on +or return other functions. In general, any callable object can be treated as a +function for the purposes of this module. + +The :mod:`functools` module defines the following function: + + +.. function:: partial(func[,*args][, **keywords]) + + Return a new :class:`partial` object which when called will behave like *func* + called with the positional arguments *args* and keyword arguments *keywords*. If + more arguments are supplied to the call, they are appended to *args*. If + additional keyword arguments are supplied, they extend and override *keywords*. 
+ Roughly equivalent to:: + + def partial(func, *args, **keywords): + def newfunc(*fargs, **fkeywords): + newkeywords = keywords.copy() + newkeywords.update(fkeywords) + return func(*(args + fargs), **newkeywords) + newfunc.func = func + newfunc.args = args + newfunc.keywords = keywords + return newfunc + + The :func:`partial` is used for partial function application which "freezes" + some portion of a function's arguments and/or keywords resulting in a new object + with a simplified signature. For example, :func:`partial` can be used to create + a callable that behaves like the :func:`int` function where the *base* argument + defaults to two:: + + >>> basetwo = partial(int, base=2) + >>> basetwo.__doc__ = 'Convert base 2 string to an int.' + >>> basetwo('10010') + 18 + + +.. function:: reduce(function, sequence[, initializer]) + + Apply *function* of two arguments cumulatively to the items of *sequence*, from + left to right, so as to reduce the sequence to a single value. For example, + ``reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])`` calculates ``((((1+2)+3)+4)+5)``. + The left argument, *x*, is the accumulated value and the right argument, *y*, is + the update value from the *sequence*. If the optional *initializer* is present, + it is placed before the items of the sequence in the calculation, and serves as + a default when the sequence is empty. If *initializer* is not given and + *sequence* contains only one item, the first item is returned. + + +.. function:: update_wrapper(wrapper, wrapped[, assigned][, updated]) + + Update a *wrapper* function to look like the *wrapped* function. The optional + arguments are tuples to specify which attributes of the original function are + assigned directly to the matching attributes on the wrapper function and which + attributes of the wrapper function are updated with the corresponding attributes + from the original function. 
The default values for these arguments are the + module level constants *WRAPPER_ASSIGNMENTS* (which assigns to the wrapper + function's *__name__*, *__module__* and *__doc__*, the documentation string) and + *WRAPPER_UPDATES* (which updates the wrapper function's *__dict__*, i.e. the + instance dictionary). + + The main intended use for this function is in decorator functions which wrap the + decorated function and return the wrapper. If the wrapper function is not + updated, the metadata of the returned function will reflect the wrapper + definition rather than the original function definition, which is typically less + than helpful. + + +.. function:: wraps(wrapped[, assigned][, updated]) + + This is a convenience function for invoking ``partial(update_wrapper, + wrapped=wrapped, assigned=assigned, updated=updated)`` as a function decorator + when defining a wrapper function. For example:: + + >>> def my_decorator(f): + ... @wraps(f) + ... def wrapper(*args, **kwds): + ... print 'Calling decorated function' + ... return f(*args, **kwds) + ... return wrapper + ... + >>> @my_decorator + ... def example(): + ... """Docstring""" + ... print 'Called example function' + ... + >>> example() + Calling decorated function + Called example function + >>> example.__name__ + 'example' + >>> example.__doc__ + 'Docstring' + + Without the use of this decorator factory, the name of the example function + would have been ``'wrapper'``, and the docstring of the original :func:`example` + would have been lost. + + +.. _partial-objects: + +:class:`partial` Objects +------------------------ + +:class:`partial` objects are callable objects created by :func:`partial`. They +have three read-only attributes: + + +.. attribute:: partial.func + + A callable object or function. Calls to the :class:`partial` object will be + forwarded to :attr:`func` with new arguments and keywords. + + +.. 
attribute:: partial.args + + The leftmost positional arguments that will be prepended to the positional + arguments provided to a :class:`partial` object call. + + +.. attribute:: partial.keywords + + The keyword arguments that will be supplied when the :class:`partial` object is + called. + +:class:`partial` objects are like :class:`function` objects in that they are +callable, weak referencable, and can have attributes. There are some important +differences. For instance, the :attr:`__name__` and :attr:`__doc__` attributes +are not created automatically. Also, :class:`partial` objects defined in +classes behave like static methods and do not transform into bound methods +during instance attribute look-up. + diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst new file mode 100644 index 0000000..70e4a6b --- /dev/null +++ b/Doc/library/gc.rst @@ -0,0 +1,211 @@ + +:mod:`gc` --- Garbage Collector interface +========================================= + +.. module:: gc + :synopsis: Interface to the cycle-detecting garbage collector. +.. moduleauthor:: Neil Schemenauer <nas@arctrix.com> +.. sectionauthor:: Neil Schemenauer <nas@arctrix.com> + + +This module provides an interface to the optional garbage collector. It +provides the ability to disable the collector, tune the collection frequency, +and set debugging options. It also provides access to unreachable objects that +the collector found but cannot free. Since the collector supplements the +reference counting already used in Python, you can disable the collector if you +are sure your program does not create reference cycles. Automatic collection +can be disabled by calling ``gc.disable()``. To debug a leaking program call +``gc.set_debug(gc.DEBUG_LEAK)``. Notice that this includes +``gc.DEBUG_SAVEALL``, causing garbage-collected objects to be saved in +gc.garbage for inspection. + +The :mod:`gc` module provides the following functions: + + +.. function:: enable() + + Enable automatic garbage collection. + + +.. 
function:: disable() + + Disable automatic garbage collection. + + +.. function:: isenabled() + + Returns true if automatic collection is enabled. + + +.. function:: collect([generation]) + + With no arguments, run a full collection. The optional argument *generation* + may be an integer specifying which generation to collect (from 0 to 2). A + :exc:`ValueError` is raised if the generation number is invalid. The number of + unreachable objects found is returned. + + .. versionchanged:: 2.5 + The optional *generation* argument was added. + + +.. function:: set_debug(flags) + + Set the garbage collection debugging flags. Debugging information will be + written to ``sys.stderr``. See below for a list of debugging flags which can be + combined using bit operations to control debugging. + + +.. function:: get_debug() + + Return the debugging flags currently set. + + +.. function:: get_objects() + + Returns a list of all objects tracked by the collector, excluding the list + returned. + + .. versionadded:: 2.2 + + +.. function:: set_threshold(threshold0[, threshold1[, threshold2]]) + + Set the garbage collection thresholds (the collection frequency). Setting + *threshold0* to zero disables collection. + + The GC classifies objects into three generations depending on how many + collection sweeps they have survived. New objects are placed in the youngest + generation (generation ``0``). If an object survives a collection it is moved + into the next older generation. Since generation ``2`` is the oldest + generation, objects in that generation remain there after a collection. In + order to decide when to run, the collector keeps track of the number of object + allocations and deallocations since the last collection. When the number of + allocations minus the number of deallocations exceeds *threshold0*, collection + starts. Initially only generation ``0`` is examined. 
If generation ``0`` has + been examined more than *threshold1* times since generation ``1`` has been + examined, then generation ``1`` is examined as well. Similarly, *threshold2* + controls the number of collections of generation ``1`` before collecting + generation ``2``. + + +.. function:: get_count() + + Return the current collection counts as a tuple of ``(count0, count1, + count2)``. + + .. versionadded:: 2.5 + + +.. function:: get_threshold() + + Return the current collection thresholds as a tuple of ``(threshold0, + threshold1, threshold2)``. + + +.. function:: get_referrers(*objs) + + Return the list of objects that directly refer to any of objs. This function + will only locate those containers which support garbage collection; extension + types which do refer to other objects but do not support garbage collection will + not be found. + + Note that objects which have already been dereferenced, but which live in cycles + and have not yet been collected by the garbage collector can be listed among the + resulting referrers. To get only currently live objects, call :func:`collect` + before calling :func:`get_referrers`. + + Care must be taken when using objects returned by :func:`get_referrers` because + some of them could still be under construction and hence in a temporarily + invalid state. Avoid using :func:`get_referrers` for any purpose other than + debugging. + + .. versionadded:: 2.2 + + +.. function:: get_referents(*objs) + + Return a list of objects directly referred to by any of the arguments. The + referents returned are those objects visited by the arguments' C-level + :attr:`tp_traverse` methods (if any), and may not be all objects actually + directly reachable. :attr:`tp_traverse` methods are supported only by objects + that support garbage collection, and are only required to visit objects that may + be involved in a cycle. 
So, for example, if an integer is directly reachable + from an argument, that integer object may or may not appear in the result list. + + .. versionadded:: 2.3 + +The following variable is provided for read-only access (you can mutate its +value but should not rebind it): + + +.. data:: garbage + + A list of objects which the collector found to be unreachable but could not be + freed (uncollectable objects). By default, this list contains only objects with + :meth:`__del__` methods. [#]_ Objects that have :meth:`__del__` methods and are + part of a reference cycle cause the entire reference cycle to be uncollectable, + including objects not necessarily in the cycle but reachable only from it. + Python doesn't collect such cycles automatically because, in general, it isn't + possible for Python to guess a safe order in which to run the :meth:`__del__` + methods. If you know a safe order, you can force the issue by examining the + *garbage* list, and explicitly breaking cycles due to your objects within the + list. Note that these objects are kept alive even so by virtue of being in the + *garbage* list, so they should be removed from *garbage* too. For example, + after breaking cycles, do ``del gc.garbage[:]`` to empty the list. It's + generally better to avoid the issue by not creating cycles containing objects + with :meth:`__del__` methods, and *garbage* can be examined in that case to + verify that no such cycles are being created. + + If :const:`DEBUG_SAVEALL` is set, then all unreachable objects will be added to + this list rather than freed. + +The following constants are provided for use with :func:`set_debug`: + + +.. data:: DEBUG_STATS + + Print statistics during collection. This information can be useful when tuning + the collection frequency. + + +.. data:: DEBUG_COLLECTABLE + + Print information on collectable objects found. + + +.. 
data:: DEBUG_UNCOLLECTABLE + + Print information of uncollectable objects found (objects which are not + reachable but cannot be freed by the collector). These objects will be added to + the ``garbage`` list. + + +.. data:: DEBUG_INSTANCES + + When :const:`DEBUG_COLLECTABLE` or :const:`DEBUG_UNCOLLECTABLE` is set, print + information about instance objects found. + + +.. data:: DEBUG_OBJECTS + + When :const:`DEBUG_COLLECTABLE` or :const:`DEBUG_UNCOLLECTABLE` is set, print + information about objects other than instance objects found. + + +.. data:: DEBUG_SAVEALL + + When set, all unreachable objects found will be appended to *garbage* rather + than being freed. This can be useful for debugging a leaking program. + + +.. data:: DEBUG_LEAK + + The debugging flags necessary for the collector to print information about a + leaking program (equal to ``DEBUG_COLLECTABLE | DEBUG_UNCOLLECTABLE | + DEBUG_INSTANCES | DEBUG_OBJECTS | DEBUG_SAVEALL``). + +.. rubric:: Footnotes + +.. [#] Prior to Python 2.2, the list contained all instance objects in unreachable + cycles, not only those with :meth:`__del__` methods. + diff --git a/Doc/library/gdbm.rst b/Doc/library/gdbm.rst new file mode 100644 index 0000000..ce27f6c --- /dev/null +++ b/Doc/library/gdbm.rst @@ -0,0 +1,122 @@ + +:mod:`gdbm` --- GNU's reinterpretation of dbm +============================================= + +.. module:: gdbm + :platform: Unix + :synopsis: GNU's reinterpretation of dbm. + + +.. index:: module: dbm + +This module is quite similar to the :mod:`dbm` module, but uses ``gdbm`` instead +to provide some additional functionality. Please note that the file formats +created by ``gdbm`` and ``dbm`` are incompatible. + +The :mod:`gdbm` module provides an interface to the GNU DBM library. ``gdbm`` +objects behave like mappings (dictionaries), except that keys and values are +always strings. 
Printing a ``gdbm`` object doesn't print the keys and values, +and the :meth:`items` and :meth:`values` methods are not supported. + +The module defines the following constant and functions: + + +.. exception:: error + + Raised on ``gdbm``\ -specific errors, such as I/O errors. :exc:`KeyError` is + raised for general mapping errors like specifying an incorrect key. + + +.. function:: open(filename, [flag, [mode]]) + + Open a ``gdbm`` database and return a ``gdbm`` object. The *filename* argument + is the name of the database file. + + The optional *flag* argument can be: + + +---------+-------------------------------------------+ + | Value | Meaning | + +=========+===========================================+ + | ``'r'`` | Open existing database for reading only | + | | (default) | + +---------+-------------------------------------------+ + | ``'w'`` | Open existing database for reading and | + | | writing | + +---------+-------------------------------------------+ + | ``'c'`` | Open database for reading and writing, | + | | creating it if it doesn't exist | + +---------+-------------------------------------------+ + | ``'n'`` | Always create a new, empty database, open | + | | for reading and writing | + +---------+-------------------------------------------+ + + The following additional characters may be appended to the flag to control + how the database is opened: + + +---------+--------------------------------------------+ + | Value | Meaning | + +=========+============================================+ + | ``'f'`` | Open the database in fast mode. Writes | + | | to the database will not be synchronized. | + +---------+--------------------------------------------+ + | ``'s'`` | Synchronized mode. This will cause changes | + | | to the database to be immediately written | + | | to the file. | + +---------+--------------------------------------------+ + | ``'u'`` | Do not lock database. 
| + +---------+--------------------------------------------+ + + Not all flags are valid for all versions of ``gdbm``. The module constant + :const:`open_flags` is a string of supported flag characters. The exception + :exc:`error` is raised if an invalid flag is specified. + + The optional *mode* argument is the Unix mode of the file, used only when the + database has to be created. It defaults to octal ``0666``. + +In addition to the dictionary-like methods, ``gdbm`` objects have the following +methods: + + +.. function:: firstkey() + + It's possible to loop over every key in the database using this method and the + :meth:`nextkey` method. The traversal is ordered by ``gdbm``'s internal hash + values, and won't be sorted by the key values. This method returns the starting + key. + + +.. function:: nextkey(key) + + Returns the key that follows *key* in the traversal. The following code prints + every key in the database ``db``, without having to create a list in memory that + contains them all:: + + k = db.firstkey() + while k is not None: + print k + k = db.nextkey(k) + + +.. function:: reorganize() + + If you have carried out a lot of deletions and would like to shrink the space + used by the ``gdbm`` file, this routine will reorganize the database. ``gdbm`` + will not shorten the length of a database file except by using this + reorganization; otherwise, deleted file space will be kept and reused as new + (key, value) pairs are added. + + +.. function:: sync() + + When the database has been opened in fast mode, this method forces any + unwritten data to be written to the disk. + + +.. seealso:: + + Module :mod:`anydbm` + Generic interface to ``dbm``\ -style databases. + + Module :mod:`whichdb` + Utility module used to determine the type of an existing database. 
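As a sketch of the traversal described for :meth:`firstkey` and :meth:`nextkey` above, the loop can be wrapped in a generator so that callers iterate lazily. The ``iter_keys`` helper and the ``FakeDB`` stand-in are hypothetical names introduced here for illustration; ``FakeDB`` only mimics the two methods so that the example runs without a ``gdbm`` file:

```python
def iter_keys(db):
    """Yield every key of *db* using the firstkey()/nextkey() protocol."""
    k = db.firstkey()
    while k is not None:
        yield k
        k = db.nextkey(k)


class FakeDB(object):
    """Hypothetical stand-in exposing only firstkey() and nextkey()."""

    def __init__(self, keys):
        self._keys = list(keys)

    def firstkey(self):
        # None signals that there is nothing to traverse.
        return self._keys[0] if self._keys else None

    def nextkey(self, key):
        # Return the key after *key*, or None at the end of the traversal.
        i = self._keys.index(key) + 1
        return self._keys[i] if i < len(self._keys) else None


keys = list(iter_keys(FakeDB(['spam', 'eggs', 'ham'])))
```

With a real database, the object returned by :func:`open` would presumably be passed to ``iter_keys`` in place of ``FakeDB``; the order of the keys would then follow ``gdbm``'s internal hashing rather than insertion order.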
+ diff --git a/Doc/library/gensuitemodule.rst b/Doc/library/gensuitemodule.rst new file mode 100644 index 0000000..3fc5254 --- /dev/null +++ b/Doc/library/gensuitemodule.rst @@ -0,0 +1,63 @@ + +:mod:`gensuitemodule` --- Generate OSA stub packages +==================================================== + +.. module:: gensuitemodule + :platform: Mac + :synopsis: Create a stub package from an OSA dictionary +.. sectionauthor:: Jack Jansen <Jack.Jansen@cwi.nl> + + +.. % \moduleauthor{Jack Jansen?}{email} + +The :mod:`gensuitemodule` module creates a Python package implementing stub code +for the AppleScript suites that are implemented by a specific application, +according to its AppleScript dictionary. + +It is usually invoked by the user through the :program:`PythonIDE`, but it can +also be run as a script from the command line (pass :option:`--help` for help on +the options) or imported from Python code. For an example of its use see +:file:`Mac/scripts/genallsuites.py` in a source distribution, which generates +the stub packages that are included in the standard library. + +It defines the following public functions: + + +.. function:: is_scriptable(application) + + Returns true if ``application``, which should be passed as a pathname, appears + to be scriptable. Take the return value with a grain of salt: :program:`Internet + Explorer` appears not to be scriptable but definitely is. + + +.. function:: processfile(application[, output, basepkgname, edit_modnames, creatorsignature, dump, verbose]) + + Create a stub package for ``application``, which should be passed as a full + pathname. For a :file:`.app` bundle this is the pathname to the bundle, not to + the executable inside the bundle; for an unbundled CFM application you pass the + filename of the application binary. 
+ + This function asks the application for its OSA terminology resources, decodes + these resources and uses the resultant data to create the Python code for the + package implementing the client stubs. + + ``output`` is the pathname where the resulting package is stored; if not + specified, a standard "save file as" dialog is presented to the user. + ``basepkgname`` is the base package on which this package will build, and + defaults to :mod:`StdSuites`. Only when generating :mod:`StdSuites` itself do + you need to specify this. ``edit_modnames`` is a dictionary that can be used to + change module names that are too ugly after name mangling. ``creatorsignature`` + can be used to override the 4-char creator code, which is normally obtained from + the :file:`PkgInfo` file in the package or from the CFM file creator signature. + When ``dump`` is given it should refer to a file object, and ``processfile`` + will stop after decoding the resources and dump the Python representation of the + terminology resources to this file. ``verbose`` should also be a file object, + and specifying it will cause ``processfile`` to tell you what it is doing. + + +.. function:: processfile_fromresource(application[, output, basepkgname, edit_modnames, creatorsignature, dump, verbose]) + + This function does the same as ``processfile``, except that it uses a different + method to get the terminology resources. It opens ``application`` as a resource + file and reads all ``"aete"`` and ``"aeut"`` resources from this file. + diff --git a/Doc/library/getopt.rst b/Doc/library/getopt.rst new file mode 100644 index 0000000..0d9641d --- /dev/null +++ b/Doc/library/getopt.rst @@ -0,0 +1,147 @@ + +:mod:`getopt` --- Parser for command line options +================================================= + +.. module:: getopt + :synopsis: Portable parser for command line options; support both short and long option + names. + + +This module helps scripts to parse the command line arguments in ``sys.argv``. 
+It supports the same conventions as the Unix :cfunc:`getopt` function (including +the special meanings of arguments of the form '``-``' and '``-``\ ``-``'). Long +options similar to those supported by GNU software may be used as well via an +optional third argument. This module provides a single function and an +exception: + +.. % That's to fool latex2html into leaving the two hyphens alone! + + +.. function:: getopt(args, options[, long_options]) + + Parses command line options and parameter list. *args* is the argument list to + be parsed, without the leading reference to the running program. Typically, this + means ``sys.argv[1:]``. *options* is the string of option letters that the + script wants to recognize, with options that require an argument followed by a + colon (``':'``; i.e., the same format that Unix :cfunc:`getopt` uses). + + .. note:: + + Unlike GNU :cfunc:`getopt`, after a non-option argument, all further arguments + are considered also non-options. This is similar to the way non-GNU Unix systems + work. + + *long_options*, if specified, must be a list of strings with the names of the + long options which should be supported. The leading ``'-``\ ``-'`` characters + should not be included in the option name. Long options which require an + argument should be followed by an equal sign (``'='``). To accept only long + options, *options* should be an empty string. Long options on the command line + can be recognized so long as they provide a prefix of the option name that + matches exactly one of the accepted options. For example, if *long_options* is + ``['foo', 'frob']``, the option :option:`--fo` will match as :option:`--foo`, + but :option:`--f` will not match uniquely, so :exc:`GetoptError` will be raised. + + The return value consists of two elements: the first is a list of ``(option, + value)`` pairs; the second is the list of program arguments left after the + option list was stripped (this is a trailing slice of *args*). 
Each + option-and-value pair returned has the option as its first element, prefixed + with a hyphen for short options (e.g., ``'-x'``) or two hyphens for long + options (e.g., ``'-``\ ``-long-option'``), and the option argument as its + second element, or an empty string if the option has no argument. The + options occur in the list in the same order in which they were found, thus + allowing multiple occurrences. Long and short options may be mixed. + + +.. function:: gnu_getopt(args, options[, long_options]) + + This function works like :func:`getopt`, except that GNU style scanning mode is + used by default. This means that option and non-option arguments may be + intermixed. The :func:`getopt` function stops processing options as soon as a + non-option argument is encountered. + + If the first character of the option string is '+', or if the environment + variable POSIXLY_CORRECT is set, then option processing stops as soon as a + non-option argument is encountered. + + .. versionadded:: 2.3 + + +.. exception:: GetoptError + + This is raised when an unrecognized option is found in the argument list or when + an option requiring an argument is given none. The argument to the exception is + a string indicating the cause of the error. For long options, an argument given + to an option which does not require one will also cause this exception to be + raised. The attributes :attr:`msg` and :attr:`opt` give the error message and + related option; if there is no specific option to which the exception relates, + :attr:`opt` is an empty string. + + .. versionchanged:: 1.6 + Introduced :exc:`GetoptError` as a synonym for :exc:`error`. + + +.. exception:: error + + Alias for :exc:`GetoptError`; for backward compatibility. 
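The difference between :func:`getopt` and :func:`gnu_getopt`, and the unique-prefix rule for long options, can be sketched as follows. The argument list is made up for illustration, and the :func:`gnu_getopt` result assumes that ``POSIXLY_CORRECT`` is not set in the environment:

```python
import getopt

argv = ['-a', 'file1', '-b', '--fo=x', 'file2']
long_options = ['foo=', 'frob']

# getopt() stops at the first non-option argument ('file1').
opts, rest = getopt.getopt(argv, 'ab', long_options)

# gnu_getopt() keeps scanning, so options and arguments may be intermixed;
# '--fo' is accepted because it is a unique prefix of '--foo'.
gnu_opts, gnu_rest = getopt.gnu_getopt(argv, 'ab', long_options)

# '--f' is a prefix of both 'foo' and 'frob', so it is rejected.
try:
    getopt.getopt(['--f'], '', long_options)
    ambiguous = False
except getopt.GetoptError:
    ambiguous = True
```

Here ``opts`` holds only ``('-a', '')`` and everything from ``'file1'`` onwards is left in ``rest``, whereas ``gnu_opts`` also picks up ``-b`` and ``--foo`` and leaves just the two file names in ``gnu_rest``.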
+ +An example using only Unix style options:: + + >>> import getopt + >>> args = '-a -b -cfoo -d bar a1 a2'.split() + >>> args + ['-a', '-b', '-cfoo', '-d', 'bar', 'a1', 'a2'] + >>> optlist, args = getopt.getopt(args, 'abc:d:') + >>> optlist + [('-a', ''), ('-b', ''), ('-c', 'foo'), ('-d', 'bar')] + >>> args + ['a1', 'a2'] + +Using long option names is equally easy:: + + >>> s = '--condition=foo --testing --output-file abc.def -x a1 a2' + >>> args = s.split() + >>> args + ['--condition=foo', '--testing', '--output-file', 'abc.def', '-x', 'a1', 'a2'] + >>> optlist, args = getopt.getopt(args, 'x', [ + ... 'condition=', 'output-file=', 'testing']) + >>> optlist + [('--condition', 'foo'), ('--testing', ''), ('--output-file', 'abc.def'), ('-x', + '')] + >>> args + ['a1', 'a2'] + +In a script, typical usage is something like this:: + + import getopt, sys + + def main(): + try: + opts, args = getopt.getopt(sys.argv[1:], "ho:v", ["help", "output="]) + except getopt.GetoptError as err: + # print help information and exit: + print str(err) # will print something like "option -a not recognized" + usage() + sys.exit(2) + output = None + verbose = False + for o, a in opts: + if o == "-v": + verbose = True + elif o in ("-h", "--help"): + usage() + sys.exit() + elif o in ("-o", "--output"): + output = a + else: + assert False, "unhandled option" + # ... + + if __name__ == "__main__": + main() + + +.. seealso:: + + Module :mod:`optparse` + More object-oriented command line option parsing. + diff --git a/Doc/library/getpass.rst b/Doc/library/getpass.rst new file mode 100644 index 0000000..45c6e53 --- /dev/null +++ b/Doc/library/getpass.rst @@ -0,0 +1,38 @@ + +:mod:`getpass` --- Portable password input +========================================== + +.. module:: getpass + :synopsis: Portable reading of passwords and retrieval of the userid. +.. moduleauthor:: Piers Lauder <piers@cs.su.oz.au> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. % Windows (& Mac?) 
support by Guido van Rossum. + +The :mod:`getpass` module provides two functions: + + +.. function:: getpass([prompt[, stream]]) + + Prompt the user for a password without echoing. The user is prompted using the + string *prompt*, which defaults to ``'Password: '``. On Unix, the prompt is + written to the file-like object *stream*, which defaults to ``sys.stdout`` (this + argument is ignored on Windows). + + Availability: Macintosh, Unix, Windows. + + .. versionchanged:: 2.5 + The *stream* parameter was added. + + +.. function:: getuser() + + Return the "login name" of the user. Availability: Unix, Windows. + + This function checks the environment variables :envvar:`LOGNAME`, + :envvar:`USER`, :envvar:`LNAME` and :envvar:`USERNAME`, in order, and returns + the value of the first one which is set to a non-empty string. If none are set, + the login name from the password database is returned on systems which support + the :mod:`pwd` module, otherwise, an exception is raised. + diff --git a/Doc/library/gettext.rst b/Doc/library/gettext.rst new file mode 100644 index 0000000..51628e6 --- /dev/null +++ b/Doc/library/gettext.rst @@ -0,0 +1,765 @@ + +:mod:`gettext` --- Multilingual internationalization services +============================================================= + +.. module:: gettext + :synopsis: Multilingual internationalization services. +.. moduleauthor:: Barry A. Warsaw <barry@zope.com> +.. sectionauthor:: Barry A. Warsaw <barry@zope.com> + + +The :mod:`gettext` module provides internationalization (I18N) and localization +(L10N) services for your Python modules and applications. It supports both the +GNU ``gettext`` message catalog API and a higher level, class-based API that may +be more appropriate for Python files. The interface described below allows you +to write your module and application messages in one natural language, and +provide a catalog of translated messages for running under different natural +languages. 
+ +Some hints on localizing your Python modules and applications are also given. + + +GNU :program:`gettext` API +-------------------------- + +The :mod:`gettext` module defines the following API, which is very similar to +the GNU :program:`gettext` API. If you use this API you will affect the +translation of your entire application globally. Often this is what you want if +your application is monolingual, with the choice of language dependent on the +locale of your user. If you are localizing a Python module, or if your +application needs to switch languages on the fly, you probably want to use the +class-based API instead. + + +.. function:: bindtextdomain(domain[, localedir]) + + Bind the *domain* to the locale directory *localedir*. More concretely, + :mod:`gettext` will look for binary :file:`.mo` files for the given domain using + the path (on Unix): :file:`localedir/language/LC_MESSAGES/domain.mo`, where + *language* is searched for in the environment variables :envvar:`LANGUAGE`, + :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG`, in that order. + + If *localedir* is omitted or ``None``, then the current binding for *domain* is + returned. [#]_ + + +.. function:: bind_textdomain_codeset(domain[, codeset]) + + Bind the *domain* to *codeset*, changing the encoding of strings returned by the + :func:`gettext` family of functions. If *codeset* is omitted, then the current + binding is returned. + + .. versionadded:: 2.4 + + +.. function:: textdomain([domain]) + + Change or query the current global domain. If *domain* is ``None``, then the + current global domain is returned, otherwise the global domain is set to + *domain*, which is returned. + + +.. function:: gettext(message) + + Return the localized translation of *message*, based on the current global + domain, language, and locale directory. This function is usually aliased as + :func:`_` in the local namespace (see examples below). + + +.. 
function:: lgettext(message) + + Equivalent to :func:`gettext`, but the translation is returned in the preferred + system encoding, if no other encoding was explicitly set with + :func:`bind_textdomain_codeset`. + + .. versionadded:: 2.4 + + +.. function:: dgettext(domain, message) + + Like :func:`gettext`, but look the message up in the specified *domain*. + + +.. function:: ldgettext(domain, message) + + Equivalent to :func:`dgettext`, but the translation is returned in the preferred + system encoding, if no other encoding was explicitly set with + :func:`bind_textdomain_codeset`. + + .. versionadded:: 2.4 + + +.. function:: ngettext(singular, plural, n) + + Like :func:`gettext`, but consider plural forms. If a translation is found, + apply the plural formula to *n*, and return the resulting message (some + languages have more than two plural forms). If no translation is found, return + *singular* if *n* is 1; return *plural* otherwise. + + The Plural formula is taken from the catalog header. It is a C or Python + expression that has a free variable *n*; the expression evaluates to the index + of the plural in the catalog. See the GNU gettext documentation for the precise + syntax to be used in :file:`.po` files and the formulas for a variety of + languages. + + .. versionadded:: 2.3 + + +.. function:: lngettext(singular, plural, n) + + Equivalent to :func:`ngettext`, but the translation is returned in the preferred + system encoding, if no other encoding was explicitly set with + :func:`bind_textdomain_codeset`. + + .. versionadded:: 2.4 + + +.. function:: dngettext(domain, singular, plural, n) + + Like :func:`ngettext`, but look the message up in the specified *domain*. + + .. versionadded:: 2.3 + + +.. function:: ldngettext(domain, singular, plural, n) + + Equivalent to :func:`dngettext`, but the translation is returned in the + preferred system encoding, if no other encoding was explicitly set with + :func:`bind_textdomain_codeset`. + + .. 
versionadded:: 2.4 + +Note that GNU :program:`gettext` also defines a :func:`dcgettext` function, but +this was deemed not useful and so it is currently unimplemented. + +Here's an example of typical usage for this API:: + + import gettext + gettext.bindtextdomain('myapplication', '/path/to/my/language/directory') + gettext.textdomain('myapplication') + _ = gettext.gettext + # ... + print _('This is a translatable string.') + + +Class-based API +--------------- + +The class-based API of the :mod:`gettext` module gives you more flexibility and +greater convenience than the GNU :program:`gettext` API. It is the recommended +way of localizing your Python applications and modules. :mod:`gettext` defines +a "translations" class which implements the parsing of GNU :file:`.mo` format +files, and has methods for returning either standard 8-bit strings or Unicode +strings. Instances of this "translations" class can also install themselves in +the built-in namespace as the function :func:`_`. + + +.. function:: find(domain[, localedir[, languages[, all]]]) + + This function implements the standard :file:`.mo` file search algorithm. It + takes a *domain*, identical to what :func:`textdomain` takes. Optional + *localedir* is as in :func:`bindtextdomain`. Optional *languages* is a list of + strings, where each string is a language code. + + If *localedir* is not given, then the default system locale directory is used. + [#]_ If *languages* is not given, then the following environment variables are + searched: :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and + :envvar:`LANG`. The first one returning a non-empty value is used for the + *languages* variable. The environment variables should contain a colon-separated + list of languages, which will be split on the colon to produce the expected list + of language code strings. 
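The environment lookup described above can be sketched in a few lines. ``languages_from_env`` is a hypothetical helper written for this illustration and is not part of :mod:`gettext`; it only mirrors the documented search order and the split on colons, and makes no attempt to reproduce anything else :func:`find` does with the resulting codes:

```python
import os


def languages_from_env(environ=None):
    """Return the language list from the first non-empty variable."""
    if environ is None:
        environ = os.environ
    for name in ('LANGUAGE', 'LC_ALL', 'LC_MESSAGES', 'LANG'):
        value = environ.get(name)
        if value:
            # Colon-separated list, e.g. 'de_DE:fr' -> ['de_DE', 'fr'].
            return value.split(':')
    return []


# LANGUAGE is set but empty, so the lookup falls through to LANG.
langs = languages_from_env({'LANGUAGE': '', 'LANG': 'de_DE:fr'})
```

Passing a plain dictionary instead of ``os.environ`` keeps the sketch independent of the environment it happens to run in.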
+ + :func:`find` then expands and normalizes the languages, and then iterates + through them, searching for an existing file built of these components: + + :file:`localedir/language/LC_MESSAGES/domain.mo` + + The first such file name that exists is returned by :func:`find`. If no such + file is found, then ``None`` is returned. If *all* is given, it returns a list + of all file names, in the order in which they appear in the languages list or + the environment variables. + + +.. function:: translation(domain[, localedir[, languages[, class_[, fallback[, codeset]]]]]) + + Return a :class:`Translations` instance based on the *domain*, *localedir*, and + *languages*, which are first passed to :func:`find` to get a list of the + associated :file:`.mo` file paths. Instances with identical :file:`.mo` file + names are cached. The actual class instantiated is either *class_* if provided, + otherwise :class:`GNUTranslations`. The class's constructor must take a single + file object argument. If provided, *codeset* will change the charset used to + encode translated strings. + + If multiple files are found, later files are used as fallbacks for earlier ones. + To allow setting the fallback, :func:`copy.copy` is used to clone each + translation object from the cache; the actual instance data is still shared with + the cache. + + If no :file:`.mo` file is found, this function raises :exc:`IOError` if + *fallback* is false (which is the default), and returns a + :class:`NullTranslations` instance if *fallback* is true. + + .. versionchanged:: 2.4 + Added the *codeset* parameter. + + +.. function:: install(domain[, localedir[, unicode [, codeset[, names]]]]) + + This installs the function :func:`_` in Python's builtin namespace, based on + *domain*, *localedir*, and *codeset* which are passed to the function + :func:`translation`. The *unicode* flag is passed to the resulting translation + object's :meth:`install` method. 
+ + For the *names* parameter, please see the description of the translation + object's :meth:`install` method. + + As seen below, you usually mark the strings in your application that are + candidates for translation, by wrapping them in a call to the :func:`_` + function, like this:: + + print _('This string will be translated.') + + For convenience, you want the :func:`_` function to be installed in Python's + builtin namespace, so it is easily accessible in all modules of your + application. + + .. versionchanged:: 2.4 + Added the *codeset* parameter. + + .. versionchanged:: 2.5 + Added the *names* parameter. + + +The :class:`NullTranslations` class +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Translation classes are what actually implement the translation of original +source file message strings to translated message strings. The base class used +by all translation classes is :class:`NullTranslations`; this provides the basic +interface you can use to write your own specialized translation classes. Here +are the methods of :class:`NullTranslations`: + + +.. method:: NullTranslations.__init__([fp]) + + Takes an optional file object *fp*, which is ignored by the base class. + Initializes "protected" instance variables *_info* and *_charset* which are set + by derived classes, as well as *_fallback*, which is set through + :meth:`add_fallback`. It then calls ``self._parse(fp)`` if *fp* is not + ``None``. + + +.. method:: NullTranslations._parse(fp) + + No-op'd in the base class, this method takes file object *fp*, and reads the + data from the file, initializing its message catalog. If you have an + unsupported message catalog file format, you should override this method to + parse your format. + + +.. method:: NullTranslations.add_fallback(fallback) + + Add *fallback* as the fallback object for the current translation object. A + translation object should consult the fallback if it cannot provide a + translation for a given message. + + +.. 
method:: NullTranslations.gettext(message) + + If a fallback has been set, forward :meth:`gettext` to the fallback. Otherwise, + return the translated message. Overridden in derived classes. + + +.. method:: NullTranslations.lgettext(message) + + If a fallback has been set, forward :meth:`lgettext` to the fallback. Otherwise, + return the translated message. Overridden in derived classes. + + .. versionadded:: 2.4 + + +.. method:: NullTranslations.ugettext(message) + + If a fallback has been set, forward :meth:`ugettext` to the fallback. Otherwise, + return the translated message as a Unicode string. Overridden in derived + classes. + + +.. method:: NullTranslations.ngettext(singular, plural, n) + + If a fallback has been set, forward :meth:`ngettext` to the fallback. Otherwise, + return the translated message. Overridden in derived classes. + + .. versionadded:: 2.3 + + +.. method:: NullTranslations.lngettext(singular, plural, n) + + If a fallback has been set, forward :meth:`lngettext` to the fallback. Otherwise, + return the translated message. Overridden in derived classes. + + .. versionadded:: 2.4 + + +.. method:: NullTranslations.ungettext(singular, plural, n) + + If a fallback has been set, forward :meth:`ungettext` to the fallback. + Otherwise, return the translated message as a Unicode string. Overridden in + derived classes. + + .. versionadded:: 2.3 + + +.. method:: NullTranslations.info() + + Return the "protected" :attr:`_info` variable. + + +.. method:: NullTranslations.charset() + + Return the "protected" :attr:`_charset` variable. + + +.. method:: NullTranslations.output_charset() + + Return the "protected" :attr:`_output_charset` variable, which defines the + encoding used to return translated messages. + + .. versionadded:: 2.4 + + +.. method:: NullTranslations.set_output_charset(charset) + + Change the "protected" :attr:`_output_charset` variable, which defines the + encoding used to return translated messages. + + .. versionadded:: 2.4 + + +.. 
method:: NullTranslations.install([unicode [, names]]) + + If the *unicode* flag is false, this method installs :meth:`self.gettext` into + the built-in namespace, binding it to ``_``. If *unicode* is true, it binds + :meth:`self.ugettext` instead. By default, *unicode* is false. + + If the *names* parameter is given, it must be a sequence containing the names of + functions you want to install in the builtin namespace in addition to :func:`_`. + Supported names are ``'gettext'`` (bound to :meth:`self.gettext` or + :meth:`self.ugettext` according to the *unicode* flag), ``'ngettext'`` (bound to + :meth:`self.ngettext` or :meth:`self.ungettext` according to the *unicode* + flag), ``'lgettext'`` and ``'lngettext'``. + + Note that this is only one way, albeit the most convenient way, to make the + :func:`_` function available to your application. Because it affects the entire + application globally, and specifically the built-in namespace, localized modules + should never install :func:`_`. Instead, they should use this code to make + :func:`_` available to their module:: + + import gettext + t = gettext.translation('mymodule', ...) + _ = t.gettext + + This puts :func:`_` only in the module's global namespace and so only affects + calls within this module. + + .. versionchanged:: 2.5 + Added the *names* parameter. + + +The :class:`GNUTranslations` class +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The :mod:`gettext` module provides one additional class derived from +:class:`NullTranslations`: :class:`GNUTranslations`. This class overrides +:meth:`_parse` to enable reading GNU :program:`gettext` format :file:`.mo` files +in both big-endian and little-endian format. It also coerces both message ids +and message strings to Unicode. + +:class:`GNUTranslations` parses optional meta-data out of the translation +catalog. It is convention with GNU :program:`gettext` to include meta-data as +the translation for the empty string. 
This meta-data is in :rfc:`822`\ -style +``key: value`` pairs, and should contain the ``Project-Id-Version`` key. If the +key ``Content-Type`` is found, then the ``charset`` property is used to +initialize the "protected" :attr:`_charset` instance variable, defaulting to +``None`` if not found. If the charset encoding is specified, then all message +ids and message strings read from the catalog are converted to Unicode using +this encoding. The :meth:`ugettext` method always returns a Unicode string, while +:meth:`gettext` returns an encoded 8-bit string. For the message id arguments +of both methods, either Unicode strings or 8-bit strings containing only +US-ASCII characters are acceptable. Note that the Unicode versions of the +methods (i.e. :meth:`ugettext` and :meth:`ungettext`) are the recommended +interface to use for internationalized Python programs. + +The entire set of key/value pairs is placed into a dictionary and set as the +"protected" :attr:`_info` instance variable. + +If the :file:`.mo` file's magic number is invalid, or if other problems occur +while reading the file, instantiating a :class:`GNUTranslations` class can raise +:exc:`IOError`. + +The following methods are overridden from the base class implementation: + + +.. method:: GNUTranslations.gettext(message) + + Look up the *message* id in the catalog and return the corresponding message + string, as an 8-bit string encoded with the catalog's charset encoding, if + known. If there is no entry in the catalog for the *message* id, and a fallback + has been set, the lookup is forwarded to the fallback's :meth:`gettext` method. + Otherwise, the *message* id is returned. + + +.. method:: GNUTranslations.lgettext(message) + + Equivalent to :meth:`gettext`, but the translation is returned in the preferred + system encoding, if no other encoding was explicitly set with + :meth:`set_output_charset`. + + .. versionadded:: 2.4 + + +.. 
method:: GNUTranslations.ugettext(message) + + Look up the *message* id in the catalog and return the corresponding message + string, as a Unicode string. If there is no entry in the catalog for the + *message* id, and a fallback has been set, the lookup is forwarded to the + fallback's :meth:`ugettext` method. Otherwise, the *message* id is returned. + + +.. method:: GNUTranslations.ngettext(singular, plural, n) + + Do a plural-forms lookup of a message id. *singular* is used as the message id + for purposes of lookup in the catalog, while *n* is used to determine which + plural form to use. The returned message string is an 8-bit string encoded with + the catalog's charset encoding, if known. + + If the message id is not found in the catalog, and a fallback is specified, the + request is forwarded to the fallback's :meth:`ngettext` method. Otherwise, when + *n* is 1 *singular* is returned, and *plural* is returned in all other cases. + + .. versionadded:: 2.3 + + +.. method:: GNUTranslations.lngettext(singular, plural, n) + + Equivalent to :meth:`ngettext`, but the translation is returned in the preferred + system encoding, if no other encoding was explicitly set with + :meth:`set_output_charset`. + + .. versionadded:: 2.4 + + +.. method:: GNUTranslations.ungettext(singular, plural, n) + + Do a plural-forms lookup of a message id. *singular* is used as the message id + for purposes of lookup in the catalog, while *n* is used to determine which + plural form to use. The returned message string is a Unicode string. + + If the message id is not found in the catalog, and a fallback is specified, the + request is forwarded to the fallback's :meth:`ungettext` method. Otherwise, + when *n* is 1 *singular* is returned, and *plural* is returned in all other + cases. 
+ + Here is an example:: + + n = len(os.listdir('.')) + cat = GNUTranslations(somefile) + message = cat.ungettext( + 'There is %(num)d file in this directory', + 'There are %(num)d files in this directory', + n) % {'num': n} + + .. versionadded:: 2.3 + + +Solaris message catalog support +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Solaris operating system defines its own binary :file:`.mo` file format, but +since no documentation can be found on this format, it is not supported at this +time. + + +The Catalog constructor +^^^^^^^^^^^^^^^^^^^^^^^ + +.. index:: single: GNOME + +GNOME uses a version of the :mod:`gettext` module by James Henstridge, but this +version has a slightly different API. Its documented usage was:: + + import gettext + cat = gettext.Catalog(domain, localedir) + _ = cat.gettext + print _('hello world') + +For compatibility with this older module, the function :func:`Catalog` is an +alias for the :func:`translation` function described above. + +One difference between this module and Henstridge's: his catalog objects +supported access through a mapping API, but this appears to be unused and so is +not currently supported. + + +Internationalizing your programs and modules +-------------------------------------------- + +Internationalization (I18N) refers to the operation by which a program is made +aware of multiple languages. Localization (L10N) refers to the adaptation of +your program, once internationalized, to the local language and cultural habits. +In order to provide multilingual messages for your Python programs, you need to +take the following steps: + +#. prepare your program or module by specially marking translatable strings + +#. run a suite of tools over your marked files to generate raw messages catalogs + +#. create language specific translations of the message catalogs + +#. 
use the :mod:`gettext` module so that message strings are properly translated + +In order to prepare your code for I18N, you need to look at all the strings in +your files. Any string that needs to be translated should be marked by wrapping +it in ``_('...')`` --- that is, a call to the function :func:`_`. For example:: + + filename = 'mylog.txt' + message = _('writing a log message') + fp = open(filename, 'w') + fp.write(message) + fp.close() + +In this example, the string ``'writing a log message'`` is marked as a candidate +for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not. + +The Python distribution comes with two tools which help you generate the message +catalogs once you've prepared your source code. These may or may not be +available from a binary distribution, but they can be found in a source +distribution, in the :file:`Tools/i18n` directory. + +The :program:`pygettext` [#]_ program scans all your Python source code looking +for the strings you previously marked as translatable. It is similar to the GNU +:program:`gettext` program except that it understands all the intricacies of +Python source code, but knows nothing about C or C++ source code. You don't +need GNU ``gettext`` unless you're also going to be translating C code (such as +C extension modules). + +:program:`pygettext` generates textual Uniforum-style human readable message +catalog :file:`.pot` files, essentially structured human readable files which +contain every marked string in the source code, along with a placeholder for the +translation strings. :program:`pygettext` is a command line script that supports +a similar command line interface as :program:`xgettext`; for details on its use, +run:: + + pygettext.py --help + +Copies of these :file:`.pot` files are then handed over to the individual human +translators who write language-specific versions for every supported natural +language. 
They send you back the filled in language-specific versions as a +:file:`.po` file. Using the :program:`msgfmt.py` [#]_ program (in the +:file:`Tools/i18n` directory), you take the :file:`.po` files from your +translators and generate the machine-readable :file:`.mo` binary catalog files. +The :file:`.mo` files are what the :mod:`gettext` module uses for the actual +translation processing during run-time. + +How you use the :mod:`gettext` module in your code depends on whether you are +internationalizing a single module or your entire application. The next two +sections will discuss each case. + + +Localizing your module +^^^^^^^^^^^^^^^^^^^^^^ + +If you are localizing your module, you must take care not to make global +changes, e.g. to the built-in namespace. You should not use the GNU ``gettext`` +API but instead the class-based API. + +Let's say your module is called "spam" and the module's various natural language +translation :file:`.mo` files reside in :file:`/usr/share/locale` in GNU +:program:`gettext` format. Here's what you would put at the top of your +module:: + + import gettext + t = gettext.translation('spam', '/usr/share/locale') + _ = t.lgettext + +If your translators were providing you with Unicode strings in their :file:`.po` +files, you'd instead do:: + + import gettext + t = gettext.translation('spam', '/usr/share/locale') + _ = t.ugettext + + +Localizing your application +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you are localizing your application, you can install the :func:`_` function +globally into the built-in namespace, usually in the main driver file of your +application. This will let all your application-specific files just use +``_('...')`` without having to explicitly install it in each file. 
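
As a rough illustration of what "installing :func:`_` into the built-in namespace" means, the sketch below uses :class:`NullTranslations`, which returns every message unchanged, so no catalog files are needed. It assumes Python 3, where the built-in namespace module is named ``builtins``; the variable names are illustrative only.

```python
import builtins
import gettext

# NullTranslations needs no .mo file: its gettext() returns the message as-is.
translations = gettext.NullTranslations()

# install() binds translations.gettext to the name "_" in the built-in
# namespace, so every module in the process can call _('...').
translations.install()

greeting = _('hello world')
installed_globally = hasattr(builtins, '_')
```

With a real catalog, the object returned by :func:`translation` would take the place of the :class:`NullTranslations` instance.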
+ +In the simple case then, you need only add the following bit of code to the main +driver file of your application:: + + import gettext + gettext.install('myapplication') + +If you need to set the locale directory or the *unicode* flag, you can pass +these into the :func:`install` function:: + + import gettext + gettext.install('myapplication', '/usr/share/locale', unicode=1) + + +Changing languages on the fly +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If your program needs to support many languages at the same time, you may want +to create multiple translation instances and then switch between them +explicitly, like so:: + + import gettext + + lang1 = gettext.translation('myapplication', languages=['en']) + lang2 = gettext.translation('myapplication', languages=['fr']) + lang3 = gettext.translation('myapplication', languages=['de']) + + # start by using language1 + lang1.install() + + # ... time goes by, user selects language 2 + lang2.install() + + # ... more time goes by, user selects language 3 + lang3.install() + + +Deferred translations +^^^^^^^^^^^^^^^^^^^^^ + +In most coding situations, strings are translated where they are coded. +Occasionally however, you need to mark strings for translation, but defer actual +translation until later. A classic example is:: + + animals = ['mollusk', + 'albatross', + 'rat', + 'penguin', + 'python', + ] + # ... + for a in animals: + print a + +Here, you want to mark the strings in the ``animals`` list as being +translatable, but you don't actually want to translate them until they are +printed. + +Here is one way you can handle this situation:: + + def _(message): return message + + animals = [_('mollusk'), + _('albatross'), + _('rat'), + _('penguin'), + _('python'), + ] + + del _ + + # ... + for a in animals: + print _(a) + +This works because the dummy definition of :func:`_` simply returns the string +unchanged. 
And this dummy definition will temporarily override any definition +of :func:`_` in the built-in namespace (until the :keyword:`del` statement). Take +care, though, if you have a previous definition of :func:`_` in the local +namespace. + +Note that the second use of :func:`_` will not identify "a" as being +translatable to the :program:`pygettext` program, since it is not a string literal. + +Another way to handle this is with the following example:: + + def N_(message): return message + + animals = [N_('mollusk'), + N_('albatross'), + N_('rat'), + N_('penguin'), + N_('python'), + ] + + # ... + for a in animals: + print _(a) + +In this case, you are marking translatable strings with the function :func:`N_`, +[#]_ which won't conflict with any definition of :func:`_`. However, you will +need to teach your message extraction program to look for translatable strings +marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support +this through the use of command line switches. + + +:func:`gettext` vs. :func:`lgettext` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In Python 2.4 the :func:`lgettext` family of functions was introduced. The +intention of these functions is to provide an alternative which is more +compliant with the current implementation of GNU gettext. Unlike +:func:`gettext`, which returns strings encoded with the same codeset used in the +translation file, :func:`lgettext` will return strings encoded with the +preferred system encoding, as returned by :func:`locale.getpreferredencoding`. +Also notice that Python 2.4 introduces new functions to explicitly choose the +codeset used in translated strings. If a codeset is explicitly set, even +:func:`lgettext` will return translated strings in the requested codeset, as +would be expected in the GNU gettext implementation. 
+ + +Acknowledgements +---------------- + +The following people contributed code, feedback, design suggestions, previous +implementations, and valuable experience to the creation of this module: + +* Peter Funk + +* James Henstridge + +* Juan David Ibáñez Palomar + +* Marc-André Lemburg + +* Martin von Löwis + +* François Pinard + +* Barry Warsaw + +* Gustavo Niemeyer + +.. rubric:: Footnotes + +.. [#] The default locale directory is system dependent; for example, on RedHat Linux + it is :file:`/usr/share/locale`, but on Solaris it is :file:`/usr/lib/locale`. + The :mod:`gettext` module does not try to support these system dependent + defaults; instead its default is :file:`sys.prefix/share/locale`. For this + reason, it is always best to call :func:`bindtextdomain` with an explicit + absolute path at the start of your application. + +.. [#] See the footnote for :func:`bindtextdomain` above. + +.. [#] François Pinard has written a program called :program:`xpot` which does a + similar job. It is available as part of his :program:`po-utils` package at http + ://po-utils.progiciels-bpi.ca/. + +.. [#] :program:`msgfmt.py` is binary compatible with GNU :program:`msgfmt` except that + it provides a simpler, all-Python implementation. With this and + :program:`pygettext.py`, you generally won't need to install the GNU + :program:`gettext` package to internationalize your Python applications. + +.. [#] The choice of :func:`N_` here is totally arbitrary; it could have just as easily + been :func:`MarkThisStringForTranslation`. + diff --git a/Doc/library/glob.rst b/Doc/library/glob.rst new file mode 100644 index 0000000..80bdac2 --- /dev/null +++ b/Doc/library/glob.rst @@ -0,0 +1,54 @@ + +:mod:`glob` --- Unix style pathname pattern expansion +===================================================== + +.. module:: glob + :synopsis: Unix shell style pathname pattern expansion. + + +.. 
index:: single: filenames; pathname expansion + +The :mod:`glob` module finds all the pathnames matching a specified pattern +according to the rules used by the Unix shell. No tilde expansion is done, but +``*``, ``?``, and character ranges expressed with ``[]`` will be correctly +matched. This is done by using the :func:`os.listdir` and +:func:`fnmatch.fnmatch` functions in concert, and not by actually invoking a +subshell. (For tilde and shell variable expansion, use +:func:`os.path.expanduser` and :func:`os.path.expandvars`.) + + +.. function:: glob(pathname) + + Return a possibly-empty list of path names that match *pathname*, which must be + a string containing a path specification. *pathname* can be either absolute + (like :file:`/usr/src/Python-1.5/Makefile`) or relative (like + :file:`../../Tools/\*/\*.gif`), and can contain shell-style wildcards. Broken + symlinks are included in the results (as in the shell). + + +.. function:: iglob(pathname) + + Return an iterator which yields the same values as :func:`glob` without actually + storing them all simultaneously. + + .. versionadded:: 2.5 + +For example, consider a directory containing only the following files: +:file:`1.gif`, :file:`2.txt`, and :file:`card.gif`. :func:`glob` will produce +the following results. Notice how any leading components of the path are +preserved. :: + + >>> import glob + >>> glob.glob('./[0-9].*') + ['./1.gif', './2.txt'] + >>> glob.glob('*.gif') + ['1.gif', 'card.gif'] + >>> glob.glob('?.gif') + ['1.gif'] + + +.. seealso:: + + Module :mod:`fnmatch` + Shell-style filename (not path) expansion + diff --git a/Doc/library/grp.rst b/Doc/library/grp.rst new file mode 100644 index 0000000..a71c308 --- /dev/null +++ b/Doc/library/grp.rst @@ -0,0 +1,63 @@ + +:mod:`grp` --- The group database +================================= + +.. module:: grp + :platform: Unix + :synopsis: The group database (getgrnam() and friends). + + +This module provides access to the Unix group database. 
It is available on all +Unix versions. + +Group database entries are reported as a tuple-like object, whose attributes +correspond to the members of the ``group`` structure (Attribute field below, see +``<grp.h>``): + ++-------+-----------+---------------------------------+ +| Index | Attribute | Meaning | ++=======+===========+=================================+ +| 0 | gr_name | the name of the group | ++-------+-----------+---------------------------------+ +| 1 | gr_passwd | the (encrypted) group password; | +| | | often empty | ++-------+-----------+---------------------------------+ +| 2 | gr_gid | the numerical group ID | ++-------+-----------+---------------------------------+ +| 3 | gr_mem | all the group members' user | +| | | names | ++-------+-----------+---------------------------------+ + +The gid is an integer, name and password are strings, and the member list is a +list of strings. (Note that most users are not explicitly listed as members of +the group they are in according to the password database. Check both databases +to get complete membership information.) + +It defines the following items: + + +.. function:: getgrgid(gid) + + Return the group database entry for the given numeric group ID. :exc:`KeyError` + is raised if the entry asked for cannot be found. + + +.. function:: getgrnam(name) + + Return the group database entry for the given group name. :exc:`KeyError` is + raised if the entry asked for cannot be found. + + +.. function:: getgrall() + + Return a list of all available group entries, in arbitrary order. + + +.. seealso:: + + Module :mod:`pwd` + An interface to the user database, similar to this. + + Module :mod:`spwd` + An interface to the shadow password database, similar to this. 
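
The group entries described in the table above can be read either by attribute or by index. The following is a minimal sketch, assuming a Unix system; ``describe_group`` is a hypothetical helper written for this illustration and is not part of :mod:`grp`.

```python
import grp
import os


def describe_group(gid):
    """Return a dict describing group *gid*, or None if it has no entry."""
    try:
        entry = grp.getgrgid(gid)
    except KeyError:
        # Raised when the group database has no entry for this ID.
        return None
    return {
        'name': entry.gr_name,          # same value as entry[0]
        'gid': entry.gr_gid,            # same value as entry[2]
        'members': list(entry.gr_mem),  # same value as entry[3]
    }


# Look up the group of the current process.
info = describe_group(os.getgid())
```

Because most users are not listed explicitly in their primary group, ``info['members']`` may well be empty even when the lookup succeeds.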
+ diff --git a/Doc/library/gzip.rst b/Doc/library/gzip.rst new file mode 100644 index 0000000..5978031 --- /dev/null +++ b/Doc/library/gzip.rst @@ -0,0 +1,68 @@ + +:mod:`gzip` --- Support for :program:`gzip` files +================================================= + +.. module:: gzip + :synopsis: Interfaces for gzip compression and decompression using file objects. + + +The data compression provided by the ``zlib`` module is compatible with that +used by the GNU compression program :program:`gzip`. Accordingly, the +:mod:`gzip` module provides the :class:`GzipFile` class to read and write +:program:`gzip`\ -format files, automatically compressing or decompressing the +data so it looks like an ordinary file object. Note that additional file +formats which can be decompressed by the :program:`gzip` and :program:`gunzip` +programs, such as those produced by :program:`compress` and :program:`pack`, +are not supported by this module. + +The module defines the following items: + + +.. class:: GzipFile([filename[, mode[, compresslevel[, fileobj]]]]) + + Constructor for the :class:`GzipFile` class, which simulates most of the methods + of a file object, with the exception of the :meth:`readinto` and + :meth:`truncate` methods. At least one of *fileobj* and *filename* must be + given a non-trivial value. + + The new class instance is based on *fileobj*, which can be a regular file, a + :class:`StringIO` object, or any other object which simulates a file. It + defaults to ``None``, in which case *filename* is opened to provide a file + object. + + When *fileobj* is not ``None``, the *filename* argument is used only for + inclusion in the :program:`gzip` file header, which may include the original + filename of the uncompressed file. It defaults to the filename of *fileobj*, if + discernible; otherwise, it defaults to the empty string, and in this case the + original filename is not included in the header. 
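
The interplay of *fileobj* and *filename* described above can be sketched as follows. This is an illustration only: it assumes Python 3 and uses ``io.BytesIO`` as the in-memory file object (standing in for the :class:`StringIO` object mentioned in the text), and ``example.txt`` is a made-up name.

```python
import gzip
import io

# Compress into an in-memory buffer. The filename is not opened; it is
# only recorded in the gzip header.
buf = io.BytesIO()
writer = gzip.GzipFile(filename='example.txt', mode='wb', fileobj=buf)
writer.write(b'hello gzip\n')
writer.close()  # finishes the gzip stream; buf itself stays open

compressed = buf.getvalue()

# Decompress again from a fresh in-memory file object.
reader = gzip.GzipFile(fileobj=io.BytesIO(compressed), mode='rb')
restored = reader.read()
reader.close()
```

Passing an explicit *mode* in both places avoids relying on what can be discerned from the file object.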
+ + The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``, + or ``'wb'``, depending on whether the file will be read or written. The default + is the mode of *fileobj* if discernible; otherwise, the default is ``'rb'``. If + not given, the 'b' flag will be added to the mode to ensure the file is opened + in binary mode for cross-platform portability. + + The *compresslevel* argument is an integer from ``1`` to ``9`` controlling the + level of compression; ``1`` is fastest and produces the least compression, and + ``9`` is slowest and produces the most compression. The default is ``9``. + + Calling a :class:`GzipFile` object's :meth:`close` method does not close + *fileobj*, since you might wish to append more material after the compressed + data. This also allows you to pass a :class:`StringIO` object opened for + writing as *fileobj*, and retrieve the resulting memory buffer using the + :class:`StringIO` object's :meth:`getvalue` method. + + +.. function:: open(filename[, mode[, compresslevel]]) + + This is a shorthand for ``GzipFile(filename,`` ``mode,`` ``compresslevel)``. + The *filename* argument is required; *mode* defaults to ``'rb'`` and + *compresslevel* defaults to ``9``. + + +.. seealso:: + + Module :mod:`zlib` + The basic data compression module needed to support the :program:`gzip` file + format. + diff --git a/Doc/library/hashlib.rst b/Doc/library/hashlib.rst new file mode 100644 index 0000000..f255554 --- /dev/null +++ b/Doc/library/hashlib.rst @@ -0,0 +1,121 @@ + +:mod:`hashlib` --- Secure hashes and message digests +==================================================== + +.. module:: hashlib + :synopsis: Secure hash and message digest algorithms. +.. moduleauthor:: Gregory P. Smith <greg@users.sourceforge.net> +.. sectionauthor:: Gregory P. Smith <greg@users.sourceforge.net> + + +.. versionadded:: 2.5 + +.. 
index:: + single: message digest, MD5 + single: secure hash algorithm, SHA1, SHA224, SHA256, SHA384, SHA512 + +This module implements a common interface to many different secure hash and +message digest algorithms. Included are the FIPS secure hash algorithms SHA1, +SHA224, SHA256, SHA384, and SHA512 (defined in FIPS 180-2) as well as RSA's MD5 +algorithm (defined in Internet :rfc:`1321`). The terms secure hash and message +digest are interchangeable. Older algorithms were called message digests. The +modern term is secure hash. + +.. warning:: + + Some algorithms have known hash collision weaknesses, see the FAQ at the end. + +There is one constructor method named for each type of :dfn:`hash`. All return +a hash object with the same simple interface. For example: use :func:`sha1` to +create a SHA1 hash object. You can now feed this object with arbitrary strings +using the :meth:`update` method. At any point you can ask it for the +:dfn:`digest` of the concatenation of the strings fed to it so far using the +:meth:`digest` or :meth:`hexdigest` methods. + +.. index:: single: OpenSSL + +Constructors for hash algorithms that are always present in this module are +:func:`md5`, :func:`sha1`, :func:`sha224`, :func:`sha256`, :func:`sha384`, and +:func:`sha512`. Additional algorithms may also be available depending upon the +OpenSSL library that Python uses on your platform. 
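
The shared interface described above can be sketched like this. The sketch assumes Python 3, where data is fed to the hash object as ``bytes``; the variable names are illustrative only.

```python
import hashlib

text = b'Nobody inspects the spammish repetition'

# Feed the data in two pieces with update() ...
incremental = hashlib.sha1()
incremental.update(b'Nobody inspects')
incremental.update(b' the spammish repetition')

# ... or pass it all at once, via the named constructor or via new().
one_shot = hashlib.sha1(text)
generic = hashlib.new('sha1', text)

# All three objects have hashed the same concatenated data.
same_digest = incremental.digest() == one_shot.digest() == generic.digest()

# hexdigest() is twice as long as digest(): two hex digits per byte.
hex_length = len(incremental.hexdigest())
```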
+ +For example, to obtain the digest of the string ``'Nobody inspects the spammish +repetition'``:: + + >>> import hashlib + >>> m = hashlib.md5() + >>> m.update("Nobody inspects") + >>> m.update(" the spammish repetition") + >>> m.digest() + '\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9' + +More condensed:: + + >>> hashlib.sha224("Nobody inspects the spammish repetition").hexdigest() + 'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2' + +A generic :func:`new` constructor that takes the string name of the desired +algorithm as its first parameter also exists to allow access to the above listed +hashes as well as any other algorithms that your OpenSSL library may offer. The +named constructors are much faster than :func:`new` and should be preferred. + +Using :func:`new` with an algorithm provided by OpenSSL:: + + >>> h = hashlib.new('ripemd160') + >>> h.update("Nobody inspects the spammish repetition") + >>> h.hexdigest() + 'cc4a5ce1b3df48aec5d22d1f16b894a0b894eccc' + +The following values are provided as constant attributes of the hash objects +returned by the constructors: + + +.. data:: digest_size + + The size of the resulting digest in bytes. + +A hash object has the following methods: + + +.. method:: hash.update(arg) + + Update the hash object with the string *arg*. Repeated calls are equivalent to + a single call with the concatenation of all the arguments: ``m.update(a); + m.update(b)`` is equivalent to ``m.update(a+b)``. + + +.. method:: hash.digest() + + Return the digest of the strings passed to the :meth:`update` method so far. + This is a string of :attr:`digest_size` bytes which may contain non-ASCII + characters, including null bytes. + + +.. method:: hash.hexdigest() + + Like :meth:`digest` except the digest is returned as a string of double length, + containing only hexadecimal digits. This may be used to exchange the value + safely in email or other non-binary environments. + + +.. 
method:: hash.copy() + + Return a copy ("clone") of the hash object. This can be used to efficiently + compute the digests of strings that share a common initial substring. + + +.. seealso:: + + Module :mod:`hmac` + A module to generate message authentication codes using hashes. + + Module :mod:`base64` + Another way to encode binary hashes for non-binary environments. + + http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf + The FIPS 180-2 publication on Secure Hash Algorithms. + + http://www.cryptography.com/cnews/hash.html + Hash Collision FAQ with information on which algorithms have known issues and + what that means regarding their use. + diff --git a/Doc/library/heapq.rst b/Doc/library/heapq.rst new file mode 100644 index 0000000..2d38c26 --- /dev/null +++ b/Doc/library/heapq.rst @@ -0,0 +1,224 @@ + +:mod:`heapq` --- Heap queue algorithm +===================================== + +.. module:: heapq + :synopsis: Heap queue algorithm (a.k.a. priority queue). +.. moduleauthor:: Kevin O'Connor +.. sectionauthor:: Guido van Rossum <guido@python.org> +.. sectionauthor:: François Pinard + + +.. % Theoretical explanation: + +.. versionadded:: 2.3 + +This module provides an implementation of the heap queue algorithm, also known +as the priority queue algorithm. + +Heaps are arrays for which ``heap[k] <= heap[2*k+1]`` and ``heap[k] <= +heap[2*k+2]`` for all *k*, counting elements from zero. For the sake of +comparison, non-existing elements are considered to be infinite. The +interesting property of a heap is that ``heap[0]`` is always its smallest +element. + +The API below differs from textbook heap algorithms in two aspects: (a) We use +zero-based indexing. This makes the relationship between the index for a node +and the indexes for its children slightly less obvious, but is more suitable +since Python uses zero-based indexing. 
(b) Our pop method returns the smallest +item, not the largest (called a "min heap" in textbooks; a "max heap" is more +common in texts because of its suitability for in-place sorting). + +These two make it possible to view the heap as a regular Python list without +surprises: ``heap[0]`` is the smallest item, and ``heap.sort()`` maintains the +heap invariant! + +To create a heap, use a list initialized to ``[]``, or you can transform a +populated list into a heap via function :func:`heapify`. + +The following functions are provided: + + +.. function:: heappush(heap, item) + + Push the value *item* onto the *heap*, maintaining the heap invariant. + + +.. function:: heappop(heap) + + Pop and return the smallest item from the *heap*, maintaining the heap + invariant. If the heap is empty, :exc:`IndexError` is raised. + + +.. function:: heapify(x) + + Transform list *x* into a heap, in-place, in linear time. + + +.. function:: heapreplace(heap, item) + + Pop and return the smallest item from the *heap*, and also push the new *item*. + The heap size doesn't change. If the heap is empty, :exc:`IndexError` is raised. + This is more efficient than :func:`heappop` followed by :func:`heappush`, and + can be more appropriate when using a fixed-size heap. Note that the value + returned may be larger than *item*! That constrains reasonable uses of this + routine unless written as part of a conditional replacement:: + + if item > heap[0]: + item = heapreplace(heap, item) + +Example of use:: + + >>> from heapq import heappush, heappop + >>> heap = [] + >>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0] + >>> for item in data: + ... heappush(heap, item) + ... + >>> ordered = [] + >>> while heap: + ... ordered.append(heappop(heap)) + ... + >>> print ordered + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + >>> data.sort() + >>> print data == ordered + True + >>> + +The module also offers three general purpose functions based on heaps. + + +.. 
function:: merge(*iterables) + + Merge multiple sorted inputs into a single sorted output (for example, merge + timestamped entries from multiple log files). Returns an iterator over the + sorted values. + + Similar to ``sorted(itertools.chain(*iterables))`` but returns an iterable, does + not pull the data into memory all at once, and assumes that each of the input + streams is already sorted (smallest to largest). + + .. versionadded:: 2.6 + + +.. function:: nlargest(n, iterable[, key]) + + Return a list with the *n* largest elements from the dataset defined by + *iterable*. *key*, if provided, specifies a function of one argument that is + used to extract a comparison key from each element in the iterable, for example + ``key=str.lower``. Equivalent to: ``sorted(iterable, key=key, + reverse=True)[:n]`` + + .. versionadded:: 2.4 + + .. versionchanged:: 2.5 + Added the optional *key* argument. + + +.. function:: nsmallest(n, iterable[, key]) + + Return a list with the *n* smallest elements from the dataset defined by + *iterable*. *key*, if provided, specifies a function of one argument that is + used to extract a comparison key from each element in the iterable, for example + ``key=str.lower``. Equivalent to: ``sorted(iterable, key=key)[:n]`` + + .. versionadded:: 2.4 + + .. versionchanged:: 2.5 + Added the optional *key* argument. + +The latter two functions perform best for smaller values of *n*. For larger +values, it is more efficient to use the :func:`sorted` function. Also, when +``n==1``, it is more efficient to use the builtin :func:`min` and :func:`max` +functions. + + +Theory +------ + +(This explanation is due to François Pinard. The Python code for this module +was contributed by Kevin O'Connor.) + +Heaps are arrays for which ``a[k] <= a[2*k+1]`` and ``a[k] <= a[2*k+2]`` for all +*k*, counting elements from 0. For the sake of comparison, non-existing +elements are considered to be infinite. 
The interesting property of a heap is +that ``a[0]`` is always its smallest element. + +The strange invariant above is meant to be an efficient memory representation +for a tournament. The numbers below are *k*, not ``a[k]``:: + + 0 + + 1 2 + + 3 4 5 6 + + 7 8 9 10 11 12 13 14 + + 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 + +In the tree above, each cell *k* is topping ``2*k+1`` and ``2*k+2``. In a usual +binary tournament we see in sports, each cell is the winner over the two cells +it tops, and we can trace the winner down the tree to see all opponents s/he +had. However, in many computer applications of such tournaments, we do not need +to trace the history of a winner. To be more memory efficient, when a winner is +promoted, we try to replace it by something else at a lower level, and the rule +becomes that a cell and the two cells it tops contain three different items, but +the top cell "wins" over the two topped cells. + +If this heap invariant is protected at all times, index 0 is clearly the overall +winner. The simplest algorithmic way to remove it and find the "next" winner is +to move some loser (let's say cell 30 in the diagram above) into the 0 position, +and then percolate this new 0 down the tree, exchanging values, until the +invariant is re-established. This is clearly logarithmic on the total number of +items in the tree. By iterating over all items, you get an O(n log n) sort. + +A nice feature of this sort is that you can efficiently insert new items while +the sort is going on, provided that the inserted items are not "better" than the +last 0'th element you extracted. This is especially useful in simulation +contexts, where the tree holds all incoming events, and the "win" condition +means the smallest scheduled time. When an event schedules other events for +execution, they are scheduled into the future, so they can easily go into the +heap. 
So, a heap is a good structure for implementing schedulers (this is what +I used for my MIDI sequencer :-). + +Various structures for implementing schedulers have been extensively studied, +and heaps are good for this, as they are reasonably speedy, the speed is almost +constant, and the worst case is not much different than the average case. +However, there are other representations which are more efficient overall, yet +the worst cases might be terrible. + +Heaps are also very useful in big disk sorts. You most probably all know that a +big sort implies producing "runs" (which are pre-sorted sequences, whose size is +usually related to the amount of CPU memory), followed by merging passes for +these runs, which merging is often very cleverly organised [#]_. It is very +important that the initial sort produces the longest runs possible. Tournaments +are a good way to achieve that. If, using all the memory available to hold a +tournament, you replace and percolate items that happen to fit the current run, +you'll produce runs which are twice the size of the memory for random input, and +much better for input fuzzily ordered. + +Moreover, if you output the 0'th item on disk and get an input which may not fit +in the current tournament (because the value "wins" over the last output value), +it cannot fit in the heap, so the size of the heap decreases. The freed memory +could be cleverly reused immediately for progressively building a second heap, +which grows at exactly the same rate the first heap is melting. When the first +heap completely vanishes, you switch heaps and start a new run. Clever and +quite effective! + +In a word, heaps are useful memory structures to know. I use them in a few +applications, and I think it is good to keep a 'heap' module around. :-) + +.. rubric:: Footnotes + +.. [#] The disk balancing algorithms which are current, nowadays, are more annoying + than clever, and this is a consequence of the seeking capabilities of the disks. 
+ On devices which cannot seek, like big tape drives, the story was quite + different, and one had to be very clever to ensure (far in advance) that each + tape movement will be the most effective possible (that is, will best + participate at "progressing" the merge). Some tapes were even able to read + backwards, and this was also used to avoid the rewinding time. Believe me, real + good tape sorts were quite spectacular to watch! From all times, sorting has + always been a Great Art! :-) + diff --git a/Doc/library/hmac.rst b/Doc/library/hmac.rst new file mode 100644 index 0000000..10d41f7 --- /dev/null +++ b/Doc/library/hmac.rst @@ -0,0 +1,61 @@ + +:mod:`hmac` --- Keyed-Hashing for Message Authentication +======================================================== + +.. module:: hmac + :synopsis: Keyed-Hashing for Message Authentication (HMAC) implementation for Python. +.. moduleauthor:: Gerhard Häring <ghaering@users.sourceforge.net> +.. sectionauthor:: Gerhard Häring <ghaering@users.sourceforge.net> + + +.. versionadded:: 2.2 + +This module implements the HMAC algorithm as described by :rfc:`2104`. + + +.. function:: new(key[, msg[, digestmod]]) + + Return a new hmac object. If *msg* is present, the method call ``update(msg)`` + is made. *digestmod* is the digest constructor or module for the HMAC object to + use. It defaults to the :func:`hashlib.md5` constructor. + + .. note:: + + The md5 hash has known weaknesses but remains the default for backwards + compatibility. Choose a better one for your application. + +An HMAC object has the following methods: + + +.. method:: hmac.update(msg) + + Update the hmac object with the string *msg*. Repeated calls are equivalent to + a single call with the concatenation of all the arguments: ``m.update(a); + m.update(b)`` is equivalent to ``m.update(a + b)``. + + +.. method:: hmac.digest() + + Return the digest of the strings passed to the :meth:`update` method so far. 
+ This string will be the same length as the *digest_size* of the digest given to + the constructor. It may contain non-ASCII characters, including NUL bytes. + + +.. method:: hmac.hexdigest() + + Like :meth:`digest` except the digest is returned as a string twice the length + containing only hexadecimal digits. This may be used to exchange the value + safely in email or other non-binary environments. + + +.. method:: hmac.copy() + + Return a copy ("clone") of the hmac object. This can be used to efficiently + compute the digests of strings that share a common initial substring. + + +.. seealso:: + + Module :mod:`hashlib` + The Python module providing secure hash functions. + diff --git a/Doc/library/hotshot.rst b/Doc/library/hotshot.rst new file mode 100644 index 0000000..f6b5b13 --- /dev/null +++ b/Doc/library/hotshot.rst @@ -0,0 +1,152 @@ + +:mod:`hotshot` --- High performance logging profiler +==================================================== + +.. module:: hotshot + :synopsis: High performance logging profiler, mostly written in C. +.. moduleauthor:: Fred L. Drake, Jr. <fdrake@acm.org> +.. sectionauthor:: Anthony Baxter <anthony@interlink.com.au> + + +.. versionadded:: 2.2 + +This module provides a nicer interface to the :mod:`_hotshot` C module. Hotshot +is a replacement for the existing :mod:`profile` module. As it's written mostly +in C, it should result in a much smaller performance impact than the existing +:mod:`profile` module. + +.. note:: + + The :mod:`hotshot` module focuses on minimizing the overhead while profiling, at + the expense of long data post-processing times. For common usages it is + recommended to use :mod:`cProfile` instead. :mod:`hotshot` is not maintained and + might be removed from the standard library in the future. + +.. versionchanged:: 2.5 + The results should be more meaningful than in the past: the timing core + contained a critical bug. + +.. warning:: + + The :mod:`hotshot` profiler does not yet work well with threads.
It is useful to + use an unthreaded script to run the profiler over the code you're interested in + measuring if at all possible. + + +.. class:: Profile(logfile[, lineevents[, linetimings]]) + + The profiler object. The argument *logfile* is the name of a log file to use for + logged profile data. The argument *lineevents* specifies whether to generate + events for every source line, or just on function call/return. It defaults to + ``0`` (only log function call/return). The argument *linetimings* specifies + whether to record timing information. It defaults to ``1`` (store timing + information). + + +.. _hotshot-objects: + +Profile Objects +--------------- + +Profile objects have the following methods: + + +.. method:: Profile.addinfo(key, value) + + Add an arbitrary labelled value to the profile output. + + +.. method:: Profile.close() + + Close the logfile and terminate the profiler. + + +.. method:: Profile.fileno() + + Return the file descriptor of the profiler's log file. + + +.. method:: Profile.run(cmd) + + Profile an :func:`exec`\ -compatible string in the script environment. The + globals from the :mod:`__main__` module are used as both the globals and locals + for the script. + + +.. method:: Profile.runcall(func, *args, **keywords) + + Profile a single call of a callable. Additional positional and keyword arguments + may be passed along; the result of the call is returned, and exceptions are + allowed to propagate cleanly, while ensuring that profiling is disabled on the + way out. + + +.. method:: Profile.runctx(cmd, globals, locals) + + Profile an :func:`exec`\ -compatible string in a specific environment. The + string is compiled before profiling begins. + + +.. method:: Profile.start() + + Start the profiler. + + +.. method:: Profile.stop() + + Stop the profiler. + + +Using hotshot data +------------------ + +.. module:: hotshot.stats + :synopsis: Statistical analysis for Hotshot + + +.. 
versionadded:: 2.2 + +This module loads hotshot profiling data into the standard :mod:`pstats` Stats +objects. + + +.. function:: load(filename) + + Load hotshot data from *filename*. Returns an instance of the + :class:`pstats.Stats` class. + + +.. seealso:: + + Module :mod:`profile` + The :mod:`profile` module's :class:`Stats` class + + +.. _hotshot-example: + +Example Usage +------------- + +Note that this example runs the Python "benchmark" pystones. It can take some +time to run, and will produce large output files. :: + + >>> import hotshot, hotshot.stats, test.pystone + >>> prof = hotshot.Profile("stones.prof") + >>> benchtime, stones = prof.runcall(test.pystone.pystones) + >>> prof.close() + >>> stats = hotshot.stats.load("stones.prof") + >>> stats.strip_dirs() + >>> stats.sort_stats('time', 'calls') + >>> stats.print_stats(20) + 850004 function calls in 10.090 CPU seconds + + Ordered by: internal time, call count + + ncalls tottime percall cumtime percall filename:lineno(function) + 1 3.295 3.295 10.090 10.090 pystone.py:79(Proc0) + 150000 1.315 0.000 1.315 0.000 pystone.py:203(Proc7) + 50000 1.313 0.000 1.463 0.000 pystone.py:229(Func2) + . + . + . + diff --git a/Doc/library/htmllib.rst b/Doc/library/htmllib.rst new file mode 100644 index 0000000..96a7d08 --- /dev/null +++ b/Doc/library/htmllib.rst @@ -0,0 +1,186 @@ + +:mod:`htmllib` --- A parser for HTML documents +============================================== + +.. module:: htmllib + :synopsis: A parser for HTML documents. + + +.. index:: + single: HTML + single: hypertext + +.. index:: + module: sgmllib + module: formatter + single: SGMLParser (in module sgmllib) + +This module defines a class which can serve as a base for parsing text files +formatted in the HyperText Mark-up Language (HTML). The class is not directly +concerned with I/O --- it must be provided with input in string form via a +method, and makes calls to methods of a "formatter" object in order to produce +output.
The :class:`HTMLParser` class is designed to be used as a base class +for other classes in order to add functionality, and allows most of its methods +to be extended or overridden. In turn, this class is derived from and extends +the :class:`SGMLParser` class defined in module :mod:`sgmllib`. The +:class:`HTMLParser` implementation supports the HTML 2.0 language as described +in :rfc:`1866`. Two implementations of formatter objects are provided in the +:mod:`formatter` module; refer to the documentation for that module for +information on the formatter interface. + +The following is a summary of the interface defined by +:class:`sgmllib.SGMLParser`: + +* The interface to feed data to an instance is through the :meth:`feed` method, + which takes a string argument. This can be called with as little or as much + text at a time as desired; ``p.feed(a); p.feed(b)`` has the same effect as + ``p.feed(a+b)``. When the data contains complete HTML markup constructs, these + are processed immediately; incomplete constructs are saved in a buffer. To + force processing of all unprocessed data, call the :meth:`close` method. + + For example, to parse the entire contents of a file, use:: + + parser.feed(open('myfile.html').read()) + parser.close() + +* The interface to define semantics for HTML tags is very simple: derive a class + and define methods called :meth:`start_tag`, :meth:`end_tag`, or :meth:`do_tag`. + The parser will call these at appropriate moments: :meth:`start_tag` or + :meth:`do_tag` is called when an opening tag of the form ``<tag ...>`` is + encountered; :meth:`end_tag` is called when a closing tag of the form ``</tag>`` + is encountered. If an opening tag requires a corresponding closing tag, like + ``<H1>`` ... ``</H1>``, the class should define the :meth:`start_tag` method; if + a tag requires no closing tag, like ``<P>``, the class should define the + :meth:`do_tag` method. + +The module defines a parser class and an exception: + + +.. 
class:: HTMLParser(formatter) + + This is the basic HTML parser class. It supports all entity names required by + the XHTML 1.0 Recommendation (http://www.w3.org/TR/xhtml1). It also defines + handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements. + + +.. exception:: HTMLParseError + + Exception raised by the :class:`HTMLParser` class when it encounters an error + while parsing. + + .. versionadded:: 2.4 + + +.. seealso:: + + Module :mod:`formatter` + Interface definition for transforming an abstract flow of formatting events into + specific output events on writer objects. + + Module :mod:`HTMLParser` + Alternate HTML parser that offers a slightly lower-level view of the input, but + is designed to work with XHTML, and does not implement some of the SGML syntax + not used in "HTML as deployed" and which isn't legal for XHTML. + + Module :mod:`htmlentitydefs` + Definition of replacement text for XHTML 1.0 entities. + + Module :mod:`sgmllib` + Base class for :class:`HTMLParser`. + + +.. _html-parser-objects: + +HTMLParser Objects +------------------ + +In addition to tag methods, the :class:`HTMLParser` class provides some +additional methods and instance variables for use within tag methods. + + +.. attribute:: HTMLParser.formatter + + This is the formatter instance associated with the parser. + + +.. attribute:: HTMLParser.nofill + + Boolean flag which should be true when whitespace should not be collapsed, or + false when it should be. In general, this should only be true when character + data is to be treated as "preformatted" text, as within a ``<PRE>`` element. + The default value is false. This affects the operation of :meth:`handle_data` + and :meth:`save_end`. + + +.. method:: HTMLParser.anchor_bgn(href, name, type) + + This method is called at the start of an anchor region. The arguments + correspond to the attributes of the ``<A>`` tag with the same names. 
The + default implementation maintains a list of hyperlinks (defined by the ``HREF`` + attribute for ``<A>`` tags) within the document. The list of hyperlinks is + available as the data attribute :attr:`anchorlist`. + + +.. method:: HTMLParser.anchor_end() + + This method is called at the end of an anchor region. The default + implementation adds a textual footnote marker using an index into the list of + hyperlinks created by :meth:`anchor_bgn`. + + +.. method:: HTMLParser.handle_image(source, alt[, ismap[, align[, width[, height]]]]) + + This method is called to handle images. The default implementation simply + passes the *alt* value to the :meth:`handle_data` method. + + +.. method:: HTMLParser.save_bgn() + + Begins saving character data in a buffer instead of sending it to the formatter + object. Retrieve the stored data via :meth:`save_end`. Use of the + :meth:`save_bgn` / :meth:`save_end` pair may not be nested. + + +.. method:: HTMLParser.save_end() + + Ends buffering character data and returns all data saved since the preceding + call to :meth:`save_bgn`. If the :attr:`nofill` flag is false, whitespace is + collapsed to single spaces. A call to this method without a preceding call to + :meth:`save_bgn` will raise a :exc:`TypeError` exception. + + +:mod:`htmlentitydefs` --- Definitions of HTML general entities +============================================================== + +.. module:: htmlentitydefs + :synopsis: Definitions of HTML general entities. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``, +and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to +provide the :attr:`entitydefs` member of the :class:`HTMLParser` class. The +definition provided here contains all the entities defined by XHTML 1.0 that +can be handled using simple textual substitution in the Latin-1 character set +(ISO-8859-1). + + +.. 
data:: entitydefs + + A dictionary mapping XHTML 1.0 entity definitions to their replacement text in + ISO Latin-1. + + +.. data:: name2codepoint + + A dictionary that maps HTML entity names to the Unicode codepoints. + + .. versionadded:: 2.3 + + +.. data:: codepoint2name + + A dictionary that maps Unicode codepoints to HTML entity names. + + .. versionadded:: 2.3 + diff --git a/Doc/library/htmlparser.rst b/Doc/library/htmlparser.rst new file mode 100644 index 0000000..85a38fb --- /dev/null +++ b/Doc/library/htmlparser.rst @@ -0,0 +1,183 @@ + +:mod:`HTMLParser` --- Simple HTML and XHTML parser +================================================== + +.. module:: HTMLParser + :synopsis: A simple parser that can handle HTML and XHTML. + + +.. versionadded:: 2.2 + +.. index:: + single: HTML + single: XHTML + +This module defines a class :class:`HTMLParser` which serves as the basis for +parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML. +Unlike the parser in :mod:`htmllib`, this parser is not based on the SGML parser +in :mod:`sgmllib`. + + +.. class:: HTMLParser() + + The :class:`HTMLParser` class is instantiated without arguments. + + An HTMLParser instance is fed HTML data and calls handler functions when tags + begin and end. The :class:`HTMLParser` class is meant to be overridden by the + user to provide a desired behavior. + + Unlike the parser in :mod:`htmllib`, this parser does not check that end tags + match start tags or call the end-tag handler for elements which are closed + implicitly by closing an outer element. + +An exception is defined as well: + + +.. exception:: HTMLParseError + + Exception raised by the :class:`HTMLParser` class when it encounters an error + while parsing. 
This exception provides three attributes: :attr:`msg` is a brief + message explaining the error, :attr:`lineno` is the number of the line on which + the broken construct was detected, and :attr:`offset` is the number of + characters into the line at which the construct starts. + +:class:`HTMLParser` instances have the following methods: + + +.. method:: HTMLParser.reset() + + Reset the instance. Loses all unprocessed data. This is called implicitly at + instantiation time. + + +.. method:: HTMLParser.feed(data) + + Feed some text to the parser. It is processed insofar as it consists of + complete elements; incomplete data is buffered until more data is fed or + :meth:`close` is called. + + +.. method:: HTMLParser.close() + + Force processing of all buffered data as if it were followed by an end-of-file + mark. This method may be redefined by a derived class to define additional + processing at the end of the input, but the redefined version should always call + the :class:`HTMLParser` base class method :meth:`close`. + + +.. method:: HTMLParser.getpos() + + Return current line number and offset. + + +.. method:: HTMLParser.get_starttag_text() + + Return the text of the most recently opened start tag. This should not normally + be needed for structured processing, but may be useful in dealing with HTML "as + deployed" or for re-generating input with minimal changes (whitespace between + attributes can be preserved, etc.). + + +.. method:: HTMLParser.handle_starttag(tag, attrs) + + This method is called to handle the start of a tag. It is intended to be + overridden by a derived class; the base class implementation does nothing. + + The *tag* argument is the name of the tag converted to lower case. The *attrs* + argument is a list of ``(name, value)`` pairs containing the attributes found + inside the tag's ``<>`` brackets. 
The *name* will be translated to lower case, + quotes in the *value* will have been removed, and character and entity references + will have been replaced. For instance, for the tag ``<A + HREF="http://www.cwi.nl/">``, this method would be called as + ``handle_starttag('a', [('href', 'http://www.cwi.nl/')])``. + + .. versionchanged:: 2.6 + All entity references from htmlentitydefs are now replaced in the attribute + values. + + +.. method:: HTMLParser.handle_startendtag(tag, attrs) + + Similar to :meth:`handle_starttag`, but called when the parser encounters an + XHTML-style empty tag (``<a .../>``). This method may be overridden by + subclasses which require this particular lexical information; the default + implementation simply calls :meth:`handle_starttag` and :meth:`handle_endtag`. + + +.. method:: HTMLParser.handle_endtag(tag) + + This method is called to handle the end tag of an element. It is intended to be + overridden by a derived class; the base class implementation does nothing. The + *tag* argument is the name of the tag converted to lower case. + + +.. method:: HTMLParser.handle_data(data) + + This method is called to process arbitrary data. It is intended to be + overridden by a derived class; the base class implementation does nothing. + + +.. method:: HTMLParser.handle_charref(name) + + This method is called to process a character reference of the form ``&#ref;``. + It is intended to be overridden by a derived class; the base class + implementation does nothing. + + +.. method:: HTMLParser.handle_entityref(name) + + This method is called to process a general entity reference of the form + ``&name;`` where *name* is a general entity reference. It is intended to be + overridden by a derived class; the base class implementation does nothing. + + +.. method:: HTMLParser.handle_comment(data) + + This method is called when a comment is encountered.
The *data* argument is + a string containing the text between the ``--`` and ``--`` delimiters, but not + the delimiters themselves. For example, the comment ``<!--text-->`` will cause + this method to be called with the argument ``'text'``. It is intended to be + overridden by a derived class; the base class implementation does nothing. + + +.. method:: HTMLParser.handle_decl(decl) + + Method called when an SGML declaration is read by the parser. The *decl* + parameter will be the entire contents of the declaration inside the ``<!``...\ + ``>`` markup. It is intended to be overridden by a derived class; the base + class implementation does nothing. + + +.. method:: HTMLParser.handle_pi(data) + + Method called when a processing instruction is encountered. The *data* + parameter will contain the entire processing instruction. For example, for the + processing instruction ``<?proc color='red'>``, this method would be called as + ``handle_pi("proc color='red'")``. It is intended to be overridden by a derived + class; the base class implementation does nothing. + + .. note:: + + The :class:`HTMLParser` class uses the SGML syntactic rules for processing + instructions. An XHTML processing instruction using the trailing ``'?'`` will + cause the ``'?'`` to be included in *data*. + + +.. 
_htmlparser-example: + +Example HTML Parser Application +------------------------------- + +As a basic example, below is a simple HTML parser that uses the +:class:`HTMLParser` class to print out tags as they are encountered:: + + from HTMLParser import HTMLParser + + class MyHTMLParser(HTMLParser): + + def handle_starttag(self, tag, attrs): + print "Encountered the beginning of a %s tag" % tag + + def handle_endtag(self, tag): + print "Encountered the end of a %s tag" % tag + diff --git a/Doc/library/httplib.rst b/Doc/library/httplib.rst new file mode 100644 index 0000000..aae2219 --- /dev/null +++ b/Doc/library/httplib.rst @@ -0,0 +1,552 @@ + +:mod:`httplib` --- HTTP protocol client +======================================= + +.. module:: httplib + :synopsis: HTTP and HTTPS protocol client (requires sockets). + + +.. index:: + pair: HTTP; protocol + single: HTTP; httplib (standard module) + +.. index:: module: urllib + +This module defines classes which implement the client side of the HTTP and +HTTPS protocols. It is normally not used directly --- the module :mod:`urllib` +uses it to handle URLs that use HTTP and HTTPS. + +.. note:: + + HTTPS support is only available if the :mod:`socket` module was compiled with + SSL support. + +.. note:: + + The public interface for this module changed substantially in Python 2.0. The + :class:`HTTP` class is retained only for backward compatibility with 1.5.2. It + should not be used in new code. Refer to the online docstrings for usage. + +The module provides the following classes: + + +.. class:: HTTPConnection(host[, port[, strict[, timeout]]]) + + An :class:`HTTPConnection` instance represents one transaction with an HTTP + server. It should be instantiated with a host and an optional port number. + If no port number is passed, the port is extracted from the host string if it + has the form ``host:port``, else the default HTTP port (80) is used.
When True, + the optional parameter *strict* causes ``BadStatusLine`` to be raised if the + status line can't be parsed as a valid HTTP/1.0 or 1.1 status line. If the + optional *timeout* parameter is given, connection attempts will time out after + that many seconds (if it is not given or ``None``, the global default timeout + setting is used). + + For example, the following calls all create instances that connect to the server + at the same host and port:: + + >>> import httplib + >>> h1 = httplib.HTTPConnection('www.cwi.nl') + >>> h2 = httplib.HTTPConnection('www.cwi.nl:80') + >>> h3 = httplib.HTTPConnection('www.cwi.nl', 80) + >>> h4 = httplib.HTTPConnection('www.cwi.nl', 80, timeout=10) + + .. versionadded:: 2.0 + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. class:: HTTPSConnection(host[, port[, key_file[, cert_file[, strict[, timeout]]]]]) + + A subclass of :class:`HTTPConnection` that uses SSL for communication with + secure servers. Default port is ``443``. *key_file* is the name of a PEM + formatted file that contains your private key. *cert_file* is a PEM formatted + certificate chain file. + + .. warning:: + + This does not do any certificate verification! + + .. versionadded:: 2.0 + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. class:: HTTPResponse(sock[, debuglevel=0][, strict=0]) + + Class whose instances are returned upon successful connection. Not instantiated + directly by the user. + + .. versionadded:: 2.0 + +The following exceptions are raised as appropriate: + + +.. exception:: HTTPException + + The base class of the other exceptions in this module. It is a subclass of + :exc:`Exception`. + + .. versionadded:: 2.0 + + +.. exception:: NotConnected + + A subclass of :exc:`HTTPException`. + + .. versionadded:: 2.0 + + +.. exception:: InvalidURL + + A subclass of :exc:`HTTPException`, raised if a port is given and is either + non-numeric or empty. + + .. versionadded:: 2.3 + + +.. exception:: UnknownProtocol + + A subclass of :exc:`HTTPException`.
+ + .. versionadded:: 2.0 + + +.. exception:: UnknownTransferEncoding + + A subclass of :exc:`HTTPException`. + + .. versionadded:: 2.0 + + +.. exception:: UnimplementedFileMode + + A subclass of :exc:`HTTPException`. + + .. versionadded:: 2.0 + + +.. exception:: IncompleteRead + + A subclass of :exc:`HTTPException`. + + .. versionadded:: 2.0 + + +.. exception:: ImproperConnectionState + + A subclass of :exc:`HTTPException`. + + .. versionadded:: 2.0 + + +.. exception:: CannotSendRequest + + A subclass of :exc:`ImproperConnectionState`. + + .. versionadded:: 2.0 + + +.. exception:: CannotSendHeader + + A subclass of :exc:`ImproperConnectionState`. + + .. versionadded:: 2.0 + + +.. exception:: ResponseNotReady + + A subclass of :exc:`ImproperConnectionState`. + + .. versionadded:: 2.0 + + +.. exception:: BadStatusLine + + A subclass of :exc:`HTTPException`. Raised if a server responds with an HTTP + status code that we don't understand. + + .. versionadded:: 2.0 + +The constants defined in this module are: + + +.. data:: HTTP_PORT + + The default port for the HTTP protocol (always ``80``). + + +.. data:: HTTPS_PORT + + The default port for the HTTPS protocol (always ``443``).
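As a rough illustration of how the port constants relate to the default port handling of :class:`HTTPConnection`, the sketch below builds three connection objects and checks that each ends up with the same host and port. It relies on the ``host`` and ``port`` instance attributes, which are not described in this document, and on the constructor not opening a socket; the ``http.client`` fallback import is an assumption for interpreters where :mod:`httplib` has been renamed.

```python
# Sketch only: the fallback import is an assumption for interpreters
# where the module is available as http.client instead of httplib.
try:
    import httplib
except ImportError:
    import http.client as httplib

# Building a connection object does not open a socket, so this runs offline.
h1 = httplib.HTTPConnection('www.cwi.nl')
h2 = httplib.HTTPConnection('www.cwi.nl:80')
h3 = httplib.HTTPConnection('www.cwi.nl', 80)

for conn in (h1, h2, h3):
    # Each object should carry the bare host name and the default HTTP port.
    assert conn.host == 'www.cwi.nl'
    assert conn.port == httplib.HTTP_PORT

print("%s %s" % (httplib.HTTP_PORT, httplib.HTTPS_PORT))
```

Comparing against ``HTTP_PORT`` rather than a literal ``80`` keeps the check tied to the module's own definition of the default.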
+ +and also the following constants for integer status codes: + ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| Constant | Value | Definition | ++==========================================+=========+=======================================================================+ +| :const:`CONTINUE` | ``100`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.1.1 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.1.1>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`SWITCHING_PROTOCOLS` | ``101`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.1.2 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.1.2>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`PROCESSING` | ``102`` | WEBDAV, `RFC 2518, Section 10.1 | +| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_102>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`OK` | ``200`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.2.1 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`CREATED` | ``201`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.2.2 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.2>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`ACCEPTED` | ``202`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.2.3 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.3>`_ | 
++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NON_AUTHORITATIVE_INFORMATION` | ``203`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.2.4 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.4>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NO_CONTENT` | ``204`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.2.5 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`RESET_CONTENT` | ``205`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.2.6 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.6>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`PARTIAL_CONTENT` | ``206`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.2.7 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.7>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`MULTI_STATUS` | ``207`` | WEBDAV `RFC 2518, Section 10.2 | +| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_207>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`IM_USED` | ``226`` | Delta encoding in HTTP, | +| | | :rfc:`3229`, Section 10.4.1 | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`MULTIPLE_CHOICES` | ``300`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.3.1 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.1>`_ | 
++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`MOVED_PERMANENTLY` | ``301`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.3.2 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.2>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`FOUND` | ``302`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.3.3 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`SEE_OTHER` | ``303`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.3.4 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.4>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NOT_MODIFIED` | ``304`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.3.5 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`USE_PROXY` | ``305`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.3.6 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.6>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`TEMPORARY_REDIRECT` | ``307`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.3.8 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.8>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`BAD_REQUEST` | ``400`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.1 | +| | | 
<http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`UNAUTHORIZED` | ``401`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.2 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`PAYMENT_REQUIRED` | ``402`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.3 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.3>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`FORBIDDEN` | ``403`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.4 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NOT_FOUND` | ``404`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.5 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`METHOD_NOT_ALLOWED` | ``405`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.6 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.6>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NOT_ACCEPTABLE` | ``406`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.7 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.7>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`PROXY_AUTHENTICATION_REQUIRED` | ``407`` | 
HTTP/1.1, `RFC 2616, Section | +| | | 10.4.8 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.8>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`REQUEST_TIMEOUT` | ``408`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.9 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.9>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`CONFLICT` | ``409`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.10 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.10>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`GONE` | ``410`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.11 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.11>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`LENGTH_REQUIRED` | ``411`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.12 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.12>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`PRECONDITION_FAILED` | ``412`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.13 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.13>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`REQUEST_ENTITY_TOO_LARGE` | ``413`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.14 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.14>`_ | 
++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`REQUEST_URI_TOO_LONG` | ``414`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.15 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.15>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`UNSUPPORTED_MEDIA_TYPE` | ``415`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.16 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.16>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`REQUESTED_RANGE_NOT_SATISFIABLE` | ``416`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.17 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.17>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`EXPECTATION_FAILED` | ``417`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.4.18 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.18>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`UNPROCESSABLE_ENTITY` | ``422`` | WEBDAV, `RFC 2518, Section 10.3 | +| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_422>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`LOCKED` | ``423`` | WEBDAV, `RFC 2518, Section 10.4 | +| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_423>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`FAILED_DEPENDENCY` | ``424`` | WEBDAV, `RFC 2518, Section 10.5 | +| | | 
<http://www.webdav.org/specs/rfc2518.html#STATUS_424>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`UPGRADE_REQUIRED` | ``426`` | HTTP Upgrade to TLS, | +| | | :rfc:`2817`, Section 6 | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`INTERNAL_SERVER_ERROR` | ``500`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.5.1 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.1>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NOT_IMPLEMENTED` | ``501`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.5.2 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.2>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`BAD_GATEWAY` | ``502`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.5.3 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.3>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`SERVICE_UNAVAILABLE` | ``503`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.5.4 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.4>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`GATEWAY_TIMEOUT` | ``504`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.5.5 | +| | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.5>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`HTTP_VERSION_NOT_SUPPORTED` | ``505`` | HTTP/1.1, `RFC 2616, Section | +| | | 10.5.6 | +| | | 
<http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.6>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`INSUFFICIENT_STORAGE` | ``507`` | WEBDAV, `RFC 2518, Section 10.6 | +| | | <http://www.webdav.org/specs/rfc2518.html#STATUS_507>`_ | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NOT_EXTENDED` | ``510`` | An HTTP Extension Framework, | +| | | :rfc:`2774`, Section 7 | ++------------------------------------------+---------+-----------------------------------------------------------------------+ + + +.. data:: responses + + This dictionary maps the HTTP/1.1 status codes to the W3C names. + + Example: ``httplib.responses[httplib.NOT_FOUND]`` is ``'Not Found'``. + + .. versionadded:: 2.5 + + +.. _httpconnection-objects: + +HTTPConnection Objects +---------------------- + +:class:`HTTPConnection` instances have the following methods: + + +.. method:: HTTPConnection.request(method, url[, body[, headers]]) + + This will send a request to the server using the HTTP request method *method* + and the selector *url*. If the *body* argument is present, it should be a + string of data to send after the headers are finished. Alternatively, it may + be an open file object, in which case the contents of the file are sent; this + file object should support ``fileno()`` and ``read()`` methods. The + ``Content-Length`` header is automatically set to the correct value. The *headers* + argument should be a mapping of extra HTTP headers to send with the request. + + .. versionchanged:: 2.6 + *body* can be a file object. + + +.. method:: HTTPConnection.getresponse() + + Should be called after a request is sent to get the response from the server. + Returns an :class:`HTTPResponse` instance. + + .. 
note:: + + You must have read the whole response before you can send a new + request to the server. + + +.. method:: HTTPConnection.set_debuglevel(level) + + Set the debugging level (the amount of debugging output printed). The default + debug level is ``0``, meaning no debugging output is printed. + + +.. method:: HTTPConnection.connect() + + Connect to the server specified when the object was created. + + +.. method:: HTTPConnection.close() + + Close the connection to the server. + +As an alternative to using the :meth:`request` method described above, you can +also send your request step by step, by using the four methods below. + + +.. method:: HTTPConnection.putrequest(request, selector[, skip_host[, skip_accept_encoding]]) + + This should be the first call after the connection to the server has been made. + It sends a line to the server consisting of the *request* string, the *selector* + string, and the HTTP version (``HTTP/1.1``). To disable automatic sending of + ``Host:`` or ``Accept-Encoding:`` headers (for example to accept additional + content encodings), specify *skip_host* or *skip_accept_encoding* with non-False + values. + + .. versionchanged:: 2.4 + *skip_accept_encoding* argument added. + + +.. method:: HTTPConnection.putheader(header, argument[, ...]) + + Send an :rfc:`822`\ -style header to the server. It sends a line to the server + consisting of the header, a colon and a space, and the first argument. If more + arguments are given, continuation lines are sent, each consisting of a tab and + an argument. + + +.. method:: HTTPConnection.endheaders() + + Send a blank line to the server, signalling the end of the headers. + + +.. method:: HTTPConnection.send(data) + + Send data to the server. This should be used directly only after the + :meth:`endheaders` method has been called and before :meth:`getresponse` is + called. + + +.. 
_httpresponse-objects: + +HTTPResponse Objects +-------------------- + +:class:`HTTPResponse` instances have the following methods and attributes: + + +.. method:: HTTPResponse.read([amt]) + + Reads and returns the response body, or up to the next *amt* bytes. + + +.. method:: HTTPResponse.getheader(name[, default]) + + Get the contents of the header *name*, or *default* if there is no matching + header. + + +.. method:: HTTPResponse.getheaders() + + Return a list of (header, value) tuples. + + .. versionadded:: 2.4 + + +.. attribute:: HTTPResponse.msg + + A :class:`mimetools.Message` instance containing the response headers. + + +.. attribute:: HTTPResponse.version + + HTTP protocol version used by server. 10 for HTTP/1.0, 11 for HTTP/1.1. + + +.. attribute:: HTTPResponse.status + + Status code returned by server. + + +.. attribute:: HTTPResponse.reason + + Reason phrase returned by server. + + +.. _httplib-examples: + +Examples +-------- + +Here is an example session that uses the ``GET`` method:: + + >>> import httplib + >>> conn = httplib.HTTPConnection("www.python.org") + >>> conn.request("GET", "/index.html") + >>> r1 = conn.getresponse() + >>> print r1.status, r1.reason + 200 OK + >>> data1 = r1.read() + >>> conn.request("GET", "/parrot.spam") + >>> r2 = conn.getresponse() + >>> print r2.status, r2.reason + 404 Not Found + >>> data2 = r2.read() + >>> conn.close() + +Here is an example session that shows how to ``POST`` requests:: + + >>> import httplib, urllib + >>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0}) + >>> headers = {"Content-type": "application/x-www-form-urlencoded", + ... 
"Accept": "text/plain"} + >>> conn = httplib.HTTPConnection("musi-cal.mojam.com:80") + >>> conn.request("POST", "/cgi-bin/query", params, headers) + >>> response = conn.getresponse() + >>> print response.status, response.reason + 200 OK + >>> data = response.read() + >>> conn.close() + diff --git a/Doc/library/i18n.rst b/Doc/library/i18n.rst new file mode 100644 index 0000000..8e57102 --- /dev/null +++ b/Doc/library/i18n.rst @@ -0,0 +1,19 @@ + +.. _i18n: + +******************** +Internationalization +******************** + +The modules described in this chapter help you write software that is +independent of language and locale by providing mechanisms for selecting a +language to be used in program messages or by tailoring output to match local +conventions. + +The list of modules described in this chapter is: + + +.. toctree:: + + gettext.rst + locale.rst diff --git a/Doc/library/ic.rst b/Doc/library/ic.rst new file mode 100644 index 0000000..d5e03bd --- /dev/null +++ b/Doc/library/ic.rst @@ -0,0 +1,119 @@ + +:mod:`ic` --- Access to the Mac OS X Internet Config +==================================================== + +.. module:: ic + :platform: Mac + :synopsis: Access to the Mac OS X Internet Config. + + +This module provides access to various internet-related preferences set through +:program:`System Preferences` or the :program:`Finder`. + +.. index:: module: icglue + +There is a low-level companion module :mod:`icglue` which provides the basic +Internet Config access functionality. This low-level module is not documented, +but the docstrings of the routines document the parameters and the routine names +are the same as for the Pascal or C API to Internet Config, so the standard IC +programmers' documentation can be used if this module is needed. + +The :mod:`ic` module defines the :exc:`error` exception and symbolic names for +all error codes Internet Config can produce; see the source for details. + + +.. 
exception:: error + + Exception raised on errors in the :mod:`ic` module. + +The :mod:`ic` module defines the following class and functions: + + +.. class:: IC([signature[, ic]]) + + Create an Internet Config object. The *signature* is a 4-character creator code of + the current application (default ``'Pyth'``) which may influence some of IC's + settings. The optional *ic* argument is a low-level ``icglue.icinstance`` + created beforehand; this may be useful if you want to get preferences from a + different config file, etc. + + +.. function:: launchurl(url[, hint]) + parseurl(data[, start[, end[, hint]]]) + mapfile(file) + maptypecreator(type, creator[, filename]) + settypecreator(file) + + These functions are "shortcuts" to the methods of the same name, described + below. + + +IC Objects +---------- + +:class:`IC` objects have a mapping interface, hence to obtain the mail address +you simply get ``ic['MailAddress']``. Assignment also works, and changes the +option in the configuration file. + +The module knows about various datatypes, and converts the internal IC +representation to a "logical" Python data structure. Running the :mod:`ic` +module standalone will run a test program that lists all keys and values in your +IC database; this will have to serve as documentation. + +If the module does not know how to represent the data, it returns an instance of +the ``ICOpaqueData`` type, with the raw data in its :attr:`data` attribute. +Objects of this type are also acceptable values for assignment. + +Besides the dictionary interface, :class:`IC` objects have the following +methods: + + +.. method:: IC.launchurl(url[, hint]) + + Parse the given URL, launch the correct application and pass it the URL. The + optional *hint* can be a scheme name such as ``'mailto:'``, in which case + incomplete URLs are completed with this scheme. If *hint* is not provided, + incomplete URLs are invalid. + + +.. 
method:: IC.parseurl(data[, start[, end[, hint]]]) + + Find a URL somewhere in *data* and return start position, end position and the + URL. The optional *start* and *end* can be used to limit the search, so for + instance if a user clicks in a long text field you can pass the whole text field + and the click-position in *start* and this routine will return the whole URL in + which the user clicked. As above, *hint* is an optional scheme used to complete + incomplete URLs. + + +.. method:: IC.mapfile(file) + + Return the mapping entry for the given *file*, which can be passed as either a + filename or an :func:`FSSpec` result, and which need not exist. + + The mapping entry is returned as a tuple ``(version, type, creator, postcreator, + flags, extension, appname, postappname, mimetype, entryname)``, where *version* + is the entry version number, *type* is the 4-character filetype, *creator* is + the 4-character creator type, *postcreator* is the 4-character creator code of + an optional application to post-process the file after downloading, *flags* are + various bits specifying whether to transfer in binary or ASCII and such, + *extension* is the filename extension for this file type, *appname* is the + printable name of the application to which this file belongs, *postappname* is + the name of the postprocessing application, *mimetype* is the MIME type of this + file and *entryname* is the name of this entry. + + +.. method:: IC.maptypecreator(type, creator[, filename]) + + Return the mapping entry for files with given 4-character *type* and *creator* + codes. The optional *filename* may be specified to further help finding the + correct entry (if the creator code is ``'????'``, for instance). + + The mapping entry is returned in the same format as for :meth:`mapfile`. + + +.. method:: IC.settypecreator(file) + + Given an existing *file*, specified either as a filename or as an :func:`FSSpec` + result, set its creator and type correctly based on its extension. 
The finder + is told about the change, so the finder icon will be updated quickly. diff --git a/Doc/library/idle.rst b/Doc/library/idle.rst new file mode 100644 index 0000000..44b59e9 --- /dev/null +++ b/Doc/library/idle.rst @@ -0,0 +1,288 @@ +.. _idle: + +Idle +==== + +.. moduleauthor:: Guido van Rossum <guido@Python.org> + + +.. % \declaremodule{standard}{idle} +.. % \modulesynopsis{A Python Integrated Development Environment} + +.. index:: + single: Idle + single: Python Editor + single: Integrated Development Environment + +Idle is the Python IDE built with the :mod:`Tkinter` GUI toolkit. + +IDLE has the following features: + +* coded in 100% pure Python, using the :mod:`Tkinter` GUI toolkit + +* cross-platform: works on Windows and Unix (on Mac OS, there are currently + problems with Tcl/Tk) + +* multi-window text editor with multiple undo, Python colorizing and many other + features, e.g. smart indent and call tips + +* Python shell window (a.k.a. interactive interpreter) + +* debugger (not complete, but you can set breakpoints, view and step) + + +Menus +----- + + +File menu +^^^^^^^^^ + +New window + create a new editing window + +Open... + open an existing file + +Open module... + open an existing module (searches sys.path) + +Class browser + show classes and methods in current file + +Path browser + show sys.path directories, modules, classes and methods + +.. index:: + single: Class browser + single: Path browser + +Save + save current window to the associated file (unsaved windows have a \* before and + after the window title) + +Save As... + save current window to new file, which becomes the associated file + +Save Copy As... 
+ save current window to different file without changing the associated file + +Close + close current window (asks to save if unsaved) + +Exit + close all windows and quit IDLE (asks to save if unsaved) + + +Edit menu +^^^^^^^^^ + +Undo + Undo last change to current window (max 1000 changes) + +Redo + Redo last undone change to current window + +Cut + Copy selection into system-wide clipboard; then delete selection + +Copy + Copy selection into system-wide clipboard + +Paste + Insert system-wide clipboard into window + +Select All + Select the entire contents of the edit buffer + +Find... + Open a search dialog box with many options + +Find again + Repeat last search + +Find selection + Search for the string in the selection + +Find in Files... + Open a search dialog box for searching files + +Replace... + Open a search-and-replace dialog box + +Go to line + Ask for a line number and show that line + +Indent region + Shift selected lines right 4 spaces + +Dedent region + Shift selected lines left 4 spaces + +Comment out region + Insert ## in front of selected lines + +Uncomment region + Remove leading # or ## from selected lines + +Tabify region + Turns *leading* stretches of spaces into tabs + +Untabify region + Turn *all* tabs into the right number of spaces + +Expand word + Expand the word you have typed to match another word in the same buffer; repeat + to get a different expansion + +Format Paragraph + Reformat the current blank-line-separated paragraph + +Import module + Import or reload the current module + +Run script + Execute the current file in the __main__ namespace + +.. index:: + single: Import module + single: Run script + + +Windows menu +^^^^^^^^^^^^ + +Zoom Height + toggles the window between normal size (24x80) and maximum height. + +The rest of this menu lists the names of all open windows; select one to bring +it to the foreground (deiconifying it if necessary). 
+ + +Debug menu (in the Python Shell window only) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Go to file/line + look around the insert point for a filename and linenumber, open the file, and + show the line. + +Open stack viewer + show the stack traceback of the last exception + +Debugger toggle + Run commands in the shell under the debugger + +JIT Stack viewer toggle + Open stack viewer on traceback + +.. index:: + single: stack viewer + single: debugger + + +Basic editing and navigation +---------------------------- + +* :kbd:`Backspace` deletes to the left; :kbd:`Del` deletes to the right + +* Arrow keys and :kbd:`Page Up`/:kbd:`Page Down` to move around + +* :kbd:`Home`/:kbd:`End` go to begin/end of line + +* :kbd:`C-Home`/:kbd:`C-End` go to begin/end of file + +* Some :program:`Emacs` bindings may also work, including :kbd:`C-B`, + :kbd:`C-P`, :kbd:`C-A`, :kbd:`C-E`, :kbd:`C-D`, :kbd:`C-L` + + +Automatic indentation +^^^^^^^^^^^^^^^^^^^^^ + +After a block-opening statement, the next line is indented by 4 spaces (in the +Python Shell window by one tab). After certain keywords (break, return etc.) +the next line is dedented. In leading indentation, :kbd:`Backspace` deletes up +to 4 spaces if they are there. :kbd:`Tab` inserts 1-4 spaces (in the Python +Shell window one tab). See also the indent/dedent region commands in the edit +menu. + + +Python Shell window +^^^^^^^^^^^^^^^^^^^ + +* :kbd:`C-C` interrupts executing command + +* :kbd:`C-D` sends end-of-file; closes window if typed at a ``>>>`` prompt + +* :kbd:`Alt-p` retrieves previous command matching what you have typed + +* :kbd:`Alt-n` retrieves next + +* :kbd:`Return` while on any previous command retrieves that command + +* :kbd:`Alt-/` (Expand word) is also useful here + +.. index:: single: indentation + + +Syntax colors +------------- + +The coloring is applied in a background "thread," so you may occasionally see +uncolorized text. 
To change the color scheme, edit the ``[Colors]`` section in +:file:`config.txt`. + +Python syntax colors: + Keywords + orange + + Strings + green + + Comments + red + + Definitions + blue + +Shell colors: + Console output + brown + + stdout + blue + + stderr + dark green + + stdin + black + + +Command line usage +^^^^^^^^^^^^^^^^^^ + +:: + + idle.py [-c command] [-d] [-e] [-s] [-t title] [arg] ... + + -c command run this command + -d enable debugger + -e edit mode; arguments are files to be edited + -s run $IDLESTARTUP or $PYTHONSTARTUP first + -t title set title of shell window + +If there are arguments: + +#. If :option:`-e` is used, arguments are files opened for editing and + ``sys.argv`` reflects the arguments passed to IDLE itself. + +#. Otherwise, if :option:`-c` is used, all arguments are placed in + ``sys.argv[1:...]``, with ``sys.argv[0]`` set to ``'-c'``. + +#. Otherwise, if neither :option:`-e` nor :option:`-c` is used, the first + argument is a script which is executed with the remaining arguments in + ``sys.argv[1:...]`` and ``sys.argv[0]`` set to the script name. If the script + name is '-', no script is executed but an interactive Python session is started; + the arguments are still available in ``sys.argv``. + + diff --git a/Doc/library/imaplib.rst b/Doc/library/imaplib.rst new file mode 100644 index 0000000..fc7c230 --- /dev/null +++ b/Doc/library/imaplib.rst @@ -0,0 +1,540 @@ + +:mod:`imaplib` --- IMAP4 protocol client +======================================== + +.. module:: imaplib + :synopsis: IMAP4 protocol client (requires sockets). +.. moduleauthor:: Piers Lauder <piers@communitysolutions.com.au> +.. sectionauthor:: Piers Lauder <piers@communitysolutions.com.au> + + +.. index:: + pair: IMAP4; protocol + pair: IMAP4_SSL; protocol + pair: IMAP4_stream; protocol + +.. % Based on HTML documentation by Piers Lauder +.. % <piers@communitysolutions.com.au>; +.. % converted by Fred L. Drake, Jr. <fdrake@acm.org>. +.. % Revised by ESR, January 2000. 
+.. % Changes for IMAP4_SSL by Tino Lange <Tino.Lange@isg.de>, March 2002 +.. % Changes for IMAP4_stream by Piers Lauder +.. % <piers@communitysolutions.com.au>, November 2002 + +This module defines three classes, :class:`IMAP4`, :class:`IMAP4_SSL` and +:class:`IMAP4_stream`, which encapsulate a connection to an IMAP4 server and +implement a large subset of the IMAP4rev1 client protocol as defined in +:rfc:`2060`. It is backward compatible with IMAP4 (:rfc:`1730`) servers, but +note that the ``STATUS`` command is not supported in IMAP4. + +Three classes are provided by the :mod:`imaplib` module; :class:`IMAP4` is the +base class: + + +.. class:: IMAP4([host[, port]]) + + This class implements the actual IMAP4 protocol. The connection is created and + protocol version (IMAP4 or IMAP4rev1) is determined when the instance is + initialized. If *host* is not specified, ``''`` (the local host) is used. If + *port* is omitted, the standard IMAP4 port (143) is used. + +Three exceptions are defined as attributes of the :class:`IMAP4` class: + + +.. exception:: IMAP4.error + + Exception raised on any errors. The reason for the exception is passed to the + constructor as a string. + + +.. exception:: IMAP4.abort + + IMAP4 server errors cause this exception to be raised. This is a sub-class of + :exc:`IMAP4.error`. Note that closing the instance and instantiating a new one + will usually allow recovery from this exception. + + +.. exception:: IMAP4.readonly + + This exception is raised when a writable mailbox has its status changed by the + server. This is a sub-class of :exc:`IMAP4.error`. Some other client now has + write permission, and the mailbox will need to be re-opened to re-obtain write + permission. + +There's also a subclass for secure connections: + + +.. 
class:: IMAP4_SSL([host[, port[, keyfile[, certfile]]]]) + + This is a subclass derived from :class:`IMAP4` that connects over an SSL + encrypted socket (to use this class you need a socket module that was compiled + with SSL support). If *host* is not specified, ``''`` (the local host) is used. + If *port* is omitted, the standard IMAP4-over-SSL port (993) is used. *keyfile* + and *certfile* are also optional - they can contain a PEM formatted private key + and certificate chain file for the SSL connection. + +The second subclass allows for connections created by a child process: + + +.. class:: IMAP4_stream(command) + + This is a subclass derived from :class:`IMAP4` that connects to the + ``stdin/stdout`` file descriptors created by passing *command* to + ``os.popen2()``. + + .. versionadded:: 2.3 + +The following utility functions are defined: + + +.. function:: Internaldate2tuple(datestr) + + Converts an IMAP4 INTERNALDATE string to Coordinated Universal Time. Returns a + :mod:`time` module tuple. + + +.. function:: Int2AP(num) + + Converts an integer into a string representation using characters from the set + [``A`` .. ``P``]. + + +.. function:: ParseFlags(flagstr) + + Converts an IMAP4 ``FLAGS`` response to a tuple of individual flags. + + +.. function:: Time2Internaldate(date_time) + + Converts a :mod:`time` module tuple to an IMAP4 ``INTERNALDATE`` representation. + Returns a string in the form: ``"DD-Mmm-YYYY HH:MM:SS +HHMM"`` (including + double-quotes). + +Note that IMAP4 message numbers change as the mailbox changes; in particular, +after an ``EXPUNGE`` command performs deletions the remaining messages are +renumbered. So it is highly advisable to use UIDs instead, with the UID command. + +At the end of the module, there is a test section that contains a more extensive +example of usage. + + +.. 
seealso:: + + Documents describing the protocol, and sources and binaries for servers + implementing it, can all be found at the University of Washington's *IMAP + Information Center* (http://www.cac.washington.edu/imap/). + + +.. _imap4-objects: + +IMAP4 Objects +------------- + +All IMAP4rev1 commands are represented by methods of the same name, either +upper-case or lower-case. + +All arguments to commands are converted to strings, except for ``AUTHENTICATE``, +and the last argument to ``APPEND`` which is passed as an IMAP4 literal. If +necessary (the string contains IMAP4 protocol-sensitive characters and isn't +enclosed in either parentheses or double quotes) each string is quoted. +However, the *password* argument to the ``LOGIN`` command is always quoted. If +you want to avoid having an argument string quoted (e.g. the *flags* argument to +``STORE``) then enclose the string in parentheses (e.g. ``r'(\Deleted)'``). + +Each command returns a tuple: ``(type, [data, ...])`` where *type* is usually +``'OK'`` or ``'NO'``, and *data* is either the text from the command response, +or mandated results from the command. Each *data* is either a string, or a +tuple. If a tuple, then the first part is the header of the response, and the +second part contains the data (i.e. the 'literal' value). + +The *message_set* option to the commands below is a string specifying one or more +messages to be acted upon. It may be a simple message number (``'1'``), a range +of message numbers (``'2:4'``), or a group of non-contiguous ranges separated by +commas (``'1:3,6:9'``). A range can contain an asterisk to indicate an infinite +upper bound (``'3:*'``). + +An :class:`IMAP4` instance has the following methods: + + +.. method:: IMAP4.append(mailbox, flags, date_time, message) + + Append *message* to named mailbox. + + +.. method:: IMAP4.authenticate(mechanism, authobject) + + Authenticate command --- requires response processing. 
+ + *mechanism* specifies which authentication mechanism is to be used - it should + appear in the instance variable ``capabilities`` in the form ``AUTH=mechanism``. + + *authobject* must be a callable object:: + + data = authobject(response) + + It will be called to process server continuation responses. It should return + ``data`` that will be encoded and sent to server. It should return ``None`` if + the client abort response ``*`` should be sent instead. + + +.. method:: IMAP4.check() + + Checkpoint mailbox on server. + + +.. method:: IMAP4.close() + + Close currently selected mailbox. Deleted messages are removed from writable + mailbox. This is the recommended command before ``LOGOUT``. + + +.. method:: IMAP4.copy(message_set, new_mailbox) + + Copy *message_set* messages onto end of *new_mailbox*. + + +.. method:: IMAP4.create(mailbox) + + Create new mailbox named *mailbox*. + + +.. method:: IMAP4.delete(mailbox) + + Delete old mailbox named *mailbox*. + + +.. method:: IMAP4.deleteacl(mailbox, who) + + Delete the ACLs (remove any rights) set for who on mailbox. + + .. versionadded:: 2.4 + + +.. method:: IMAP4.expunge() + + Permanently remove deleted items from selected mailbox. Generates an ``EXPUNGE`` + response for each deleted message. Returned data contains a list of ``EXPUNGE`` + message numbers in order received. + + +.. method:: IMAP4.fetch(message_set, message_parts) + + Fetch (parts of) messages. *message_parts* should be a string of message part + names enclosed within parentheses, eg: ``"(UID BODY[TEXT])"``. Returned data + are tuples of message part envelope and data. + + +.. method:: IMAP4.getacl(mailbox) + + Get the ``ACL``\ s for *mailbox*. The method is non-standard, but is supported + by the ``Cyrus`` server. + + +.. method:: IMAP4.getannotation(mailbox, entry, attribute) + + Retrieve the specified ``ANNOTATION``\ s for *mailbox*. The method is + non-standard, but is supported by the ``Cyrus`` server. + + .. versionadded:: 2.5 + + +.. 
method:: IMAP4.getquota(root) + + Get the ``quota`` *root*'s resource usage and limits. This method is part of the + IMAP4 QUOTA extension defined in :rfc:`2087`. + + .. versionadded:: 2.3 + + +.. method:: IMAP4.getquotaroot(mailbox) + + Get the list of ``quota`` ``roots`` for the named *mailbox*. This method is part + of the IMAP4 QUOTA extension defined in :rfc:`2087`. + + .. versionadded:: 2.3 + + +.. method:: IMAP4.list([directory[, pattern]]) + + List mailbox names in *directory* matching *pattern*. *directory* defaults to + the top-level mail folder, and *pattern* defaults to match anything. Returned + data contains a list of ``LIST`` responses. + + +.. method:: IMAP4.login(user, password) + + Identify the client using a plaintext password. The *password* will be quoted. + + +.. method:: IMAP4.login_cram_md5(user, password) + + Force use of ``CRAM-MD5`` authentication when identifying the client to protect + the password. Will only work if the server ``CAPABILITY`` response includes the + phrase ``AUTH=CRAM-MD5``. + + .. versionadded:: 2.3 + + +.. method:: IMAP4.logout() + + Shut down the connection to the server. Returns server ``BYE`` response. + + +.. method:: IMAP4.lsub([directory[, pattern]]) + + List subscribed mailbox names in *directory* matching *pattern*. *directory* + defaults to the top-level directory and *pattern* defaults to match any mailbox. + Returned data are tuples of message part envelope and data. + + +.. method:: IMAP4.myrights(mailbox) + + Show my ACLs for a mailbox (i.e. the rights that I have on *mailbox*). + + .. versionadded:: 2.4 + + +.. method:: IMAP4.namespace() + + Returns IMAP namespaces as defined in :rfc:`2342`. + + .. versionadded:: 2.3 + + +.. method:: IMAP4.noop() + + Send ``NOOP`` to server. + + +.. method:: IMAP4.open(host, port) + + Opens socket to *port* at *host*. The connection objects established by this + method will be used in the ``read``, ``readline``, ``send``, and ``shutdown`` + methods. You may override this method. + + +.. 
method:: IMAP4.partial(message_num, message_part, start, length) + + Fetch truncated part of a message. Returned data is a tuple of message part + envelope and data. + + +.. method:: IMAP4.proxyauth(user) + + Assume authentication as *user*. Allows an authorised administrator to proxy + into any user's mailbox. + + .. versionadded:: 2.3 + + +.. method:: IMAP4.read(size) + + Reads *size* bytes from the remote server. You may override this method. + + +.. method:: IMAP4.readline() + + Reads one line from the remote server. You may override this method. + + +.. method:: IMAP4.recent() + + Prompt server for an update. Returned data is ``None`` if no new messages, else + value of ``RECENT`` response. + + +.. method:: IMAP4.rename(oldmailbox, newmailbox) + + Rename mailbox named *oldmailbox* to *newmailbox*. + + +.. method:: IMAP4.response(code) + + Return data for response *code* if received, or ``None``. Returns the given + code, instead of the usual type. + + +.. method:: IMAP4.search(charset, criterion[, ...]) + + Search mailbox for matching messages. *charset* may be ``None``, in which case + no ``CHARSET`` will be specified in the request to the server. The IMAP + protocol requires that at least one criterion be specified; an exception will be + raised when the server returns an error. + + Example:: + + # M is a connected IMAP4 instance... + typ, msgnums = M.search(None, 'FROM', '"LDJ"') + + # or: + typ, msgnums = M.search(None, '(FROM "LDJ")') + + +.. method:: IMAP4.select([mailbox[, readonly]]) + + Select a mailbox. Returned data is the count of messages in *mailbox* + (``EXISTS`` response). The default *mailbox* is ``'INBOX'``. If the *readonly* + flag is set, modifications to the mailbox are not allowed. + + +.. method:: IMAP4.send(data) + + Sends ``data`` to the remote server. You may override this method. + + +.. method:: IMAP4.setacl(mailbox, who, what) + + Set an ``ACL`` for *mailbox*. The method is non-standard, but is supported by + the ``Cyrus`` server. 
+ + +.. method:: IMAP4.setannotation(mailbox, entry, attribute[, ...]) + + Set ``ANNOTATION``\ s for *mailbox*. The method is non-standard, but is + supported by the ``Cyrus`` server. + + .. versionadded:: 2.5 + + +.. method:: IMAP4.setquota(root, limits) + + Set the ``quota`` *root*'s resource *limits*. This method is part of the IMAP4 + QUOTA extension defined in rfc2087. + + .. versionadded:: 2.3 + + +.. method:: IMAP4.shutdown() + + Close connection established in ``open``. You may override this method. + + +.. method:: IMAP4.socket() + + Returns socket instance used to connect to server. + + +.. method:: IMAP4.sort(sort_criteria, charset, search_criterion[, ...]) + + The ``sort`` command is a variant of ``search`` with sorting semantics for the + results. Returned data contains a space separated list of matching message + numbers. + + Sort has two arguments before the *search_criterion* argument(s); a + parenthesized list of *sort_criteria*, and the searching *charset*. Note that + unlike ``search``, the searching *charset* argument is mandatory. There is also + a ``uid sort`` command which corresponds to ``sort`` the way that ``uid search`` + corresponds to ``search``. The ``sort`` command first searches the mailbox for + messages that match the given searching criteria using the charset argument for + the interpretation of strings in the searching criteria. It then returns the + numbers of matching messages. + + This is an ``IMAP4rev1`` extension command. + + +.. method:: IMAP4.status(mailbox, names) + + Request named status conditions for *mailbox*. + + +.. method:: IMAP4.store(message_set, command, flag_list) + + Alters flag dispositions for messages in mailbox. *command* is specified by + section 6.4.6 of :rfc:`2060` as being one of "FLAGS", "+FLAGS", or "-FLAGS", + optionally with a suffix of ".SILENT". 
+ + For example, to set the delete flag on all messages:: + + typ, data = M.search(None, 'ALL') + for num in data[0].split(): + M.store(num, '+FLAGS', '\\Deleted') + M.expunge() + + +.. method:: IMAP4.subscribe(mailbox) + + Subscribe to new mailbox. + + +.. method:: IMAP4.thread(threading_algorithm, charset, search_criterion[, ...]) + + The ``thread`` command is a variant of ``search`` with threading semantics for + the results. Returned data contains a space-separated list of thread members. + + Thread members consist of zero or more message numbers, delimited by spaces, + indicating successive parent and child. + + Thread has two arguments before the *search_criterion* argument(s); a + *threading_algorithm*, and the searching *charset*. Note that unlike + ``search``, the searching *charset* argument is mandatory. There is also a + ``uid thread`` command which corresponds to ``thread`` the way that ``uid + search`` corresponds to ``search``. The ``thread`` command first searches the + mailbox for messages that match the given searching criteria using the charset + argument for the interpretation of strings in the searching criteria. It then + returns the matching messages threaded according to the specified threading + algorithm. + + This is an ``IMAP4rev1`` extension command. + + .. versionadded:: 2.4 + + +.. method:: IMAP4.uid(command, arg[, ...]) + + Execute command args with messages identified by UID, rather than message + number. Returns response appropriate to command. At least one argument must be + supplied; if none are provided, the server will return an error and an exception + will be raised. + + +.. method:: IMAP4.unsubscribe(mailbox) + + Unsubscribe from old mailbox. + + +.. method:: IMAP4.xatom(name[, arg[, ...]]) + + Allow simple extension commands notified by server in ``CAPABILITY`` response. + +Instances of :class:`IMAP4_SSL` have just one additional method: + + +.. 
method:: IMAP4_SSL.ssl() + + Returns SSLObject instance used for the secure connection with the server. + +The following attributes are defined on instances of :class:`IMAP4`: + + +.. attribute:: IMAP4.PROTOCOL_VERSION + + The most recent supported protocol in the ``CAPABILITY`` response from the + server. + + +.. attribute:: IMAP4.debug + + Integer value to control debugging output. The initial value is taken from + the module variable ``Debug``. Values greater than three trace each command. + + +.. _imap4-example: + +IMAP4 Example +------------- + +Here is a minimal example (without error checking) that opens a mailbox and +retrieves and prints all messages:: + + import getpass, imaplib + + M = imaplib.IMAP4() + M.login(getpass.getuser(), getpass.getpass()) + M.select() + typ, data = M.search(None, 'ALL') + for num in data[0].split(): + typ, data = M.fetch(num, '(RFC822)') + print('Message %s\n%s\n' % (num, data[0][1])) + M.close() + M.logout() + diff --git a/Doc/library/imghdr.rst b/Doc/library/imghdr.rst new file mode 100644 index 0000000..90a8304 --- /dev/null +++ b/Doc/library/imghdr.rst @@ -0,0 +1,71 @@ + +:mod:`imghdr` --- Determine the type of an image +================================================ + +.. module:: imghdr + :synopsis: Determine the type of image contained in a file or byte stream. + + +The :mod:`imghdr` module determines the type of image contained in a file or +byte stream. + +The :mod:`imghdr` module defines the following function: + + +.. function:: what(filename[, h]) + + Tests the image data contained in the file named by *filename*, and returns a + string describing the image type. If optional *h* is provided, the *filename* + is ignored and *h* is assumed to contain the byte stream to test. 
+ +The following image types are recognized, as listed below with the return value +from :func:`what`: + ++------------+-----------------------------------+ +| Value | Image format | ++============+===================================+ +| ``'rgb'`` | SGI ImgLib Files | ++------------+-----------------------------------+ +| ``'gif'`` | GIF 87a and 89a Files | ++------------+-----------------------------------+ +| ``'pbm'`` | Portable Bitmap Files | ++------------+-----------------------------------+ +| ``'pgm'`` | Portable Graymap Files | ++------------+-----------------------------------+ +| ``'ppm'`` | Portable Pixmap Files | ++------------+-----------------------------------+ +| ``'tiff'`` | TIFF Files | ++------------+-----------------------------------+ +| ``'rast'`` | Sun Raster Files | ++------------+-----------------------------------+ +| ``'xbm'`` | X Bitmap Files | ++------------+-----------------------------------+ +| ``'jpeg'`` | JPEG data in JFIF or Exif formats | ++------------+-----------------------------------+ +| ``'bmp'`` | BMP files | ++------------+-----------------------------------+ +| ``'png'`` | Portable Network Graphics | ++------------+-----------------------------------+ + +.. versionadded:: 2.5 + Exif detection. + +You can extend the list of file types :mod:`imghdr` can recognize by appending +to this variable: + + +.. data:: tests + + A list of functions performing the individual tests. Each function takes two + arguments: the byte-stream and an open file-like object. When :func:`what` is + called with a byte-stream, the file-like object will be ``None``. + + The test function should return a string describing the image type if the test + succeeded, or ``None`` if it failed. 
+ +Example:: + + >>> import imghdr + >>> imghdr.what('/tmp/bass.gif') + 'gif' + diff --git a/Doc/library/imp.rst b/Doc/library/imp.rst new file mode 100644 index 0000000..f80bea3 --- /dev/null +++ b/Doc/library/imp.rst @@ -0,0 +1,298 @@ + +:mod:`imp` --- Access the :keyword:`import` internals +===================================================== + +.. module:: imp + :synopsis: Access the implementation of the import statement. + + +.. index:: statement: import + +This module provides an interface to the mechanisms used to implement the +:keyword:`import` statement. It defines the following constants and functions: + + +.. function:: get_magic() + + .. index:: pair: file; byte-code + + Return the magic string value used to recognize byte-compiled code files + (:file:`.pyc` files). (This value may be different for each Python version.) + + +.. function:: get_suffixes() + + Return a list of triples, each describing a particular type of module. Each + triple has the form ``(suffix, mode, type)``, where *suffix* is a string to be + appended to the module name to form the filename to search for, *mode* is the + mode string to pass to the built-in :func:`open` function to open the file (this + can be ``'r'`` for text files or ``'rb'`` for binary files), and *type* is the + file type, which has one of the values :const:`PY_SOURCE`, :const:`PY_COMPILED`, + or :const:`C_EXTENSION`, described below. + + +.. function:: find_module(name[, path]) + + Try to find the module *name* on the search path *path*. If *path* is a list of + directory names, each directory is searched for files with any of the suffixes + returned by :func:`get_suffixes` above. Invalid names in the list are silently + ignored (but all list items must be strings). 
If *path* is omitted or ``None``, + the list of directory names given by ``sys.path`` is searched, but first it + searches a few special places: it tries to find a built-in module with the given + name (:const:`C_BUILTIN`), then a frozen module (:const:`PY_FROZEN`), and on + some systems some other places are looked in as well (on the Mac, it looks for a + resource (:const:`PY_RESOURCE`); on Windows, it looks in the registry which may + point to a specific file). + + If the search is successful, the return value is a triple ``(file, pathname, + description)`` where *file* is an open file object positioned at the beginning, + *pathname* is the pathname of the file found, and *description* is a triple as + contained in the list returned by :func:`get_suffixes` describing the kind of + module found. If the module does not live in a file, the returned *file* is + ``None``, *pathname* is the empty string, and the *description* tuple contains + empty strings for its suffix and mode; the module type is as indicated in + parentheses above. If the search is unsuccessful, :exc:`ImportError` is raised. + Other exceptions indicate problems with the arguments or environment. + + This function does not handle hierarchical module names (names containing dots). + In order to find *P*.*M*, that is, submodule *M* of package *P*, use + :func:`find_module` and :func:`load_module` to find and load package *P*, and + then use :func:`find_module` with the *path* argument set to ``P.__path__``. + When *P* itself has a dotted name, apply this recipe recursively. + + +.. function:: load_module(name, file, filename, description) + + Load a module that was previously found by :func:`find_module` (or by an + otherwise conducted search yielding compatible results). This function does + more than importing the module: if the module was already imported, it will + reload the module! The *name* argument indicates the full module name (including + the package name, if this is a submodule of a package). 
The *file* argument is + an open file, and *filename* is the corresponding file name; these can be + ``None`` and ``''``, respectively, when the module is not being loaded from a + file. The *description* argument is a tuple, as would be returned by + :func:`get_suffixes`, describing what kind of module must be loaded. + + If the load is successful, the return value is the module object; otherwise, an + exception (usually :exc:`ImportError`) is raised. + + **Important:** the caller is responsible for closing the *file* argument, if it + was not ``None``, even when an exception is raised. This is best done using a + :keyword:`try` ... :keyword:`finally` statement. + + +.. function:: new_module(name) + + Return a new empty module object called *name*. This object is *not* inserted + in ``sys.modules``. + + +.. function:: lock_held() + + Return ``True`` if the import lock is currently held, else ``False``. On + platforms without threads, always return ``False``. + + On platforms with threads, a thread executing an import holds an internal lock + until the import is complete. This lock blocks other threads from doing an + import until the original import completes, which in turn prevents other threads + from seeing incomplete module objects constructed by the original thread while + in the process of completing its import (and the imports, if any, triggered by + that). + + +.. function:: acquire_lock() + + Acquires the interpreter's import lock for the current thread. This lock should + be used by import hooks to ensure thread-safety when importing modules. On + platforms without threads, this function does nothing. + + .. versionadded:: 2.3 + + +.. function:: release_lock() + + Release the interpreter's import lock. On platforms without threads, this + function does nothing. + + .. versionadded:: 2.3 + +The following constants with integer values, defined in this module, are used to +indicate the search result of :func:`find_module`. + + +.. 
data:: PY_SOURCE + + The module was found as a source file. + + +.. data:: PY_COMPILED + + The module was found as a compiled code object file. + + +.. data:: C_EXTENSION + + The module was found as a dynamically loadable shared library. + + +.. data:: PY_RESOURCE + + The module was found as a Mac OS 9 resource. This value can only be returned on + a Mac OS 9 or earlier Macintosh. + + +.. data:: PKG_DIRECTORY + + The module was found as a package directory. + + +.. data:: C_BUILTIN + + The module was found as a built-in module. + + +.. data:: PY_FROZEN + + The module was found as a frozen module (see :func:`init_frozen`). + +The following constant and functions are obsolete; their functionality is +available through :func:`find_module` or :func:`load_module`. They are kept +around for backward compatibility: + + +.. data:: SEARCH_ERROR + + Unused. + + +.. function:: init_builtin(name) + + Initialize the built-in module called *name* and return its module object, + also storing it in ``sys.modules``. If the module was already initialized, it + will be initialized *again*. Re-initialization involves the copying of the + built-in module's ``__dict__`` from the cached module over the module's entry in + ``sys.modules``. If there is no built-in module called *name*, ``None`` is + returned. + + +.. function:: init_frozen(name) + + Initialize the frozen module called *name* and return its module object. If + the module was already initialized, it will be initialized *again*. If there + is no frozen module called *name*, ``None`` is returned. (Frozen modules are + modules written in Python whose compiled byte-code object is incorporated + into a custom-built Python interpreter by Python's :program:`freeze` + utility. See :file:`Tools/freeze/` for now.) + + +.. function:: is_builtin(name) + + Return ``1`` if there is a built-in module called *name* which can be + initialized again. 
Return ``-1`` if there is a built-in module called *name* + which cannot be initialized again (see :func:`init_builtin`). Return ``0`` if + there is no built-in module called *name*. + + +.. function:: is_frozen(name) + + Return ``True`` if there is a frozen module (see :func:`init_frozen`) called + *name*, or ``False`` if there is no such module. + + +.. function:: load_compiled(name, pathname[, file]) + + .. index:: pair: file; byte-code + + Load and initialize a module implemented as a byte-compiled code file and return + its module object. If the module was already initialized, it will be + initialized *again*. The *name* argument is used to create or access a module + object. The *pathname* argument points to the byte-compiled code file. The + *file* argument is the byte-compiled code file, open for reading in binary mode, + from the beginning. It must currently be a real file object, not a user-defined + class emulating a file. + + +.. function:: load_dynamic(name, pathname[, file]) + + Load and initialize a module implemented as a dynamically loadable shared + library and return its module object. If the module was already initialized, it + will be initialized *again*. Re-initialization involves copying the ``__dict__`` + attribute of the cached instance of the module over the value used in the module + cached in ``sys.modules``. The *pathname* argument must point to the shared + library. The *name* argument is used to construct the name of the + initialization function: an external C function called ``initname()`` in the + shared library is called. The optional *file* argument is ignored. (Note: + using shared libraries is highly system dependent, and not all systems support + it.) + + +.. function:: load_source(name, pathname[, file]) + + Load and initialize a module implemented as a Python source file and return its + module object. If the module was already initialized, it will be initialized + *again*. 
The *name* argument is used to create or access a module object. The + *pathname* argument points to the source file. The *file* argument is the + source file, open for reading as text, from the beginning. It must currently be + a real file object, not a user-defined class emulating a file. Note that if a + properly matching byte-compiled file (with suffix :file:`.pyc` or :file:`.pyo`) + exists, it will be used instead of parsing the given source file. + + +.. class:: NullImporter(path_string) + + The :class:`NullImporter` type is a :pep:`302` import hook that handles + non-directory path strings by failing to find any modules. Calling this type + with an existing directory or empty string raises :exc:`ImportError`. + Otherwise, a :class:`NullImporter` instance is returned. + + Python adds instances of this type to ``sys.path_importer_cache`` for any path + entries that are not directories and are not handled by any other path hooks on + ``sys.path_hooks``. Instances have only one method: + + + .. method:: NullImporter.find_module(fullname [, path]) + + This method always returns ``None``, indicating that the requested module could + not be found. + + .. versionadded:: 2.5 + + +.. _examples-imp: + +Examples +-------- + +The following function emulates what was the standard import statement up to +Python 1.4 (no hierarchical module names). (This *implementation* wouldn't work +in that version, since :func:`find_module` has been extended and +:func:`load_module` has been added in 1.4.) :: + + import imp + import sys + + def __import__(name, globals=None, locals=None, fromlist=None): + # Fast path: see if the module has already been imported. + try: + return sys.modules[name] + except KeyError: + pass + + # If any of the following calls raises an exception, + # there's a problem we can't handle -- let the caller handle it. 
+ + fp, pathname, description = imp.find_module(name) + + try: + return imp.load_module(name, fp, pathname, description) + finally: + # Since we may exit via an exception, close fp explicitly. + if fp: + fp.close() + +.. index:: module: knee + +A more complete example that implements hierarchical module names and includes a +:func:`reload` function can be found in the module :mod:`knee`. The :mod:`knee` +module can be found in :file:`Demo/imputil/` in the Python source distribution. + diff --git a/Doc/library/index.rst b/Doc/library/index.rst new file mode 100644 index 0000000..1e872ac --- /dev/null +++ b/Doc/library/index.rst @@ -0,0 +1,81 @@ +.. _library-index: + +############################### + The Python Standard Library +############################### + +:Release: |version| +:Date: |today| + +While the :ref:`reference-index` describes the exact syntax and +semantics of the Python language, this library reference manual +describes the standard library that is distributed with Python. It also +describes some of the optional components that are commonly included +in Python distributions. + +Python's standard library is very extensive, offering a wide range of +facilities as indicated by the long table of contents listed below. The +library contains built-in modules (written in C) that provide access to +system functionality such as file I/O that would otherwise be +inaccessible to Python programmers, as well as modules written in Python +that provide standardized solutions for many problems that occur in +everyday programming. Some of these modules are explicitly designed to +encourage and enhance the portability of Python programs by abstracting +away platform-specifics into platform-neutral APIs. + +The Python installers for the Windows and Mac platforms usually include +the entire standard library and often also include many additional +components. 
For Unix-like operating systems Python is normally provided +as a collection of packages, so it may be necessary to use the packaging +tools provided with the operating system to obtain some or all of the +optional components. + +In addition to the standard library, there is a growing collection of +over 2500 additional components available from the `Python Package Index +<http://pypi.python.org/pypi>`_. + + +.. toctree:: + :maxdepth: 2 + + intro.rst + functions.rst + constants.rst + objects.rst + stdtypes.rst + exceptions.rst + + strings.rst + datatypes.rst + numeric.rst + filesys.rst + persistence.rst + archiving.rst + fileformats.rst + crypto.rst + allos.rst + someos.rst + ipc.rst + netdata.rst + markup.rst + internet.rst + mm.rst + i18n.rst + frameworks.rst + tk.rst + development.rst + pdb.rst + profile.rst + hotshot.rst + timeit.rst + trace.rst + python.rst + custominterp.rst + modules.rst + language.rst + misc.rst + windows.rst + unix.rst + mac.rst + macosa.rst + undoc.rst diff --git a/Doc/library/inspect.rst b/Doc/library/inspect.rst new file mode 100644 index 0000000..edec9d5 --- /dev/null +++ b/Doc/library/inspect.rst @@ -0,0 +1,507 @@ + +:mod:`inspect` --- Inspect live objects +======================================= + +.. module:: inspect + :synopsis: Extract information and source code from live objects. +.. moduleauthor:: Ka-Ping Yee <ping@lfw.org> +.. sectionauthor:: Ka-Ping Yee <ping@lfw.org> + + +.. versionadded:: 2.1 + +The :mod:`inspect` module provides several useful functions to help get +information about live objects such as modules, classes, methods, functions, +tracebacks, frame objects, and code objects. For example, it can help you +examine the contents of a class, retrieve the source code of a method, extract +and format the argument list for a function, or get all the information you need +to display a detailed traceback. 
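As a small illustration of the kind of inspection described above, the sketch below lists the functions defined on a class and retrieves a docstring. The ``Shape`` class is hypothetical and exists only for this example. It assumes a Python 3 interpreter, where functions looked up on a class are plain function objects, so ``inspect.isfunction`` is used as the predicate.

```python
import inspect


class Shape:
    """A hypothetical class used only for this illustration."""

    sides = 4

    def area(self):
        """Return the area."""
        return 0

    def describe(self):
        return "a shape"


# getmembers() returns (name, value) pairs sorted by name; the predicate
# keeps only the members that are functions, so 'sides' is filtered out.
methods = [name for name, value in inspect.getmembers(Shape, inspect.isfunction)]

# getdoc() returns the documentation string of an object.
doc = inspect.getdoc(Shape.area)
```

Swapping in a different predicate (for example ``inspect.isclass`` when examining a module) selects a different kind of member in the same way.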
+ +There are four main kinds of services provided by this module: type checking, +getting source code, inspecting classes and functions, and examining the +interpreter stack. + + +.. _inspect-types: + +Types and members +----------------- + +The :func:`getmembers` function retrieves the members of an object such as a +class or module. The eleven functions whose names begin with "is" are mainly +provided as convenient choices for the second argument to :func:`getmembers`. +They also help you determine when you can expect to find the following special +attributes: + ++-----------+-----------------+---------------------------+-------+ +| Type | Attribute | Description | Notes | ++===========+=================+===========================+=======+ +| module | __doc__ | documentation string | | ++-----------+-----------------+---------------------------+-------+ +| | __file__ | filename (missing for | | +| | | built-in modules) | | ++-----------+-----------------+---------------------------+-------+ +| class | __doc__ | documentation string | | ++-----------+-----------------+---------------------------+-------+ +| | __module__ | name of module in which | | +| | | this class was defined | | ++-----------+-----------------+---------------------------+-------+ +| method | __doc__ | documentation string | | ++-----------+-----------------+---------------------------+-------+ +| | __name__ | name with which this | | +| | | method was defined | | ++-----------+-----------------+---------------------------+-------+ +| | im_class | class object that asked | \(1) | +| | | for this method | | ++-----------+-----------------+---------------------------+-------+ +| | im_func | function object | | +| | | containing implementation | | +| | | of method | | ++-----------+-----------------+---------------------------+-------+ +| | im_self | instance to which this | | +| | | method is bound, or | | +| | | ``None`` | | ++-----------+-----------------+---------------------------+-------+ 
+| function | __doc__ | documentation string | | ++-----------+-----------------+---------------------------+-------+ +| | __name__ | name with which this | | +| | | function was defined | | ++-----------+-----------------+---------------------------+-------+ +| | __code__ | code object containing | | +| | | compiled function | | +| | | bytecode | | ++-----------+-----------------+---------------------------+-------+ +| | __defaults__ | tuple of any default | | +| | | values for arguments | | ++-----------+-----------------+---------------------------+-------+ +| | __globals__ | global namespace in which | | +| | | this function was defined | | ++-----------+-----------------+---------------------------+-------+ +| traceback | tb_frame | frame object at this | | +| | | level | | ++-----------+-----------------+---------------------------+-------+ +| | tb_lasti | index of last attempted | | +| | | instruction in bytecode | | ++-----------+-----------------+---------------------------+-------+ +| | tb_lineno | current line number in | | +| | | Python source code | | ++-----------+-----------------+---------------------------+-------+ +| | tb_next | next inner traceback | | +| | | object (called by this | | +| | | level) | | ++-----------+-----------------+---------------------------+-------+ +| frame | f_back | next outer frame object | | +| | | (this frame's caller) | | ++-----------+-----------------+---------------------------+-------+ +| | f_builtins | built-in namespace seen | | +| | | by this frame | | ++-----------+-----------------+---------------------------+-------+ +| | f_code | code object being | | +| | | executed in this frame | | ++-----------+-----------------+---------------------------+-------+ +| | f_exc_traceback | traceback if raised in | | +| | | this frame, or ``None`` | | ++-----------+-----------------+---------------------------+-------+ +| | f_exc_type | exception type if raised | | +| | | in this frame, or | | +| | | ``None`` | | 
++-----------+-----------------+---------------------------+-------+ +| | f_exc_value | exception value if raised | | +| | | in this frame, or | | +| | | ``None`` | | ++-----------+-----------------+---------------------------+-------+ +| | f_globals | global namespace seen by | | +| | | this frame | | ++-----------+-----------------+---------------------------+-------+ +| | f_lasti | index of last attempted | | +| | | instruction in bytecode | | ++-----------+-----------------+---------------------------+-------+ +| | f_lineno | current line number in | | +| | | Python source code | | ++-----------+-----------------+---------------------------+-------+ +| | f_locals | local namespace seen by | | +| | | this frame | | ++-----------+-----------------+---------------------------+-------+ +| | f_restricted | 0 or 1 if frame is in | | +| | | restricted execution mode | | ++-----------+-----------------+---------------------------+-------+ +| | f_trace | tracing function for this | | +| | | frame, or ``None`` | | ++-----------+-----------------+---------------------------+-------+ +| code | co_argcount | number of arguments (not | | +| | | including \* or \*\* | | +| | | args) | | ++-----------+-----------------+---------------------------+-------+ +| | co_code | string of raw compiled | | +| | | bytecode | | ++-----------+-----------------+---------------------------+-------+ +| | co_consts | tuple of constants used | | +| | | in the bytecode | | ++-----------+-----------------+---------------------------+-------+ +| | co_filename | name of file in which | | +| | | this code object was | | +| | | created | | ++-----------+-----------------+---------------------------+-------+ +| | co_firstlineno | number of first line in | | +| | | Python source code | | ++-----------+-----------------+---------------------------+-------+ +| | co_flags | bitmap: 1=optimized ``|`` | | +| | | 2=newlocals ``|`` 4=\*arg | | +| | | ``|`` 8=\*\*arg | | 
++-----------+-----------------+---------------------------+-------+ +| | co_lnotab | encoded mapping of line | | +| | | numbers to bytecode | | +| | | indices | | ++-----------+-----------------+---------------------------+-------+ +| | co_name | name with which this code | | +| | | object was defined | | ++-----------+-----------------+---------------------------+-------+ +| | co_names | tuple of names of local | | +| | | variables | | ++-----------+-----------------+---------------------------+-------+ +| | co_nlocals | number of local variables | | ++-----------+-----------------+---------------------------+-------+ +| | co_stacksize | virtual machine stack | | +| | | space required | | ++-----------+-----------------+---------------------------+-------+ +| | co_varnames | tuple of names of | | +| | | arguments and local | | +| | | variables | | ++-----------+-----------------+---------------------------+-------+ +| builtin | __doc__ | documentation string | | ++-----------+-----------------+---------------------------+-------+ +| | __name__ | original name of this | | +| | | function or method | | ++-----------+-----------------+---------------------------+-------+ +| | __self__ | instance to which a | | +| | | method is bound, or | | +| | | ``None`` | | ++-----------+-----------------+---------------------------+-------+ + +Note: + +(1) + .. versionchanged:: 2.2 + :attr:`im_class` used to refer to the class that defined the method. + + +.. function:: getmembers(object[, predicate]) + + Return all the members of an object in a list of (name, value) pairs sorted by + name. If the optional *predicate* argument is supplied, only members for which + the predicate returns a true value are included. + + +.. function:: getmoduleinfo(path) + + Return a tuple of values that describe how Python will interpret the file + identified by *path* if it is a module, or ``None`` if it would not be + identified as a module. 
The return tuple is ``(name, suffix, mode, mtype)``, + where *name* is the name of the module without the name of any enclosing + package, *suffix* is the trailing part of the file name (which may not be a + dot-delimited extension), *mode* is the :func:`open` mode that would be used + (``'r'`` or ``'rb'``), and *mtype* is an integer giving the type of the + module. *mtype* will have a value which can be compared to the constants + defined in the :mod:`imp` module; see the documentation for that module for + more information on module types. + + +.. function:: getmodulename(path) + + Return the name of the module named by the file *path*, without including the + names of enclosing packages. This uses the same algorithm as the interpreter + uses when searching for modules. If the name cannot be matched according to the + interpreter's rules, ``None`` is returned. + + +.. function:: ismodule(object) + + Return true if the object is a module. + + +.. function:: isclass(object) + + Return true if the object is a class. + + +.. function:: ismethod(object) + + Return true if the object is a method. + + +.. function:: isfunction(object) + + Return true if the object is a Python function or unnamed (lambda) function. + + +.. function:: istraceback(object) + + Return true if the object is a traceback. + + +.. function:: isframe(object) + + Return true if the object is a frame. + + +.. function:: iscode(object) + + Return true if the object is a code object. + + +.. function:: isbuiltin(object) + + Return true if the object is a built-in function. + + +.. function:: isroutine(object) + + Return true if the object is a user-defined or built-in function or method. + + +.. function:: ismethoddescriptor(object) + + Return true if the object is a method descriptor, but not if ismethod() or + isclass() or isfunction() are true. + + This is new as of Python 2.2, and, for example, is true of int.__add__.
An + object passing this test has a __get__ attribute but not a __set__ attribute, + but beyond that the set of attributes varies. __name__ is usually sensible, and + __doc__ often is. + + Methods implemented via descriptors that also pass one of the other tests return + false from the ismethoddescriptor() test, simply because the other tests promise + more -- you can, e.g., count on having the im_func attribute (etc) when an + object passes ismethod(). + + +.. function:: isdatadescriptor(object) + + Return true if the object is a data descriptor. + + Data descriptors have both a __get__ and a __set__ attribute. Examples are + properties (defined in Python), getsets, and members. The latter two are + defined in C and there are more specific tests available for those types, which + is robust across Python implementations. Typically, data descriptors will also + have __name__ and __doc__ attributes (properties, getsets, and members have both + of these attributes), but this is not guaranteed. + + .. versionadded:: 2.3 + + +.. function:: isgetsetdescriptor(object) + + Return true if the object is a getset descriptor. + + getsets are attributes defined in extension modules via ``PyGetSetDef`` + structures. For Python implementations without such types, this method will + always return ``False``. + + .. versionadded:: 2.5 + + +.. function:: ismemberdescriptor(object) + + Return true if the object is a member descriptor. + + Member descriptors are attributes defined in extension modules via + ``PyMemberDef`` structures. For Python implementations without such types, this + method will always return ``False``. + + .. versionadded:: 2.5 + + +.. _inspect-source: + +Retrieving source code +---------------------- + + +.. function:: getdoc(object) + + Get the documentation string for an object. All tabs are expanded to spaces. 
To + clean up docstrings that are indented to line up with blocks of code, any + whitespace that can be uniformly removed from the second line onwards is + removed. + + +.. function:: getcomments(object) + + Return in a single string any lines of comments immediately preceding the + object's source code (for a class, function, or method), or at the top of the + Python source file (if the object is a module). + + +.. function:: getfile(object) + + Return the name of the (text or binary) file in which an object was defined. + This will fail with a :exc:`TypeError` if the object is a built-in module, + class, or function. + + +.. function:: getmodule(object) + + Try to guess which module an object was defined in. + + +.. function:: getsourcefile(object) + + Return the name of the Python source file in which an object was defined. This + will fail with a :exc:`TypeError` if the object is a built-in module, class, or + function. + + +.. function:: getsourcelines(object) + + Return a list of source lines and starting line number for an object. The + argument may be a module, class, method, function, traceback, frame, or code + object. The source code is returned as a list of the lines corresponding to the + object and the line number indicates where in the original source file the first + line of code was found. An :exc:`IOError` is raised if the source code cannot + be retrieved. + + +.. function:: getsource(object) + + Return the text of the source code for an object. The argument may be a module, + class, method, function, traceback, frame, or code object. The source code is + returned as a single string. An :exc:`IOError` is raised if the source code + cannot be retrieved. + + +.. _inspect-classes-functions: + +Classes and functions +--------------------- + + +.. function:: getclasstree(classes[, unique]) + + Arrange the given list of classes into a hierarchy of nested lists.
Where a + nested list appears, it contains classes derived from the class whose entry + immediately precedes the list. Each entry is a 2-tuple containing a class and a + tuple of its base classes. If the *unique* argument is true, exactly one entry + appears in the returned structure for each class in the given list. Otherwise, + classes using multiple inheritance and their descendants will appear multiple + times. + + +.. function:: getargspec(func) + + Get the names and default values of a function's arguments. A tuple of four + things is returned: ``(args, varargs, varkw, defaults)``. *args* is a list of + the argument names (it may contain nested lists). *varargs* and *varkw* are the + names of the ``*`` and ``**`` arguments or ``None``. *defaults* is a tuple of + default argument values or None if there are no default arguments; if this tuple + has *n* elements, they correspond to the last *n* elements listed in *args*. + + +.. function:: getargvalues(frame) + + Get information about arguments passed into a particular frame. A tuple of four + things is returned: ``(args, varargs, varkw, locals)``. *args* is a list of the + argument names (it may contain nested lists). *varargs* and *varkw* are the + names of the ``*`` and ``**`` arguments or ``None``. *locals* is the locals + dictionary of the given frame. + + +.. function:: formatargspec(args[, varargs, varkw, defaults, formatarg, formatvarargs, formatvarkw, formatvalue, join]) + + Format a pretty argument spec from the four values returned by + :func:`getargspec`. The format\* arguments are the corresponding optional + formatting functions that are called to turn names and values into strings. + + +.. function:: formatargvalues(args[, varargs, varkw, locals, formatarg, formatvarargs, formatvarkw, formatvalue, join]) + + Format a pretty argument spec from the four values returned by + :func:`getargvalues`. 
The format\* arguments are the corresponding optional + formatting functions that are called to turn names and values into strings. + + +.. function:: getmro(cls) + + Return a tuple of class cls's base classes, including cls, in method resolution + order. No class appears more than once in this tuple. Note that the method + resolution order depends on cls's type. Unless a very peculiar user-defined + metatype is in use, cls will be the first element of the tuple. + + +.. _inspect-stack: + +The interpreter stack +--------------------- + +When the following functions return "frame records," each record is a tuple of +six items: the frame object, the filename, the line number of the current line, +the function name, a list of lines of context from the source code, and the +index of the current line within that list. + +.. warning:: + + Keeping references to frame objects, as found in the first element of the frame + records these functions return, can cause your program to create reference + cycles. Once a reference cycle has been created, the lifespan of all objects + which can be accessed from the objects which form the cycle can become much + longer even if Python's optional cycle detector is enabled. If such cycles must + be created, it is important to ensure they are explicitly broken to avoid the + delayed destruction of objects and increased memory consumption which occurs. + + Though the cycle detector will catch these, destruction of the frames (and local + variables) can be made deterministic by removing the cycle in a + :keyword:`finally` clause. This is also important if the cycle detector was + disabled when Python was compiled or using :func:`gc.disable`. 
For example:: + + def handle_stackframe_without_leak(): + frame = inspect.currentframe() + try: + # do something with the frame + pass + finally: + del frame + +The optional *context* argument supported by most of these functions specifies +the number of lines of context to return, which are centered around the current +line. + + +.. function:: getframeinfo(frame[, context]) + + Get information about a frame or traceback object. A 5-tuple is returned, the + last five elements of the frame's frame record. + + +.. function:: getouterframes(frame[, context]) + + Get a list of frame records for a frame and all outer frames. These frames + represent the calls that led to the creation of *frame*. The first entry in the + returned list represents *frame*; the last entry represents the outermost call + on *frame*'s stack. + + +.. function:: getinnerframes(traceback[, context]) + + Get a list of frame records for a traceback's frame and all inner frames. These + frames represent calls made as a consequence of *frame*. The first entry in the + list represents *traceback*; the last entry represents where the exception was + raised. + + +.. function:: currentframe() + + Return the frame object for the caller's stack frame. + + +.. function:: stack([context]) + + Return a list of frame records for the caller's stack. The first entry in the + returned list represents the caller; the last entry represents the outermost + call on the stack. + + +.. function:: trace([context]) + + Return a list of frame records for the stack between the current frame and the + frame in which an exception currently being handled was raised. The first + entry in the list represents the caller; the last entry represents where the + exception was raised. + diff --git a/Doc/library/internet.rst b/Doc/library/internet.rst new file mode 100644 index 0000000..16b0a44 --- /dev/null +++ b/Doc/library/internet.rst @@ -0,0 +1,47 @@ + +.. 
_internet: + +****************************** +Internet Protocols and Support +****************************** + +.. index:: + single: WWW + single: Internet + single: World Wide Web + +.. index:: module: socket + +The modules described in this chapter implement Internet protocols and support +for related technology. They are all implemented in Python. Most of these +modules require the presence of the system-dependent module :mod:`socket`, which +is currently supported on most popular platforms. Here is an overview: + + +.. toctree:: + + webbrowser.rst + cgi.rst + cgitb.rst + wsgiref.rst + urllib.rst + urllib2.rst + httplib.rst + ftplib.rst + poplib.rst + imaplib.rst + nntplib.rst + smtplib.rst + smtpd.rst + telnetlib.rst + uuid.rst + urlparse.rst + socketserver.rst + basehttpserver.rst + simplehttpserver.rst + cgihttpserver.rst + cookielib.rst + cookie.rst + xmlrpclib.rst + simplexmlrpcserver.rst + docxmlrpcserver.rst diff --git a/Doc/library/intro.rst b/Doc/library/intro.rst new file mode 100644 index 0000000..33bdefd --- /dev/null +++ b/Doc/library/intro.rst @@ -0,0 +1,51 @@ + +.. _library-intro: + +************ +Introduction +************ + +The "Python library" contains several different kinds of components. + +It contains data types that would normally be considered part of the "core" of a +language, such as numbers and lists. For these types, the Python language core +defines the form of literals and places some constraints on their semantics, but +does not fully define the semantics. (On the other hand, the language core does +define syntactic properties like the spelling and priorities of operators.) + +The library also contains built-in functions and exceptions --- objects that can +be used by all Python code without the need of an :keyword:`import` statement. +Some of these are defined by the core language, but many are not essential for +the core semantics and are only described here. 
+ +The bulk of the library, however, consists of a collection of modules. There are +many ways to dissect this collection. Some modules are written in C and built +in to the Python interpreter; others are written in Python and imported in +source form. Some modules provide interfaces that are highly specific to +Python, like printing a stack trace; some provide interfaces that are specific +to particular operating systems, such as access to specific hardware; others +provide interfaces that are specific to a particular application domain, like +the World Wide Web. Some modules are available in all versions and ports of +Python; others are only available when the underlying system supports or +requires them; yet others are available only when a particular configuration +option was chosen at the time when Python was compiled and installed. + +This manual is organized "from the inside out:" it first describes the built-in +data types, then the built-in functions and exceptions, and finally the modules, +grouped in chapters of related modules. The ordering of the chapters as well as +the ordering of the modules within each chapter is roughly from most relevant to +least important. + +This means that if you start reading this manual from the start, and skip to the +next chapter when you get bored, you will get a reasonable overview of the +available modules and application areas that are supported by the Python +library. Of course, you don't *have* to read it like a novel --- you can also +browse the table of contents (in front of the manual), or look for a specific +function, module or term in the index (in the back). And finally, if you enjoy +learning about random subjects, you choose a random page number (see module +:mod:`random`) and read a section or two. Regardless of the order in which you +read the sections of this manual, it helps to start with chapter :ref:`builtin`, +as the remainder of the manual assumes familiarity with this material. 
+ +Let the show begin! + diff --git a/Doc/library/ipc.rst b/Doc/library/ipc.rst new file mode 100644 index 0000000..fd425ed --- /dev/null +++ b/Doc/library/ipc.rst @@ -0,0 +1,24 @@ + +.. _ipc: + +***************************************** +Interprocess Communication and Networking +***************************************** + +The modules described in this chapter provide mechanisms for different processes +to communicate. + +Some modules only work for two processes that are on the same machine, e.g. +:mod:`signal` and :mod:`subprocess`. Other modules support networking protocols +that two or more processes can use to communicate across machines. + +The list of modules described in this chapter is: + + +.. toctree:: + + subprocess.rst + socket.rst + signal.rst + asyncore.rst + asynchat.rst diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst new file mode 100644 index 0000000..9f9cb24 --- /dev/null +++ b/Doc/library/itertools.rst @@ -0,0 +1,547 @@ + +:mod:`itertools` --- Functions creating iterators for efficient looping +======================================================================= + +.. module:: itertools + :synopsis: Functions creating iterators for efficient looping. +.. moduleauthor:: Raymond Hettinger <python@rcn.com> +.. sectionauthor:: Raymond Hettinger <python@rcn.com> + + +.. versionadded:: 2.3 + +This module implements a number of iterator building blocks inspired by +constructs from the Haskell and SML programming languages. Each has been recast +in a form suitable for Python. + +The module standardizes a core set of fast, memory efficient tools that are +useful by themselves or in combination. Standardization helps avoid the +readability and reliability problems which arise when many different individuals +create their own slightly varying implementations, each with their own quirks +and naming conventions. + +The tools are designed to combine readily with one another.
This makes it easy +to construct more specialized tools succinctly and efficiently in pure Python. + +For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a +sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and +:func:`count` which can be combined to form ``imap(f, count())`` and produce an +equivalent result. + +Likewise, the functional tools are designed to work well with the high-speed +functions provided by the :mod:`operator` module. + +The module author welcomes suggestions for other basic building blocks to be +added to future versions of the module. + +Whether cast in pure python form or compiled code, tools that use iterators are +more memory efficient (and faster) than their list based counterparts. Adopting +the principles of just-in-time manufacturing, they create data when and where +needed instead of consuming memory with the computer equivalent of "inventory". + +The performance advantage of iterators becomes more acute as the number of +elements increases -- at some point, lists grow large enough to severely impact +memory cache performance and start running slowly. + + +.. seealso:: + + The Standard ML Basis Library, `The Standard ML Basis Library + <http://www.standardml.org/Basis/>`_. + + Haskell, A Purely Functional Language, `Definition of Haskell and the Standard + Libraries <http://www.haskell.org/definition/>`_. + + +.. _itertools-functions: + +Itertool functions +------------------ + +The following module functions all construct and return iterators. Some provide +streams of infinite length, so they should only be accessed by functions or +loops that truncate the stream. + + +.. function:: chain(*iterables) + + Make an iterator that returns elements from the first iterable until it is + exhausted, then proceeds to the next iterable, until all of the iterables are + exhausted. Used for treating consecutive sequences as a single sequence. 
+ Equivalent to:: + + def chain(*iterables): + for it in iterables: + for element in it: + yield element + + +.. function:: count([n]) + + Make an iterator that returns consecutive integers starting with *n*. If not + specified *n* defaults to zero. Does not currently support python long + integers. Often used as an argument to :func:`imap` to generate consecutive + data points. Also, used with :func:`izip` to add sequence numbers. Equivalent + to:: + + def count(n=0): + while True: + yield n + n += 1 + + Note, :func:`count` does not check for overflow and will return negative numbers + after exceeding ``sys.maxint``. This behavior may change in the future. + + +.. function:: cycle(iterable) + + Make an iterator returning elements from the iterable and saving a copy of each. + When the iterable is exhausted, return elements from the saved copy. Repeats + indefinitely. Equivalent to:: + + def cycle(iterable): + saved = [] + for element in iterable: + yield element + saved.append(element) + while saved: + for element in saved: + yield element + + Note, this member of the toolkit may require significant auxiliary storage + (depending on the length of the iterable). + + +.. function:: dropwhile(predicate, iterable) + + Make an iterator that drops elements from the iterable as long as the predicate + is true; afterwards, returns every element. Note, the iterator does not produce + *any* output until the predicate first becomes false, so it may have a lengthy + start-up time. Equivalent to:: + + def dropwhile(predicate, iterable): + iterable = iter(iterable) + for x in iterable: + if not predicate(x): + yield x + break + for x in iterable: + yield x + + +.. function:: groupby(iterable[, key]) + + Make an iterator that returns consecutive keys and groups from the *iterable*. + The *key* is a function computing a key value for each element. If not + specified or is ``None``, *key* defaults to an identity function and returns + the element unchanged. 
Generally, the iterable needs to already be sorted on + the same key function. + + The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It + generates a break or new group every time the value of the key function changes + (which is why it is usually necessary to have sorted the data using the same key + function). That behavior differs from SQL's GROUP BY which aggregates common + elements regardless of their input order. + + The returned group is itself an iterator that shares the underlying iterable + with :func:`groupby`. Because the source is shared, when the :func:`groupby` + object is advanced, the previous group is no longer visible. So, if that data + is needed later, it should be stored as a list:: + + groups = [] + uniquekeys = [] + data = sorted(data, key=keyfunc) + for k, g in groupby(data, keyfunc): + groups.append(list(g)) # Store group iterator as a list + uniquekeys.append(k) + + :func:`groupby` is equivalent to:: + + class groupby(object): + def __init__(self, iterable, key=None): + if key is None: + key = lambda x: x + self.keyfunc = key + self.it = iter(iterable) + self.tgtkey = self.currkey = self.currvalue = [] + def __iter__(self): + return self + def __next__(self): + while self.currkey == self.tgtkey: + self.currvalue = next(self.it) # Exit on StopIteration + self.currkey = self.keyfunc(self.currvalue) + self.tgtkey = self.currkey + return (self.currkey, self._grouper(self.tgtkey)) + def _grouper(self, tgtkey): + while self.currkey == tgtkey: + yield self.currvalue + self.currvalue = next(self.it) # Exit on StopIteration + self.currkey = self.keyfunc(self.currvalue) + + .. versionadded:: 2.4 + + +.. function:: ifilter(predicate, iterable) + + Make an iterator that filters elements from iterable returning only those for + which the predicate is ``True``. If *predicate* is ``None``, return the items + that are true. 
Equivalent to:: + + def ifilter(predicate, iterable): + if predicate is None: + predicate = bool + for x in iterable: + if predicate(x): + yield x + + +.. function:: ifilterfalse(predicate, iterable) + + Make an iterator that filters elements from iterable returning only those for + which the predicate is ``False``. If *predicate* is ``None``, return the items + that are false. Equivalent to:: + + def ifilterfalse(predicate, iterable): + if predicate is None: + predicate = bool + for x in iterable: + if not predicate(x): + yield x + + +.. function:: imap(function, *iterables) + + Make an iterator that computes the function using arguments from each of the + iterables. If *function* is set to ``None``, then :func:`imap` returns the + arguments as a tuple. Like :func:`map` but stops when the shortest iterable is + exhausted instead of filling in ``None`` for shorter iterables. The reason for + the difference is that infinite iterator arguments are typically an error for + :func:`map` (because the output is fully evaluated) but represent a common and + useful way of supplying arguments to :func:`imap`. Equivalent to:: + + def imap(function, *iterables): + iterables = map(iter, iterables) + while True: + args = [next(i) for i in iterables] + if function is None: + yield tuple(args) + else: + yield function(*args) + + +.. function:: islice(iterable, [start,] stop [, step]) + + Make an iterator that returns selected elements from the iterable. If *start* is + non-zero, then elements from the iterable are skipped until start is reached. + Afterward, elements are returned consecutively unless *step* is set higher than + one which results in items being skipped. If *stop* is ``None``, then iteration + continues until the iterator is exhausted, if at all; otherwise, it stops at the + specified position. Unlike regular slicing, :func:`islice` does not support + negative values for *start*, *stop*, or *step*. 
Can be used to extract related + fields from data where the internal structure has been flattened (for example, a + multi-line report may list a name field on every third line). Equivalent to:: + + def islice(iterable, *args): + s = slice(*args) + it = iter(range(s.start or 0, s.stop or sys.maxint, s.step or 1)) + nexti = next(it) + for i, element in enumerate(iterable): + if i == nexti: + yield element + nexti = next(it) + + If *start* is ``None``, then iteration starts at zero. If *step* is ``None``, + then the step defaults to one. + + .. versionchanged:: 2.5 + accept ``None`` values for default *start* and *step*. + + +.. function:: izip(*iterables) + + Make an iterator that aggregates elements from each of the iterables. Like + :func:`zip` except that it returns an iterator instead of a list. Used for + lock-step iteration over several iterables at a time. Equivalent to:: + + def izip(*iterables): + iterables = map(iter, iterables) + while iterables: + result = [next(it) for it in iterables] + yield tuple(result) + + .. versionchanged:: 2.4 + When no iterables are specified, returns a zero length iterator instead of + raising a :exc:`TypeError` exception. + + Note, the left-to-right evaluation order of the iterables is guaranteed. This + makes possible an idiom for clustering a data series into n-length groups using + ``izip(*[iter(s)]*n)``. For data that doesn't fit n-length groups exactly, the + last tuple can be pre-padded with fill values using ``izip(*[chain(s, + [None]*(n-1))]*n)``. + + Note, when :func:`izip` is used with unequal length inputs, subsequent + iteration over the longer iterables cannot reliably be continued after + :func:`izip` terminates. Potentially, up to one entry will be missing from + each of the left-over iterables. This occurs because a value is fetched from + each iterator in turn, but the process ends when one of the iterators + terminates.
This leaves the last fetched values in limbo (they cannot be + returned in a final, incomplete tuple and they cannot be pushed back into + the iterator for retrieval with ``next(it)``). In general, :func:`izip` + should only be used with unequal length inputs when you don't care about + trailing, unmatched values from the longer iterables. + + +.. function:: izip_longest(*iterables[, fillvalue]) + + Make an iterator that aggregates elements from each of the iterables. If the + iterables are of uneven length, missing values are filled in with *fillvalue*. + Iteration continues until the longest iterable is exhausted. Equivalent to:: + + def izip_longest(*args, **kwds): + fillvalue = kwds.get('fillvalue') + def sentinel(counter = ([fillvalue]*(len(args)-1)).pop): + yield counter() # yields the fillvalue, or raises IndexError + fillers = repeat(fillvalue) + iters = [chain(it, sentinel(), fillers) for it in args] + try: + for tup in izip(*iters): + yield tup + except IndexError: + pass + + If one of the iterables is potentially infinite, then the :func:`izip_longest` + function should be wrapped with something that limits the number of calls (for + example :func:`islice` or :func:`takewhile`). + + .. versionadded:: 2.6 + + +.. function:: repeat(object[, times]) + + Make an iterator that returns *object* over and over again. Runs indefinitely + unless the *times* argument is specified. Used as argument to :func:`imap` for + invariant parameters to the called function. Also used with :func:`izip` to + create an invariant part of a tuple record. Equivalent to:: + + def repeat(object, times=None): + if times is None: + while True: + yield object + else: + for i in range(times): + yield object + + +.. function:: starmap(function, iterable) + + Make an iterator that computes the function using argument tuples obtained from + the iterable.
Used instead of :func:`imap` when argument parameters are already + grouped in tuples from a single iterable (the data has been "pre-zipped"). The + difference between :func:`imap` and :func:`starmap` parallels the distinction + between ``function(a,b)`` and ``function(*c)``. Equivalent to:: + + def starmap(function, iterable): + iterable = iter(iterable) + while True: + yield function(*next(iterable)) + + +.. function:: takewhile(predicate, iterable) + + Make an iterator that returns elements from the iterable as long as the + predicate is true. Equivalent to:: + + def takewhile(predicate, iterable): + for x in iterable: + if predicate(x): + yield x + else: + break + + +.. function:: tee(iterable[, n=2]) + + Return *n* independent iterators from a single iterable. The case where ``n==2`` + is equivalent to:: + + def tee(iterable): + def gen(next, data={}, cnt=[0]): + for i in count(): + if i == cnt[0]: + item = data[i] = next() + cnt[0] += 1 + else: + item = data.pop(i) + yield item + it = iter(iterable) + return (gen(it.__next__), gen(it.__next__)) + + Note, once :func:`tee` has made a split, the original *iterable* should not be + used anywhere else; otherwise, the *iterable* could get advanced without the tee + objects being informed. + + Note, this member of the toolkit may require significant auxiliary storage + (depending on how much temporary data needs to be stored). In general, if one + iterator is going to use most or all of the data before the other iterator, it + is faster to use :func:`list` instead of :func:`tee`. + + .. versionadded:: 2.4 + + +.. _itertools-example: + +Examples +-------- + +The following examples show common uses for each tool and demonstrate ways they +can be combined. :: + + >>> amounts = [120.15, 764.05, 823.14] + >>> for checknum, amount in izip(count(1200), amounts): + ... print 'Check %d is for $%.2f' % (checknum, amount) + ... 
+ Check 1200 is for $120.15 + Check 1201 is for $764.05 + Check 1202 is for $823.14 + + >>> import operator + >>> for cube in imap(operator.pow, range(1,5), repeat(3)): + ... print cube + ... + 1 + 8 + 27 + 64 + + >>> reportlines = ['EuroPython', 'Roster', '', 'alex', '', 'laura', + ... '', 'martin', '', 'walter', '', 'mark'] + >>> for name in islice(reportlines, 3, None, 2): + ... print name.title() + ... + Alex + Laura + Martin + Walter + Mark + + # Show a dictionary sorted and grouped by value + >>> from operator import itemgetter + >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3) + >>> di = sorted(d.iteritems(), key=itemgetter(1)) + >>> for k, g in groupby(di, key=itemgetter(1)): + ... print k, map(itemgetter(0), g) + ... + 1 ['a', 'c', 'e'] + 2 ['b', 'd', 'f'] + 3 ['g'] + + # Find runs of consecutive numbers using groupby. The key to the solution + # is differencing with a range so that consecutive numbers all appear in + # same group. + >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28] + >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]): + ... print map(operator.itemgetter(1), g) + ... + [1] + [4, 5, 6] + [10] + [15, 16, 17, 18] + [22] + [25, 26, 27, 28] + + + +.. _itertools-recipes: + +Recipes +------- + +This section shows recipes for creating an extended toolset using the existing +itertools as building blocks. + +The extended tools offer the same high performance as the underlying toolset. +The superior memory performance is kept by processing elements one at a time +rather than bringing the whole iterable into memory all at once. Code volume is +kept small by linking the tools together in a functional style which helps +eliminate temporary variables. High speed is retained by preferring +"vectorized" building blocks over the use of for-loops and generators which +incur interpreter overhead. 
:: + + def take(n, seq): + return list(islice(seq, n)) + + def enumerate(iterable): + return izip(count(), iterable) + + def tabulate(function): + "Return function(0), function(1), ..." + return imap(function, count()) + + def iteritems(mapping): + return izip(mapping.iterkeys(), mapping.itervalues()) + + def nth(iterable, n): + "Returns the nth item or raise StopIteration" + return islice(iterable, n, None).next() + + def all(seq, pred=None): + "Returns True if pred(x) is true for every element in the iterable" + for elem in ifilterfalse(pred, seq): + return False + return True + + def any(seq, pred=None): + "Returns True if pred(x) is true for at least one element in the iterable" + for elem in ifilter(pred, seq): + return True + return False + + def no(seq, pred=None): + "Returns True if pred(x) is false for every element in the iterable" + for elem in ifilter(pred, seq): + return False + return True + + def quantify(seq, pred=None): + "Count how many times the predicate is true in the sequence" + return sum(imap(pred, seq)) + + def padnone(seq): + """Returns the sequence elements and then returns None indefinitely. + + Useful for emulating the behavior of the built-in map() function. + """ + return chain(seq, repeat(None)) + + def ncycles(seq, n): + "Returns the sequence elements n times" + return chain(*repeat(seq, n)) + + def dotproduct(vec1, vec2): + return sum(imap(operator.mul, vec1, vec2)) + + def flatten(listOfLists): + return list(chain(*listOfLists)) + + def repeatfunc(func, times=None, *args): + """Repeat calls to func with specified arguments. + + Example: repeatfunc(random.random) + """ + if times is None: + return starmap(func, repeat(args)) + else: + return starmap(func, repeat(args, times)) + + def pairwise(iterable): + "s -> (s0,s1), (s1,s2), (s2, s3), ..." 
+ a, b = tee(iterable) + next(b, None) + return izip(a, b) + + def grouper(n, iterable, padvalue=None): + "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')" + return izip(*[chain(iterable, repeat(padvalue, n-1))]*n) + + + diff --git a/Doc/library/keyword.rst b/Doc/library/keyword.rst new file mode 100644 index 0000000..32a2d34 --- /dev/null +++ b/Doc/library/keyword.rst @@ -0,0 +1,22 @@ + +:mod:`keyword` --- Testing for Python keywords +============================================== + +.. module:: keyword + :synopsis: Test whether a string is a keyword in Python. + + +This module allows a Python program to determine if a string is a keyword. + + +.. function:: iskeyword(s) + + Return true if *s* is a Python keyword. + + +.. data:: kwlist + + Sequence containing all the keywords defined for the interpreter. If any + keywords are defined to only be active when particular :mod:`__future__` + statements are in effect, these will be included as well. + diff --git a/Doc/library/language.rst b/Doc/library/language.rst new file mode 100644 index 0000000..7d6af7d --- /dev/null +++ b/Doc/library/language.rst @@ -0,0 +1,29 @@ + +.. _language: + +************************ +Python Language Services +************************ + +Python provides a number of modules to assist in working with the Python +language. These modules support tokenizing, parsing, syntax analysis, bytecode +disassembly, and various other facilities. + +These modules include: + + +.. toctree:: + + parser.rst + _ast.rst + symbol.rst + token.rst + keyword.rst + tokenize.rst + tabnanny.rst + pyclbr.rst + py_compile.rst + compileall.rst + dis.rst + pickletools.rst + distutils.rst diff --git a/Doc/library/linecache.rst b/Doc/library/linecache.rst new file mode 100644 index 0000000..f3d8379 --- /dev/null +++ b/Doc/library/linecache.rst @@ -0,0 +1,52 @@ + +:mod:`linecache` --- Random access to text lines +================================================ + +.. 
module:: linecache + :synopsis: This module provides random access to individual lines from text files. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`linecache` module allows one to get any line from any file, while +attempting to optimize internally, using a cache, the common case where many +lines are read from a single file. This is used by the :mod:`traceback` module +to retrieve source lines for inclusion in the formatted traceback. + +The :mod:`linecache` module defines the following functions: + + +.. function:: getline(filename, lineno[, module_globals]) + + Get line *lineno* from file named *filename*. This function will never throw an + exception --- it will return ``''`` on errors (the terminating newline character + will be included for lines that are found). + + .. index:: triple: module; search; path + + If a file named *filename* is not found, the function will look for it in the + module search path, ``sys.path``, after first checking for a :pep:`302` + ``__loader__`` in *module_globals*, in case the module was imported from a + zipfile or other non-filesystem import source. + + .. versionadded:: 2.5 + The *module_globals* parameter was added. + + +.. function:: clearcache() + + Clear the cache. Use this function if you no longer need lines from files + previously read using :func:`getline`. + + +.. function:: checkcache([filename]) + + Check the cache for validity. Use this function if files in the cache may have + changed on disk, and you require the updated version. If *filename* is omitted, + it will check all the entries in the cache. + +Example:: + + >>> import linecache + >>> linecache.getline('/etc/passwd', 4) + 'sys:x:3:3:sys:/dev:/bin/sh\n' + diff --git a/Doc/library/locale.rst b/Doc/library/locale.rst new file mode 100644 index 0000000..6d427b7 --- /dev/null +++ b/Doc/library/locale.rst @@ -0,0 +1,578 @@ + +:mod:`locale` --- Internationalization services +=============================================== + +.. 
module:: locale + :synopsis: Internationalization services. +.. moduleauthor:: Martin von Löwis <martin@v.loewis.de> +.. sectionauthor:: Martin von Löwis <martin@v.loewis.de> + + +The :mod:`locale` module opens access to the POSIX locale database and +functionality. The POSIX locale mechanism allows programmers to deal with +certain cultural issues in an application, without requiring the programmer to +know all the specifics of each country where the software is executed. + +.. index:: module: _locale + +The :mod:`locale` module is implemented on top of the :mod:`_locale` module, +which in turn uses an ANSI C locale implementation if available. + +The :mod:`locale` module defines the following exception and functions: + + +.. exception:: Error + + Exception raised when :func:`setlocale` fails. + + +.. function:: setlocale(category[, locale]) + + If *locale* is specified, it may be a string, a tuple of the form ``(language + code, encoding)``, or ``None``. If it is a tuple, it is converted to a string + using the locale aliasing engine. If *locale* is given and not ``None``, + :func:`setlocale` modifies the locale setting for the *category*. The available + categories are listed in the data description below. The value is the name of a + locale. An empty string specifies the user's default settings. If the + modification of the locale fails, the exception :exc:`Error` is raised. If + successful, the new locale setting is returned. + + If *locale* is omitted or ``None``, the current setting for *category* is + returned. + + :func:`setlocale` is not thread safe on most systems. Applications typically + start with a call of :: + + import locale + locale.setlocale(locale.LC_ALL, '') + + This sets the locale for all categories to the user's default setting (typically + specified in the :envvar:`LANG` environment variable). If the locale is not + changed thereafter, using multithreading should not cause problems. + + .. 
versionchanged:: 2.0 + Added support for tuple values of the *locale* parameter. + + +.. function:: localeconv() + + Returns the database of the local conventions as a dictionary. This dictionary + has the following strings as keys: + + +----------------------+-------------------------------------+--------------------------------+ + | Category | Key | Meaning | + +======================+=====================================+================================+ + | :const:`LC_NUMERIC` | ``'decimal_point'`` | Decimal point character. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'grouping'`` | Sequence of numbers specifying | + | | | which relative positions the | + | | | ``'thousands_sep'`` is | + | | | expected. If the sequence is | + | | | terminated with | + | | | :const:`CHAR_MAX`, no further | + | | | grouping is performed. If the | + | | | sequence terminates with a | + | | | ``0``, the last group size is | + | | | repeatedly used. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'thousands_sep'`` | Character used between groups. | + +----------------------+-------------------------------------+--------------------------------+ + | :const:`LC_MONETARY` | ``'int_curr_symbol'`` | International currency symbol. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'currency_symbol'`` | Local currency symbol. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'p_cs_precedes/n_cs_precedes'`` | Whether the currency symbol | + | | | precedes the value (for | + | | | positive resp. negative | + | | | values). 
| + +----------------------+-------------------------------------+--------------------------------+ + | | ``'p_sep_by_space/n_sep_by_space'`` | Whether the currency symbol is | + | | | separated from the value by a | + | | | space (for positive resp. | + | | | negative values). | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'mon_decimal_point'`` | Decimal point used for | + | | | monetary values. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'frac_digits'`` | Number of fractional digits | + | | | used in local formatting of | + | | | monetary values. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'int_frac_digits'`` | Number of fractional digits | + | | | used in international | + | | | formatting of monetary values. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'mon_thousands_sep'`` | Group separator used for | + | | | monetary values. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'mon_grouping'`` | Equivalent to ``'grouping'``, | + | | | used for monetary values. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'positive_sign'`` | Symbol used to annotate a | + | | | positive monetary value. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'negative_sign'`` | Symbol used to annotate a | + | | | negative monetary value. | + +----------------------+-------------------------------------+--------------------------------+ + | | ``'p_sign_posn/n_sign_posn'`` | The position of the sign (for | + | | | positive resp. negative | + | | | values), see below. 
| + +----------------------+-------------------------------------+--------------------------------+ + + All numeric values can be set to :const:`CHAR_MAX` to indicate that there is no + value specified in this locale. + + The possible values for ``'p_sign_posn'`` and ``'n_sign_posn'`` are given below. + + +--------------+-----------------------------------------+ + | Value | Explanation | + +==============+=========================================+ + | ``0`` | Currency and value are surrounded by | + | | parentheses. | + +--------------+-----------------------------------------+ + | ``1`` | The sign should precede the value and | + | | currency symbol. | + +--------------+-----------------------------------------+ + | ``2`` | The sign should follow the value and | + | | currency symbol. | + +--------------+-----------------------------------------+ + | ``3`` | The sign should immediately precede the | + | | value. | + +--------------+-----------------------------------------+ + | ``4`` | The sign should immediately follow the | + | | value. | + +--------------+-----------------------------------------+ + | ``CHAR_MAX`` | Nothing is specified in this locale. | + +--------------+-----------------------------------------+ + + +.. function:: nl_langinfo(option) + + Return some locale-specific information as a string. This function is not + available on all systems, and the set of possible options might also vary across + platforms. The possible argument values are numbers, for which symbolic + constants are available in the locale module. + + +.. function:: getdefaultlocale([envvars]) + + Tries to determine the default locale settings and returns them as a tuple of + the form ``(language code, encoding)``. + + According to POSIX, a program which has not called ``setlocale(LC_ALL, '')`` + runs using the portable ``'C'`` locale. Calling ``setlocale(LC_ALL, '')`` lets + it use the default locale as defined by the :envvar:`LANG` variable. 
Since we + do not want to interfere with the current locale setting we thus emulate the + behavior in the way described above. + + To maintain compatibility with other platforms, not only the :envvar:`LANG` + variable is tested, but a list of variables given as envvars parameter. The + first found to be defined will be used. *envvars* defaults to the search path + used in GNU gettext; it must always contain the variable name ``LANG``. The GNU + gettext search path contains ``'LANGUAGE'``, ``'LC_ALL'``, ``'LC_CTYPE'``, and + ``'LANG'``, in that order. + + Except for the code ``'C'``, the language code corresponds to :rfc:`1766`. + *language code* and *encoding* may be ``None`` if their values cannot be + determined. + + .. versionadded:: 2.0 + + +.. function:: getlocale([category]) + + Returns the current setting for the given locale category as sequence containing + *language code*, *encoding*. *category* may be one of the :const:`LC_\*` values + except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`. + + Except for the code ``'C'``, the language code corresponds to :rfc:`1766`. + *language code* and *encoding* may be ``None`` if their values cannot be + determined. + + .. versionadded:: 2.0 + + +.. function:: getpreferredencoding([do_setlocale]) + + Return the encoding used for text data, according to user preferences. User + preferences are expressed differently on different systems, and might not be + available programmatically on some systems, so this function only returns a + guess. + + On some systems, it is necessary to invoke :func:`setlocale` to obtain the user + preferences, so this function is not thread-safe. If invoking setlocale is not + necessary or desired, *do_setlocale* should be set to ``False``. + + .. versionadded:: 2.3 + + +.. function:: normalize(localename) + + Returns a normalized locale code for the given locale name. The returned locale + code is formatted for use with :func:`setlocale`. 
If normalization fails, the + original name is returned unchanged. + + If the given encoding is not known, the function defaults to the default + encoding for the locale code just like :func:`setlocale`. + + .. versionadded:: 2.0 + + +.. function:: resetlocale([category]) + + Sets the locale for *category* to the default setting. + + The default setting is determined by calling :func:`getdefaultlocale`. + *category* defaults to :const:`LC_ALL`. + + .. versionadded:: 2.0 + + +.. function:: strcoll(string1, string2) + + Compares two strings according to the current :const:`LC_COLLATE` setting. As + any other compare function, returns a negative, or a positive value, or ``0``, + depending on whether *string1* collates before or after *string2* or is equal to + it. + + +.. function:: strxfrm(string) + + .. index:: builtin: cmp + + Transforms a string to one that can be used for the built-in function + :func:`cmp`, and still returns locale-aware results. This function can be used + when the same string is compared repeatedly, e.g. when collating a sequence of + strings. + + +.. function:: format(format, val[, grouping[, monetary]]) + + Formats a number *val* according to the current :const:`LC_NUMERIC` setting. + The format follows the conventions of the ``%`` operator. For floating point + values, the decimal point is modified if appropriate. If *grouping* is true, + also takes the grouping into account. + + If *monetary* is true, the conversion uses monetary thousands separator and + grouping strings. + + Please note that this function will only work for exactly one %char specifier. + For whole format strings, use :func:`format_string`. + + .. versionchanged:: 2.5 + Added the *monetary* parameter. + + +.. function:: format_string(format, val[, grouping]) + + Processes formatting specifiers as in ``format % val``, but takes the current + locale settings into account. + + .. versionadded:: 2.5 + + +.. 
function:: currency(val[, symbol[, grouping[, international]]]) + + Formats a number *val* according to the current :const:`LC_MONETARY` settings. + + The returned string includes the currency symbol if *symbol* is true, which is + the default. If *grouping* is true (which is not the default), grouping is done + with the value. If *international* is true (which is not the default), the + international currency symbol is used. + + Note that this function will not work with the 'C' locale, so you have to set a + locale via :func:`setlocale` first. + + .. versionadded:: 2.5 + + +.. function:: str(float) + + Formats a floating point number using the same format as the built-in function + ``str(float)``, but takes the decimal point into account. + + +.. function:: atof(string) + + Converts a string to a floating point number, following the :const:`LC_NUMERIC` + settings. + + +.. function:: atoi(string) + + Converts a string to an integer, following the :const:`LC_NUMERIC` conventions. + + +.. data:: LC_CTYPE + + .. index:: module: string + + Locale category for the character type functions. Depending on the settings of + this category, the functions of module :mod:`string` dealing with case change + their behaviour. + + +.. data:: LC_COLLATE + + Locale category for sorting strings. The functions :func:`strcoll` and + :func:`strxfrm` of the :mod:`locale` module are affected. + + +.. data:: LC_TIME + + Locale category for the formatting of time. The function :func:`time.strftime` + follows these conventions. + + +.. data:: LC_MONETARY + + Locale category for formatting of monetary values. The available options are + available from the :func:`localeconv` function. + + +.. data:: LC_MESSAGES + + Locale category for message display. Python currently does not support + application specific locale-aware messages. Messages displayed by the operating + system, like those returned by :func:`os.strerror` might be affected by this + category. + + +.. 
data:: LC_NUMERIC + + Locale category for formatting numbers. The functions :func:`format`, + :func:`atoi`, :func:`atof` and :func:`str` of the :mod:`locale` module are + affected by that category. All other numeric formatting operations are not + affected. + + +.. data:: LC_ALL + + Combination of all locale settings. If this flag is used when the locale is + changed, setting the locale for all categories is attempted. If that fails for + any category, no category is changed at all. When the locale is retrieved using + this flag, a string indicating the setting for all categories is returned. This + string can be later used to restore the settings. + + +.. data:: CHAR_MAX + + This is a symbolic constant used for different values returned by + :func:`localeconv`. + +The :func:`nl_langinfo` function accepts one of the following keys. Most +descriptions are taken from the corresponding description in the GNU C library. + + +.. data:: CODESET + + Return a string with the name of the character encoding used in the selected + locale. + + +.. data:: D_T_FMT + + Return a string that can be used as a format string for strftime(3) to represent + time and date in a locale-specific way. + + +.. data:: D_FMT + + Return a string that can be used as a format string for strftime(3) to represent + a date in a locale-specific way. + + +.. data:: T_FMT + + Return a string that can be used as a format string for strftime(3) to represent + a time in a locale-specific way. + + +.. data:: T_FMT_AMPM + + The return value can be used as a format string for 'strftime' to represent time + in the am/pm format. + + +.. data:: DAY_1 ... DAY_7 + + Return name of the n-th day of the week. + + .. warning:: + + This follows the US convention of :const:`DAY_1` being Sunday, not the + international convention (ISO 8601) that Monday is the first day of the week. + + +.. data:: ABDAY_1 ... ABDAY_7 + + Return abbreviated name of the n-th day of the week. + + +.. data:: MON_1 ... 
MON_12 + + Return name of the n-th month. + + +.. data:: ABMON_1 ... ABMON_12 + + Return abbreviated name of the n-th month. + + +.. data:: RADIXCHAR + + Return radix character (decimal dot, decimal comma, etc.) + + +.. data:: THOUSEP + + Return separator character for thousands (groups of three digits). + + +.. data:: YESEXPR + + Return a regular expression that can be used with the regex function to + recognize a positive response to a yes/no question. + + .. warning:: + + The expression is in the syntax suitable for the :cfunc:`regex` function from + the C library, which might differ from the syntax used in :mod:`re`. + + +.. data:: NOEXPR + + Return a regular expression that can be used with the regex(3) function to + recognize a negative response to a yes/no question. + + +.. data:: CRNCYSTR + + Return the currency symbol, preceded by "-" if the symbol should appear before + the value, "+" if the symbol should appear after the value, or "." if the symbol + should replace the radix character. + + +.. data:: ERA + + The return value represents the era used in the current locale. + + Most locales do not define this value. An example of a locale which does define + this value is the Japanese one. In Japan, the traditional representation of + dates includes the name of the era corresponding to the then-emperor's reign. + + Normally it should not be necessary to use this value directly. Specifying the + ``E`` modifier in their format strings causes the :func:`strftime` function to + use this information. The format of the returned string is not specified, and + therefore you should not assume knowledge of it on different systems. + + +.. data:: ERA_YEAR + + The return value gives the year in the relevant era of the locale. + + +.. data:: ERA_D_T_FMT + + This return value can be used as a format string for :func:`strftime` to + represent dates and times in a locale-specific era-based way. + + +.. 
data:: ERA_D_FMT + + This return value can be used as a format string for :func:`strftime` to + represent a date in a locale-specific era-based way. + + +.. data:: ALT_DIGITS + + The return value is a representation of up to 100 values used to represent the + values 0 to 99. + +Example:: + + >>> import locale + >>> loc = locale.getlocale() # get current locale + >>> locale.setlocale(locale.LC_ALL, 'de_DE') # use German locale; name might vary with platform + >>> locale.strcoll('f\xe4n', 'foo') # compare a string containing an umlaut + >>> locale.setlocale(locale.LC_ALL, '') # use user's preferred locale + >>> locale.setlocale(locale.LC_ALL, 'C') # use default (C) locale + >>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale + + +Background, details, hints, tips and caveats +-------------------------------------------- + +The C standard defines the locale as a program-wide property that may be +relatively expensive to change. On top of that, some implementations are broken +in such a way that frequent locale changes may cause core dumps. This makes the +locale somewhat painful to use correctly. + +Initially, when a program is started, the locale is the ``C`` locale, no matter +what the user's preferred locale is. The program must explicitly say that it +wants the user's preferred locale settings by calling ``setlocale(LC_ALL, '')``. + +It is generally a bad idea to call :func:`setlocale` in some library routine, +since as a side effect it affects the entire program. Saving and restoring it +is almost as bad: it is expensive and affects other threads that happen to run +before the settings have been restored. + +If, when coding a module for general use, you need a locale-independent version +of an operation that is affected by the locale (such as :func:`string.lower`, or +certain formats used with :func:`time.strftime`), you will have to find a way to +do it without using the standard library routine. 
Even better is convincing +yourself that using locale settings is okay. Only as a last resort should you +document that your module is not compatible with non-\ ``C`` locale settings. + +.. index:: module: string + +The case conversion functions in the :mod:`string` module are affected by the +locale settings. When a call to the :func:`setlocale` function changes the +:const:`LC_CTYPE` settings, the variables ``string.lowercase``, +``string.uppercase`` and ``string.letters`` are recalculated. Note that code +that uses these variables through ':keyword:`from` ... :keyword:`import` ...', +e.g. ``from string import letters``, is not affected by subsequent +:func:`setlocale` calls. + +The only way to perform numeric operations according to the locale is to use the +special functions defined by this module: :func:`atof`, :func:`atoi`, +:func:`format`, :func:`str`. + + +.. _embedding-locale: + +For extension writers and programs that embed Python +---------------------------------------------------- + +Extension modules should never call :func:`setlocale`, except to find out what +the current locale is. But since the return value can only be used portably to +restore it, that is not very useful (except perhaps to find out whether or not +the locale is ``C``). + +When Python code uses the :mod:`locale` module to change the locale, this also +affects the embedding application. If the embedding application doesn't want +this to happen, it should remove the :mod:`_locale` extension module (which does +all the work) from the table of built-in modules in the :file:`config.c` file, +and make sure that the :mod:`_locale` module is not accessible as a shared +library. + + +.. _locale-gettext: + +Access to message catalogs +-------------------------- + +The locale module exposes the C library's gettext interface on systems that +provide this interface. 
It consists of the functions :func:`gettext`, +:func:`dgettext`, :func:`dcgettext`, :func:`textdomain`, :func:`bindtextdomain`, +and :func:`bind_textdomain_codeset`. These are similar to the same functions in +the :mod:`gettext` module, but use the C library's binary format for message +catalogs, and the C library's search algorithms for locating message catalogs. + +Python applications should normally find no need to invoke these functions, and +should use :mod:`gettext` instead. A known exception to this rule are +applications that link with additional C libraries which internally invoke +:cfunc:`gettext` or :cfunc:`dcgettext`. For these applications, it may be +necessary to bind the text domain, so that the libraries can properly locate +their message catalogs. + diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst new file mode 100644 index 0000000..218fb0d --- /dev/null +++ b/Doc/library/logging.rst @@ -0,0 +1,1857 @@ +:mod:`logging` --- Logging facility for Python +============================================== + +.. module:: logging + :synopsis: Flexible error logging system for applications. + + +.. moduleauthor:: Vinay Sajip <vinay_sajip@red-dove.com> +.. sectionauthor:: Vinay Sajip <vinay_sajip@red-dove.com> + + +.. % These apply to all modules, and may be given more than once: + + + +.. index:: pair: Errors; logging + +.. versionadded:: 2.3 + +This module defines functions and classes which implement a flexible error +logging system for applications. + +Logging is performed by calling methods on instances of the :class:`Logger` +class (hereafter called :dfn:`loggers`). Each instance has a name, and they are +conceptually arranged in a name space hierarchy using dots (periods) as +separators. For example, a logger named "scan" is the parent of loggers +"scan.text", "scan.html" and "scan.pdf". Logger names can be anything you want, +and indicate the area of an application in which a logged message originates. 
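The dotted-name hierarchy described above can be observed directly. The following is a minimal sketch, not part of the original text; it assumes the ``parent`` attribute that the standard :class:`Logger` implementation sets on each logger, and the variable names are illustrative only.

```python
import logging

# Creating "scan" and "scan.text" links them through the dotted name.
parent = logging.getLogger("scan")
child = logging.getLogger("scan.text")

# Assumption: Logger objects expose their ancestor via the `parent` attribute.
print(child.parent.name)   # name of the logger one level up
print(parent.parent.name)  # a top-level logger hangs off the root logger
```

The order of creation should not matter: the logging package fixes up the parent links whenever a missing ancestor is created later.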
+ +Logged messages also have levels of importance associated with them. The default +levels provided are :const:`DEBUG`, :const:`INFO`, :const:`WARNING`, +:const:`ERROR` and :const:`CRITICAL`. As a convenience, you indicate the +importance of a logged message by calling an appropriate method of +:class:`Logger`. The methods are :meth:`debug`, :meth:`info`, :meth:`warning`, +:meth:`error` and :meth:`critical`, which mirror the default levels. You are not +constrained to use these levels: you can specify your own and use a more general +:class:`Logger` method, :meth:`log`, which takes an explicit level argument. + +The numeric values of logging levels are given in the following table. These are +primarily of interest if you want to define your own levels, and need them to +have specific values relative to the predefined levels. If you define a level +with the same numeric value, it overwrites the predefined value; the predefined +name is lost. + ++--------------+---------------+ +| Level | Numeric value | ++==============+===============+ +| ``CRITICAL`` | 50 | ++--------------+---------------+ +| ``ERROR`` | 40 | ++--------------+---------------+ +| ``WARNING`` | 30 | ++--------------+---------------+ +| ``INFO`` | 20 | ++--------------+---------------+ +| ``DEBUG`` | 10 | ++--------------+---------------+ +| ``NOTSET`` | 0 | ++--------------+---------------+ + +Levels can also be associated with loggers, being set either by the developer or +through loading a saved logging configuration. When a logging method is called +on a logger, the logger compares its own level with the level associated with +the method call. If the logger's level is higher than the method call's, no +logging message is actually generated. This is the basic mechanism controlling +the verbosity of logging output. + +Logging messages are encoded as instances of the :class:`LogRecord` class. 
When +a logger decides to actually log an event, a :class:`LogRecord` instance is +created from the logging message. + +Logging messages are subjected to a dispatch mechanism through the use of +:dfn:`handlers`, which are instances of subclasses of the :class:`Handler` +class. Handlers are responsible for ensuring that a logged message (in the form +of a :class:`LogRecord`) ends up in a particular location (or set of locations) +which is useful for the target audience for that message (such as end users, +support desk staff, system administrators, developers). Handlers are passed +:class:`LogRecord` instances intended for particular destinations. Each logger +can have zero, one or more handlers associated with it (via the +:meth:`addHandler` method of :class:`Logger`). In addition to any handlers +directly associated with a logger, *all handlers associated with all ancestors +of the logger* are called to dispatch the message. + +Just as for loggers, handlers can have levels associated with them. A handler's +level acts as a filter in the same way as a logger's level does. If a handler +decides to actually dispatch an event, the :meth:`emit` method is used to send +the message to its destination. Most user-defined subclasses of :class:`Handler` +will need to override this :meth:`emit`. + +In addition to the base :class:`Handler` class, many useful subclasses are +provided: + +#. :class:`StreamHandler` instances send error messages to streams (file-like + objects). + +#. :class:`FileHandler` instances send error messages to disk files. + +#. :class:`BaseRotatingHandler` is the base class for handlers that rotate log + files at a certain point. It is not meant to be instantiated directly. Instead, + use :class:`RotatingFileHandler` or :class:`TimedRotatingFileHandler`. + +#. :class:`RotatingFileHandler` instances send error messages to disk files, + with support for maximum log file sizes and log file rotation. + +#. 
:class:`TimedRotatingFileHandler` instances send error messages to disk files, + rotating the log file at certain timed intervals. + +#. :class:`SocketHandler` instances send error messages to TCP/IP sockets. + +#. :class:`DatagramHandler` instances send error messages to UDP sockets. + +#. :class:`SMTPHandler` instances send error messages to a designated email + address. + +#. :class:`SysLogHandler` instances send error messages to a Unix syslog daemon, + possibly on a remote machine. + +#. :class:`NTEventLogHandler` instances send error messages to a Windows + NT/2000/XP event log. + +#. :class:`MemoryHandler` instances send error messages to a buffer in memory, + which is flushed whenever specific criteria are met. + +#. :class:`HTTPHandler` instances send error messages to an HTTP server using + either ``GET`` or ``POST`` semantics. + +The :class:`StreamHandler` and :class:`FileHandler` classes are defined in the +core logging package. The other handlers are defined in a sub-module, +:mod:`logging.handlers`. (There is also another sub-module, +:mod:`logging.config`, for configuration functionality.) + +Logged messages are formatted for presentation through instances of the +:class:`Formatter` class. They are initialized with a format string suitable for +use with the % operator and a dictionary. + +For formatting multiple messages in a batch, instances of +:class:`BufferingFormatter` can be used. In addition to the format string (which +is applied to each message in the batch), there is provision for header and +trailer format strings. + +When filtering based on logger level and/or handler level is not enough, +instances of :class:`Filter` can be added to both :class:`Logger` and +:class:`Handler` instances (through their :meth:`addFilter` method). Before +deciding to process a message further, both loggers and handlers consult all +their filters for permission. If any filter returns a false value, the message +is not processed further.
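The filter consultation described above can be sketched as follows. This is a minimal illustration, not part of the logging package: the names ``CollectingHandler``, ``NoPasswordFilter`` and the logger name ``filter_demo`` are hypothetical, chosen only for this example. The handler stores messages in a list so the effect of the filter is easy to inspect.

```python
import logging


class CollectingHandler(logging.Handler):
    """Hypothetical handler that keeps formatted messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        # With no formatter set, format() falls back to the module default,
        # which renders just the message text.
        self.messages.append(self.format(record))


class NoPasswordFilter(logging.Filter):
    """Hypothetical filter: reject records whose text mentions 'password'."""

    def filter(self, record):
        # A false return value stops the record from being processed further.
        return 'password' not in record.getMessage()


logger = logging.getLogger('filter_demo')
logger.setLevel(logging.DEBUG)
logger.propagate = False          # keep the demo away from the root logger

handler = CollectingHandler()
handler.addFilter(NoPasswordFilter())
logger.addHandler(handler)

logger.info('user logged in')             # passes the filter
logger.info('password is %s', 'hunter2')  # rejected by the filter
```

After running this, ``handler.messages`` should contain only the first message. The same filter could instead be attached to the logger with ``logger.addFilter``, in which case the record would be rejected before any handler sees it.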
+ +The basic :class:`Filter` functionality allows filtering by specific logger +name. If this feature is used, messages sent to the named logger and its +children are allowed through the filter, and all others are dropped. + +In addition to the classes described above, there are a number of module-level +functions. + + +.. function:: getLogger([name]) + + Return a logger with the specified name or, if no name is specified, return a + logger which is the root logger of the hierarchy. If specified, the name is + typically a dot-separated hierarchical name like *"a"*, *"a.b"* or *"a.b.c.d"*. + Choice of these names is entirely up to the developer who is using logging. + + All calls to this function with a given name return the same logger instance. + This means that logger instances never need to be passed between different parts + of an application. + + +.. function:: getLoggerClass() + + Return either the standard :class:`Logger` class, or the last class passed to + :func:`setLoggerClass`. This function may be called from within a new class + definition, to ensure that installing a customised :class:`Logger` class will + not undo customisations already applied by other code. For example:: + + class MyLogger(logging.getLoggerClass()): + # ... override behaviour here + + +.. function:: debug(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`DEBUG` on the root logger. The *msg* is the + message format string, and the *args* are the arguments which are merged into + *msg* using the string formatting operator. (Note that this means that you can + use keywords in the format string, together with a single dictionary argument.) + + There are two keyword arguments in *kwargs* which are inspected: *exc_info* + which, if it does not evaluate as false, causes exception information to be + added to the logging message.
If an exception tuple (in the format returned by + :func:`sys.exc_info`) is provided, it is used; otherwise, :func:`sys.exc_info` + is called to get the exception information. + + The other optional keyword argument is *extra* which can be used to pass a + dictionary which is used to populate the __dict__ of the LogRecord created for + the logging event with user-defined attributes. These custom attributes can then + be used as you like. For example, they could be incorporated into logged + messages. For example:: + + FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s" + logging.basicConfig(format=FORMAT) + d = {'clientip': '192.168.0.1', 'user': 'fbloggs'} + logging.warning("Protocol problem: %s", "connection reset", extra=d) + + would print something like :: + + 2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset + + The keys in the dictionary passed in *extra* should not clash with the keys used + by the logging system. (See the :class:`Formatter` documentation for more + information on which keys are used by the logging system.) + + If you choose to use these attributes in logged messages, you need to exercise + some care. In the above example, for instance, the :class:`Formatter` has been + set up with a format string which expects 'clientip' and 'user' in the attribute + dictionary of the LogRecord. If these are missing, the message will not be + logged because a string formatting exception will occur. So in this case, you + always need to pass the *extra* dictionary with these keys. + + While this might be annoying, this feature is intended for use in specialized + circumstances, such as multi-threaded servers where the same code executes in + many contexts, and interesting conditions which arise are dependent on this + context (such as remote client IP address and authenticated user name, in the + above example). 
In such circumstances, it is likely that specialized + :class:`Formatter`\ s would be used with particular :class:`Handler`\ s. + + .. versionchanged:: 2.5 + *extra* was added. + + +.. function:: info(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`INFO` on the root logger. The arguments are + interpreted as for :func:`debug`. + + +.. function:: warning(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`WARNING` on the root logger. The arguments are + interpreted as for :func:`debug`. + + +.. function:: error(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`ERROR` on the root logger. The arguments are + interpreted as for :func:`debug`. + + +.. function:: critical(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`CRITICAL` on the root logger. The arguments + are interpreted as for :func:`debug`. + + +.. function:: exception(msg[, *args]) + + Logs a message with level :const:`ERROR` on the root logger. The arguments are + interpreted as for :func:`debug`. Exception info is added to the logging + message. This function should only be called from an exception handler. + + +.. function:: log(level, msg[, *args[, **kwargs]]) + + Logs a message with level *level* on the root logger. The other arguments are + interpreted as for :func:`debug`. + + +.. function:: disable(lvl) + + Provides an overriding level *lvl* for all loggers which takes precedence over + the logger's own level. When the need arises to temporarily throttle logging + output down across the whole application, this function can be useful. + + +.. function:: addLevelName(lvl, levelName) + + Associates level *lvl* with text *levelName* in an internal dictionary, which is + used to map numeric levels to a textual representation, for example when a + :class:`Formatter` formats a message. This function can also be used to define + your own levels. 
The only constraints are that all levels used must be + registered using this function, levels should be positive integers and they + should increase in increasing order of severity. + + +.. function:: getLevelName(lvl) + + Returns the textual representation of logging level *lvl*. If the level is one + of the predefined levels :const:`CRITICAL`, :const:`ERROR`, :const:`WARNING`, + :const:`INFO` or :const:`DEBUG` then you get the corresponding string. If you + have associated levels with names using :func:`addLevelName` then the name you + have associated with *lvl* is returned. If a numeric value corresponding to one + of the defined levels is passed in, the corresponding string representation is + returned. Otherwise, the string "Level %s" % lvl is returned. + + +.. function:: makeLogRecord(attrdict) + + Creates and returns a new :class:`LogRecord` instance whose attributes are + defined by *attrdict*. This function is useful for taking a pickled + :class:`LogRecord` attribute dictionary, sent over a socket, and reconstituting + it as a :class:`LogRecord` instance at the receiving end. + + +.. function:: basicConfig([**kwargs]) + + Does basic configuration for the logging system by creating a + :class:`StreamHandler` with a default :class:`Formatter` and adding it to the + root logger. The functions :func:`debug`, :func:`info`, :func:`warning`, + :func:`error` and :func:`critical` will call :func:`basicConfig` automatically + if no handlers are defined for the root logger. + + .. versionchanged:: 2.4 + Formerly, :func:`basicConfig` did not take any keyword arguments. + + The following keyword arguments are supported. + + +--------------+---------------------------------------------+ + | Format | Description | + +==============+=============================================+ + | ``filename`` | Specifies that a FileHandler be created, | + | | using the specified filename, rather than a | + | | StreamHandler. 
| + +--------------+---------------------------------------------+ + | ``filemode`` | Specifies the mode to open the file, if | + | | filename is specified (if filemode is | + | | unspecified, it defaults to 'a'). | + +--------------+---------------------------------------------+ + | ``format`` | Use the specified format string for the | + | | handler. | + +--------------+---------------------------------------------+ + | ``datefmt`` | Use the specified date/time format. | + +--------------+---------------------------------------------+ + | ``level`` | Set the root logger level to the specified | + | | level. | + +--------------+---------------------------------------------+ + | ``stream`` | Use the specified stream to initialize the | + | | StreamHandler. Note that this argument is | + | | incompatible with 'filename' - if both are | + | | present, 'stream' is ignored. | + +--------------+---------------------------------------------+ + + +.. function:: shutdown() + + Informs the logging system to perform an orderly shutdown by flushing and + closing all handlers. + + +.. function:: setLoggerClass(klass) + + Tells the logging system to use the class *klass* when instantiating a logger. + The class should define :meth:`__init__` such that only a name argument is + required, and the :meth:`__init__` should call :meth:`Logger.__init__`. This + function is typically called before any loggers are instantiated by applications + which need to use custom logger behavior. + + +.. seealso:: + + :pep:`282` - A Logging System + The proposal which described this feature for inclusion in the Python standard + library. + + `Original Python :mod:`logging` package <http://www.red-dove.com/python_logging.html>`_ + This is the original source for the :mod:`logging` package. The version of the + package available from this site is suitable for use with Python 1.5.2, 2.1.x + and 2.2.x, which do not include the :mod:`logging` package in the standard + library. 
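As a rough sketch of how :func:`setLoggerClass` and :func:`getLoggerClass` described above fit together, the following installs a custom logger class, creates one logger with it, and then restores the previous class. The names ``BaseLogger``, ``CountingLogger`` and ``counting_demo`` are hypothetical and exist only for this illustration.

```python
import logging

# Capture the current class once. Calling getLoggerClass() again inside the
# subclass, after it has been installed, would return the subclass itself.
BaseLogger = logging.getLoggerClass()


class CountingLogger(BaseLogger):
    """Hypothetical Logger subclass that counts calls to warning()."""

    def __init__(self, name):
        # Only a name argument is required, and the base __init__ is called.
        BaseLogger.__init__(self, name)
        self.warning_count = 0

    def warning(self, msg, *args, **kwargs):
        self.warning_count += 1
        BaseLogger.warning(self, msg, *args, **kwargs)


# Install the class before the logger is instantiated, then restore the
# previous class so that other code is not affected by this demo.
logging.setLoggerClass(CountingLogger)
try:
    logger = logging.getLogger('counting_demo')
finally:
    logging.setLoggerClass(BaseLogger)

logger.propagate = False
logger.addHandler(logging.StreamHandler())  # writes to sys.stderr by default

logger.warning('first warning')
logger.warning('second warning: %s', 'details')
```

Here ``logger`` is an instance of ``CountingLogger`` and its ``warning_count`` attribute is 2 after the two calls. Loggers created before the call to :func:`setLoggerClass` keep whatever class they were created with.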
+ + +Logger Objects +-------------- + +Loggers have the following attributes and methods. Note that Loggers are never +instantiated directly, but always through the module-level function +``logging.getLogger(name)``. + + +.. attribute:: Logger.propagate + + If this evaluates to false, logging messages are not passed by this logger or by + child loggers to higher level (ancestor) loggers. The constructor sets this + attribute to 1. + + +.. method:: Logger.setLevel(lvl) + + Sets the threshold for this logger to *lvl*. Logging messages which are less + severe than *lvl* will be ignored. When a logger is created, the level is set to + :const:`NOTSET` (which causes all messages to be processed when the logger is + the root logger, or delegation to the parent when the logger is a non-root + logger). Note that the root logger is created with level :const:`WARNING`. + + The term "delegation to the parent" means that if a logger has a level of + NOTSET, its chain of ancestor loggers is traversed until either an ancestor with + a level other than NOTSET is found, or the root is reached. + + If an ancestor is found with a level other than NOTSET, then that ancestor's + level is treated as the effective level of the logger where the ancestor search + began, and is used to determine how a logging event is handled. + + If the root is reached, and it has a level of NOTSET, then all messages will be + processed. Otherwise, the root's level will be used as the effective level. + + +.. method:: Logger.isEnabledFor(lvl) + + Indicates if a message of severity *lvl* would be processed by this logger. + This method checks first the module-level level set by + ``logging.disable(lvl)`` and then the logger's effective level as determined + by :meth:`getEffectiveLevel`. + + +.. method:: Logger.getEffectiveLevel() + + Indicates the effective level for this logger. If a value other than + :const:`NOTSET` has been set using :meth:`setLevel`, it is returned. 
Otherwise, + the hierarchy is traversed towards the root until a value other than + :const:`NOTSET` is found, and that value is returned. + + +.. method:: Logger.debug(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`DEBUG` on this logger. The *msg* is the + message format string, and the *args* are the arguments which are merged into + *msg* using the string formatting operator. (Note that this means that you can + use keywords in the format string, together with a single dictionary argument.) + + There are two keyword arguments in *kwargs* which are inspected: *exc_info* + which, if it does not evaluate as false, causes exception information to be + added to the logging message. If an exception tuple (in the format returned by + :func:`sys.exc_info`) is provided, it is used; otherwise, :func:`sys.exc_info` + is called to get the exception information. + + The other optional keyword argument is *extra* which can be used to pass a + dictionary which is used to populate the ``__dict__`` of the :class:`LogRecord` created for + the logging event with user-defined attributes. These custom attributes can then + be used as you like. They could, for instance, be incorporated into logged + messages. For example:: + + FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s" + logging.basicConfig(format=FORMAT) + d = {'clientip': '192.168.0.1', 'user': 'fbloggs'} + logger = logging.getLogger("tcpserver") + logger.warning("Protocol problem: %s", "connection reset", extra=d) + + would print something like :: + + 2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset + + The keys in the dictionary passed in *extra* should not clash with the keys used + by the logging system. (See the :class:`Formatter` documentation for more + information on which keys are used by the logging system.) + + If you choose to use these attributes in logged messages, you need to exercise + some care.
In the above example, for instance, the :class:`Formatter` has been + set up with a format string which expects 'clientip' and 'user' in the attribute + dictionary of the LogRecord. If these are missing, the message will not be + logged because a string formatting exception will occur. So in this case, you + always need to pass the *extra* dictionary with these keys. + + While this might be annoying, this feature is intended for use in specialized + circumstances, such as multi-threaded servers where the same code executes in + many contexts, and interesting conditions which arise are dependent on this + context (such as remote client IP address and authenticated user name, in the + above example). In such circumstances, it is likely that specialized + :class:`Formatter`\ s would be used with particular :class:`Handler`\ s. + + .. versionchanged:: 2.5 + *extra* was added. + + +.. method:: Logger.info(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`INFO` on this logger. The arguments are + interpreted as for :meth:`debug`. + + +.. method:: Logger.warning(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`WARNING` on this logger. The arguments are + interpreted as for :meth:`debug`. + + +.. method:: Logger.error(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`ERROR` on this logger. The arguments are + interpreted as for :meth:`debug`. + + +.. method:: Logger.critical(msg[, *args[, **kwargs]]) + + Logs a message with level :const:`CRITICAL` on this logger. The arguments are + interpreted as for :meth:`debug`. + + +.. method:: Logger.log(lvl, msg[, *args[, **kwargs]]) + + Logs a message with integer level *lvl* on this logger. The other arguments are + interpreted as for :meth:`debug`. + + +.. method:: Logger.exception(msg[, *args]) + + Logs a message with level :const:`ERROR` on this logger. The arguments are + interpreted as for :meth:`debug`. Exception info is added to the logging + message. 
This method should only be called from an exception handler. + + +.. method:: Logger.addFilter(filt) + + Adds the specified filter *filt* to this logger. + + +.. method:: Logger.removeFilter(filt) + + Removes the specified filter *filt* from this logger. + + +.. method:: Logger.filter(record) + + Applies this logger's filters to the record and returns a true value if the + record is to be processed. + + +.. method:: Logger.addHandler(hdlr) + + Adds the specified handler *hdlr* to this logger. + + +.. method:: Logger.removeHandler(hdlr) + + Removes the specified handler *hdlr* from this logger. + + +.. method:: Logger.findCaller() + + Finds the caller's source filename and line number. Returns the filename, line + number and function name as a 3-element tuple. + + .. versionchanged:: 2.5 + The function name was added. In earlier versions, the filename and line number + were returned as a 2-element tuple. + + +.. method:: Logger.handle(record) + + Handles a record by passing it to all handlers associated with this logger and + its ancestors (until a false value of *propagate* is found). This method is used + for unpickled records received from a socket, as well as those created locally. + Logger-level filtering is applied using :meth:`filter`. + + +.. method:: Logger.makeRecord(name, lvl, fn, lno, msg, args, exc_info [, func, extra]) + + This is a factory method which can be overridden in subclasses to create + specialized :class:`LogRecord` instances. + + .. versionchanged:: 2.5 + *func* and *extra* were added. + + +.. _minimal-example: + +Basic example +------------- + +.. versionchanged:: 2.4 + Formerly, :func:`basicConfig` did not take any keyword arguments. + +The :mod:`logging` package provides a lot of flexibility, and its configuration +can appear daunting. This section demonstrates that simple use of the logging +package is possible.
+ +The simplest example shows logging to the console:: + + import logging + + logging.debug('A debug message') + logging.info('Some information') + logging.warning('A shot across the bows') + +If you run the above script, you'll see this:: + + WARNING:root:A shot across the bows + +Because no particular logger was specified, the system used the root logger. The +debug and info messages didn't appear because by default, the root logger is +configured to only handle messages with a severity of WARNING or above. The +message format is also a configuration default, as is the output destination of +the messages - ``sys.stderr``. The severity level, the message format and +destination can be easily changed, as shown in the example below:: + + import logging + + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)s %(levelname)s %(message)s', + filename='/tmp/myapp.log', + filemode='w') + logging.debug('A debug message') + logging.info('Some information') + logging.warning('A shot across the bows') + +The :func:`basicConfig` function is used to change the configuration defaults, +which results in output (written to ``/tmp/myapp.log``) which should look +something like the following:: + + 2004-07-02 13:00:08,743 DEBUG A debug message + 2004-07-02 13:00:08,743 INFO Some information + 2004-07-02 13:00:08,743 WARNING A shot across the bows + +This time, all messages with a severity of DEBUG or above were handled, the +format of the messages was changed, and output went to the specified file +rather than the console. + +Formatting uses standard Python string formatting - see section +:ref:`string-formatting`. The format string takes the following common +specifiers. For a complete list of specifiers, consult the :class:`Formatter` +documentation.
+ ++-------------------+-----------------------------------------------+ +| Format | Description | ++===================+===============================================+ +| ``%(name)s`` | Name of the logger (logging channel). | ++-------------------+-----------------------------------------------+ +| ``%(levelname)s`` | Text logging level for the message | +| | (``'DEBUG'``, ``'INFO'``, ``'WARNING'``, | +| | ``'ERROR'``, ``'CRITICAL'``). | ++-------------------+-----------------------------------------------+ +| ``%(asctime)s`` | Human-readable time when the | +| | :class:`LogRecord` was created. By default | +| | this is of the form "2003-07-08 16:49:45,896" | +| | (the numbers after the comma are millisecond | +| | portion of the time). | ++-------------------+-----------------------------------------------+ +| ``%(message)s`` | The logged message. | ++-------------------+-----------------------------------------------+ + +To change the date/time format, you can pass an additional keyword parameter, +*datefmt*, as in the following:: + + import logging + + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)s %(levelname)-8s %(message)s', + datefmt='%a, %d %b %Y %H:%M:%S', + filename='/temp/myapp.log', + filemode='w') + logging.debug('A debug message') + logging.info('Some information') + logging.warning('A shot across the bows') + +which would result in output like :: + + Fri, 02 Jul 2004 13:06:18 DEBUG A debug message + Fri, 02 Jul 2004 13:06:18 INFO Some information + Fri, 02 Jul 2004 13:06:18 WARNING A shot across the bows + +The date format string follows the requirements of :func:`strftime` - see the +documentation for the :mod:`time` module. + +If, instead of sending logging output to the console or a file, you'd rather use +a file-like object which you have created separately, you can pass it to +:func:`basicConfig` using the *stream* keyword argument. 
Note that if both +*stream* and *filename* keyword arguments are passed, the *stream* argument is +ignored. + +Of course, you can put variable information in your output. To do this, simply +have the message be a format string and pass in additional arguments containing +the variable information, as in the following example:: + + import logging + + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)s %(levelname)-8s %(message)s', + datefmt='%a, %d %b %Y %H:%M:%S', + filename='/temp/myapp.log', + filemode='w') + logging.error('Pack my box with %d dozen %s', 5, 'liquor jugs') + +which would result in :: + + Wed, 21 Jul 2004 15:35:16 ERROR Pack my box with 5 dozen liquor jugs + + +.. _multiple-destinations: + +Logging to multiple destinations +-------------------------------- + +Let's say you want to log to console and file with different message formats and +in differing circumstances. Say you want to log messages with levels of DEBUG +and higher to file, and those messages at level INFO and higher to the console. +Let's also assume that the file should contain timestamps, but the console +messages should not. Here's how you can achieve this:: + + import logging + + # set up logging to file - see previous section for more details + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s', + datefmt='%m-%d %H:%M', + filename='/temp/myapp.log', + filemode='w') + # define a Handler which writes INFO messages or higher to the sys.stderr + console = logging.StreamHandler() + console.setLevel(logging.INFO) + # set a format which is simpler for console use + formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s') + # tell the handler to use this format + console.setFormatter(formatter) + # add the handler to the root logger + logging.getLogger('').addHandler(console) + + # Now, we can log to the root logger, or any other logger. First the root... 
+ logging.info('Jackdaws love my big sphinx of quartz.') + + # Now, define a couple of other loggers which might represent areas in your + # application: + + logger1 = logging.getLogger('myapp.area1') + logger2 = logging.getLogger('myapp.area2') + + logger1.debug('Quick zephyrs blow, vexing daft Jim.') + logger1.info('How quickly daft jumping zebras vex.') + logger2.warning('Jail zesty vixen who grabbed pay from quack.') + logger2.error('The five boxing wizards jump quickly.') + +When you run this, on the console you will see :: + + root : INFO Jackdaws love my big sphinx of quartz. + myapp.area1 : INFO How quickly daft jumping zebras vex. + myapp.area2 : WARNING Jail zesty vixen who grabbed pay from quack. + myapp.area2 : ERROR The five boxing wizards jump quickly. + +and in the file you will see something like :: + + 10-22 22:19 root INFO Jackdaws love my big sphinx of quartz. + 10-22 22:19 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. + 10-22 22:19 myapp.area1 INFO How quickly daft jumping zebras vex. + 10-22 22:19 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. + 10-22 22:19 myapp.area2 ERROR The five boxing wizards jump quickly. + +As you can see, the DEBUG message only shows up in the file. The other messages +are sent to both destinations. + +This example uses console and file handlers, but you can use any number and +combination of handlers you choose. + + +.. _network-logging: + +Sending and receiving logging events across a network +----------------------------------------------------- + +Let's say you want to send logging events across a network, and handle them at +the receiving end. 
A simple way of doing this is attaching a +:class:`SocketHandler` instance to the root logger at the sending end:: + + import logging, logging.handlers + + rootLogger = logging.getLogger('') + rootLogger.setLevel(logging.DEBUG) + socketHandler = logging.handlers.SocketHandler('localhost', + logging.handlers.DEFAULT_TCP_LOGGING_PORT) + # don't bother with a formatter, since a socket handler sends the event as + # an unformatted pickle + rootLogger.addHandler(socketHandler) + + # Now, we can log to the root logger, or any other logger. First the root... + logging.info('Jackdaws love my big sphinx of quartz.') + + # Now, define a couple of other loggers which might represent areas in your + # application: + + logger1 = logging.getLogger('myapp.area1') + logger2 = logging.getLogger('myapp.area2') + + logger1.debug('Quick zephyrs blow, vexing daft Jim.') + logger1.info('How quickly daft jumping zebras vex.') + logger2.warning('Jail zesty vixen who grabbed pay from quack.') + logger2.error('The five boxing wizards jump quickly.') + +At the receiving end, you can set up a receiver using the :mod:`SocketServer` +module. Here is a basic working example:: + + import cPickle + import logging + import logging.handlers + import SocketServer + import struct + + + class LogRecordStreamHandler(SocketServer.StreamRequestHandler): + """Handler for a streaming logging request. + + This basically logs the record using whatever logging policy is + configured locally. + """ + + def handle(self): + """ + Handle multiple requests - each expected to be a 4-byte length, + followed by the LogRecord in pickle format. Logs the record + according to whatever policy is configured locally. 
+ """ + while 1: + chunk = self.connection.recv(4) + if len(chunk) < 4: + break + slen = struct.unpack(">L", chunk)[0] + chunk = self.connection.recv(slen) + while len(chunk) < slen: + chunk = chunk + self.connection.recv(slen - len(chunk)) + obj = self.unPickle(chunk) + record = logging.makeLogRecord(obj) + self.handleLogRecord(record) + + def unPickle(self, data): + return cPickle.loads(data) + + def handleLogRecord(self, record): + # if a name is specified, we use the named logger rather than the one + # implied by the record. + if self.server.logname is not None: + name = self.server.logname + else: + name = record.name + logger = logging.getLogger(name) + # N.B. EVERY record gets logged. This is because Logger.handle + # is normally called AFTER logger-level filtering. If you want + # to do filtering, do it at the client end to save wasting + # cycles and network bandwidth! + logger.handle(record) + + class LogRecordSocketReceiver(SocketServer.ThreadingTCPServer): + """simple TCP socket-based logging receiver suitable for testing. + """ + + allow_reuse_address = 1 + + def __init__(self, host='localhost', + port=logging.handlers.DEFAULT_TCP_LOGGING_PORT, + handler=LogRecordStreamHandler): + SocketServer.ThreadingTCPServer.__init__(self, (host, port), handler) + self.abort = 0 + self.timeout = 1 + self.logname = None + + def serve_until_stopped(self): + import select + abort = 0 + while not abort: + rd, wr, ex = select.select([self.socket.fileno()], + [], [], + self.timeout) + if rd: + self.handle_request() + abort = self.abort + + def main(): + logging.basicConfig( + format="%(relativeCreated)5d %(name)-15s %(levelname)-8s %(message)s") + tcpserver = LogRecordSocketReceiver() + print "About to start TCP server..." + tcpserver.serve_until_stopped() + + if __name__ == "__main__": + main() + +First run the server, and then the client. 
On the client side, nothing is +printed on the console; on the server side, you should see something like:: + + About to start TCP server... + 59 root INFO Jackdaws love my big sphinx of quartz. + 59 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. + 69 myapp.area1 INFO How quickly daft jumping zebras vex. + 69 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. + 69 myapp.area2 ERROR The five boxing wizards jump quickly. + + +Handler Objects +--------------- + +Handlers have the following attributes and methods. Note that :class:`Handler` +is never instantiated directly; this class acts as a base for more useful +subclasses. However, the :meth:`__init__` method in subclasses needs to call +:meth:`Handler.__init__`. + + +.. method:: Handler.__init__(level=NOTSET) + + Initializes the :class:`Handler` instance by setting its level, setting the list + of filters to the empty list and creating a lock (using :meth:`createLock`) for + serializing access to an I/O mechanism. + + +.. method:: Handler.createLock() + + Initializes a thread lock which can be used to serialize access to underlying + I/O functionality which may not be threadsafe. + + +.. method:: Handler.acquire() + + Acquires the thread lock created with :meth:`createLock`. + + +.. method:: Handler.release() + + Releases the thread lock acquired with :meth:`acquire`. + + +.. method:: Handler.setLevel(lvl) + + Sets the threshold for this handler to *lvl*. Logging messages which are less + severe than *lvl* will be ignored. When a handler is created, the level is set + to :const:`NOTSET` (which causes all messages to be processed). + + +.. method:: Handler.setFormatter(form) + + Sets the :class:`Formatter` for this handler to *form*. + + +.. method:: Handler.addFilter(filt) + + Adds the specified filter *filt* to this handler. + + +.. method:: Handler.removeFilter(filt) + + Removes the specified filter *filt* from this handler. + + +.. 
method:: Handler.filter(record) + + Applies this handler's filters to the record and returns a true value if the + record is to be processed. + + +.. method:: Handler.flush() + + Ensure all logging output has been flushed. This version does nothing and is + intended to be implemented by subclasses. + + +.. method:: Handler.close() + + Tidy up any resources used by the handler. This version does nothing and is + intended to be implemented by subclasses. + + +.. method:: Handler.handle(record) + + Conditionally emits the specified logging record, depending on filters which may + have been added to the handler. Wraps the actual emission of the record with + acquisition/release of the I/O thread lock. + + +.. method:: Handler.handleError(record) + + This method should be called from handlers when an exception is encountered + during an :meth:`emit` call. By default it does nothing, which means that + exceptions get silently ignored. This is what is mostly wanted for a logging + system - most users will not care about errors in the logging system, they are + more interested in application errors. You could, however, replace this with a + custom handler if you wish. The specified record is the one which was being + processed when the exception occurred. + + +.. method:: Handler.format(record) + + Do formatting for a record - if a formatter is set, use it. Otherwise, use the + default formatter for the module. + + +.. method:: Handler.emit(record) + + Do whatever it takes to actually log the specified logging record. This version + is intended to be implemented by subclasses and so raises a + :exc:`NotImplementedError`. + + +StreamHandler +^^^^^^^^^^^^^ + +The :class:`StreamHandler` class, located in the core :mod:`logging` package, +sends logging output to streams such as *sys.stdout*, *sys.stderr* or any +file-like object (or, more precisely, any object which supports :meth:`write` +and :meth:`flush` methods). + + +.. 
class:: StreamHandler([strm]) + + Returns a new instance of the :class:`StreamHandler` class. If *strm* is + specified, the instance will use it for logging output; otherwise, *sys.stderr* + will be used. + + +.. method:: StreamHandler.emit(record) + + If a formatter is specified, it is used to format the record. The record is then + written to the stream with a trailing newline. If exception information is + present, it is formatted using :func:`traceback.print_exception` and appended to + the stream. + + +.. method:: StreamHandler.flush() + + Flushes the stream by calling its :meth:`flush` method. Note that the + :meth:`close` method is inherited from :class:`Handler` and so does nothing, so + an explicit :meth:`flush` call may be needed at times. + + +FileHandler +^^^^^^^^^^^ + +The :class:`FileHandler` class, located in the core :mod:`logging` package, +sends logging output to a disk file. It inherits the output functionality from +:class:`StreamHandler`. + + +.. class:: FileHandler(filename[, mode[, encoding]]) + + Returns a new instance of the :class:`FileHandler` class. The specified file is + opened and used as the stream for logging. If *mode* is not specified, + :const:`'a'` is used. If *encoding* is not *None*, it is used to open the file + with that encoding. By default, the file grows indefinitely. + + +.. method:: FileHandler.close() + + Closes the file. + + +.. method:: FileHandler.emit(record) + + Outputs the record to the file. + + +WatchedFileHandler +^^^^^^^^^^^^^^^^^^ + +.. versionadded:: 2.6 + +The :class:`WatchedFileHandler` class, located in the :mod:`logging.handlers` +module, is a :class:`FileHandler` which watches the file it is logging to. If +the file changes, it is closed and reopened using the file name. + +A file change can happen because of usage of programs such as *newsyslog* and +*logrotate* which perform log file rotation. 
This handler, intended for use +under Unix/Linux, watches the file to see if it has changed since the last emit. +(A file is deemed to have changed if its device or inode have changed.) If the +file has changed, the old file stream is closed, and the file opened to get a +new stream. + +This handler is not appropriate for use under Windows, because under Windows +open log files cannot be moved or renamed - logging opens the files with +exclusive locks - and so there is no need for such a handler. Furthermore, +*ST_INO* is not supported under Windows; :func:`stat` always returns zero for +this value. + + +.. class:: WatchedFileHandler(filename[,mode[, encoding]]) + + Returns a new instance of the :class:`WatchedFileHandler` class. The specified + file is opened and used as the stream for logging. If *mode* is not specified, + :const:`'a'` is used. If *encoding* is not *None*, it is used to open the file + with that encoding. By default, the file grows indefinitely. + + +.. method:: WatchedFileHandler.emit(record) + + Outputs the record to the file, but first checks to see if the file has changed. + If it has, the existing stream is flushed and closed and the file opened again, + before outputting the record to the file. + + +RotatingFileHandler +^^^^^^^^^^^^^^^^^^^ + +The :class:`RotatingFileHandler` class, located in the :mod:`logging.handlers` +module, supports rotation of disk log files. + + +.. class:: RotatingFileHandler(filename[, mode[, maxBytes[, backupCount]]]) + + Returns a new instance of the :class:`RotatingFileHandler` class. The specified + file is opened and used as the stream for logging. If *mode* is not specified, + ``'a'`` is used. By default, the file grows indefinitely. + + You can use the *maxBytes* and *backupCount* values to allow the file to + :dfn:`rollover` at a predetermined size. When the size is about to be exceeded, + the file is closed and a new file is silently opened for output. 
Rollover occurs + whenever the current log file is nearly *maxBytes* in length; if *maxBytes* is + zero, rollover never occurs. If *backupCount* is non-zero, the system will save + old log files by appending the extensions ".1", ".2" etc., to the filename. For + example, with a *backupCount* of 5 and a base file name of :file:`app.log`, you + would get :file:`app.log`, :file:`app.log.1`, :file:`app.log.2`, up to + :file:`app.log.5`. The file being written to is always :file:`app.log`. When + this file is filled, it is closed and renamed to :file:`app.log.1`, and if files + :file:`app.log.1`, :file:`app.log.2`, etc. exist, then they are renamed to + :file:`app.log.2`, :file:`app.log.3` etc. respectively. + + +.. method:: RotatingFileHandler.doRollover() + + Does a rollover, as described above. + + +.. method:: RotatingFileHandler.emit(record) + + Outputs the record to the file, catering for rollover as described previously. + + +TimedRotatingFileHandler +^^^^^^^^^^^^^^^^^^^^^^^^ + +The :class:`TimedRotatingFileHandler` class, located in the +:mod:`logging.handlers` module, supports rotation of disk log files at certain +timed intervals. + + +.. class:: TimedRotatingFileHandler(filename [,when [,interval [,backupCount]]]) + + Returns a new instance of the :class:`TimedRotatingFileHandler` class. The + specified file is opened and used as the stream for logging. On rotating it also + sets the filename suffix. Rotating happens based on the product of *when* and + *interval*. + + You can use the *when* to specify the type of *interval*. 
The list of possible + values is, note that they are not case sensitive: + + +----------+-----------------------+ + | Value | Type of interval | + +==========+=======================+ + | S | Seconds | + +----------+-----------------------+ + | M | Minutes | + +----------+-----------------------+ + | H | Hours | + +----------+-----------------------+ + | D | Days | + +----------+-----------------------+ + | W | Week day (0=Monday) | + +----------+-----------------------+ + | midnight | Roll over at midnight | + +----------+-----------------------+ + + If *backupCount* is non-zero, the system will save old log files by appending + extensions to the filename. The extensions are date-and-time based, using the + strftime format ``%Y-%m-%d_%H-%M-%S`` or a leading portion thereof, depending on + the rollover interval. At most *backupCount* files will be kept, and if more + would be created when rollover occurs, the oldest one is deleted. + + +.. method:: TimedRotatingFileHandler.doRollover() + + Does a rollover, as described above. + + +.. method:: TimedRotatingFileHandler.emit(record) + + Outputs the record to the file, catering for rollover as described above. + + +SocketHandler +^^^^^^^^^^^^^ + +The :class:`SocketHandler` class, located in the :mod:`logging.handlers` module, +sends logging output to a network socket. The base class uses a TCP socket. + + +.. class:: SocketHandler(host, port) + + Returns a new instance of the :class:`SocketHandler` class intended to + communicate with a remote machine whose address is given by *host* and *port*. + + +.. method:: SocketHandler.close() + + Closes the socket. + + +.. method:: SocketHandler.emit() + + Pickles the record's attribute dictionary and writes it to the socket in binary + format. If there is an error with the socket, silently drops the packet. If the + connection was previously lost, re-establishes the connection. 
To unpickle the + record at the receiving end into a :class:`LogRecord`, use the + :func:`makeLogRecord` function. + + +.. method:: SocketHandler.handleError() + + Handles an error which has occurred during :meth:`emit`. The most likely cause + is a lost connection. Closes the socket so that we can retry on the next event. + + +.. method:: SocketHandler.makeSocket() + + This is a factory method which allows subclasses to define the precise type of + socket they want. The default implementation creates a TCP socket + (:const:`socket.SOCK_STREAM`). + + +.. method:: SocketHandler.makePickle(record) + + Pickles the record's attribute dictionary in binary format with a length prefix, + and returns it ready for transmission across the socket. + + +.. method:: SocketHandler.send(packet) + + Send a pickled string *packet* to the socket. This function allows for partial + sends which can happen when the network is busy. + + +DatagramHandler +^^^^^^^^^^^^^^^ + +The :class:`DatagramHandler` class, located in the :mod:`logging.handlers` +module, inherits from :class:`SocketHandler` to support sending logging messages +over UDP sockets. + + +.. class:: DatagramHandler(host, port) + + Returns a new instance of the :class:`DatagramHandler` class intended to + communicate with a remote machine whose address is given by *host* and *port*. + + +.. method:: DatagramHandler.emit() + + Pickles the record's attribute dictionary and writes it to the socket in binary + format. If there is an error with the socket, silently drops the packet. To + unpickle the record at the receiving end into a :class:`LogRecord`, use the + :func:`makeLogRecord` function. + + +.. method:: DatagramHandler.makeSocket() + + The factory method of :class:`SocketHandler` is here overridden to create a UDP + socket (:const:`socket.SOCK_DGRAM`). + + +.. method:: DatagramHandler.send(s) + + Send a pickled string to a socket. 
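The length-prefixed framing that :meth:`SocketHandler.makePickle` describes, and that the receiver shown earlier reads back with ``struct.unpack(">L", chunk)``, can be tried without opening a socket. What follows is a minimal sketch written for Python 3, not the handler's actual code: the helper names ``frame_record`` and ``unframe_record`` are hypothetical, and merging the arguments into the message before pickling is an assumption made to keep the payload to plain strings.

```python
import logging
import pickle
import struct


def frame_record(record):
    """Pickle a record's attribute dict and prepend a 4-byte big-endian length.

    Hypothetical helper for illustration only.
    """
    attrs = dict(record.__dict__)
    # Assumption for this sketch: merge args into the message up front so
    # that only plain, picklable values cross the wire.
    attrs["msg"] = record.getMessage()
    attrs["args"] = None
    attrs["exc_info"] = None
    payload = pickle.dumps(attrs, 1)
    return struct.pack(">L", len(payload)) + payload


def unframe_record(packet):
    """Reverse frame_record(): read the length, unpickle, rebuild a LogRecord."""
    (length,) = struct.unpack(">L", packet[:4])
    attrs = pickle.loads(packet[4:4 + length])
    return logging.makeLogRecord(attrs)


original = logging.LogRecord(
    "myapp.area1", logging.INFO, "example.py", 10,
    "%s zebras vex", ("Jumping",), None)
packet = frame_record(original)
rebuilt = unframe_record(packet)
print(rebuilt.name, rebuilt.levelname, rebuilt.getMessage())
```

A real receiver has to loop on ``recv`` until the full payload has arrived, as the server example above does, and should only unpickle data from peers it trusts.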
+ + +SysLogHandler +^^^^^^^^^^^^^ + +The :class:`SysLogHandler` class, located in the :mod:`logging.handlers` module, +supports sending logging messages to a remote or local Unix syslog. + + +.. class:: SysLogHandler([address[, facility]]) + + Returns a new instance of the :class:`SysLogHandler` class intended to + communicate with a remote Unix machine whose address is given by *address* in + the form of a ``(host, port)`` tuple. If *address* is not specified, + ``('localhost', 514)`` is used. The address is used to open a UDP socket. An + alternative to providing a ``(host, port)`` tuple is providing an address as a + string, for example "/dev/log". In this case, a Unix domain socket is used to + send the message to the syslog. If *facility* is not specified, + :const:`LOG_USER` is used. + + +.. method:: SysLogHandler.close() + + Closes the socket to the remote host. + + +.. method:: SysLogHandler.emit(record) + + The record is formatted, and then sent to the syslog server. If exception + information is present, it is *not* sent to the server. + + +.. method:: SysLogHandler.encodePriority(facility, priority) + + Encodes the facility and priority into an integer. You can pass in strings or + integers - if strings are passed, internal mapping dictionaries are used to + convert them to integers. + + +NTEventLogHandler +^^^^^^^^^^^^^^^^^ + +The :class:`NTEventLogHandler` class, located in the :mod:`logging.handlers` +module, supports sending logging messages to a local Windows NT, Windows 2000 or +Windows XP event log. Before you can use it, you need Mark Hammond's Win32 +extensions for Python installed. + + +.. class:: NTEventLogHandler(appname[, dllname[, logtype]]) + + Returns a new instance of the :class:`NTEventLogHandler` class. The *appname* is + used to define the application name as it appears in the event log. An + appropriate registry entry is created using this name. 
The *dllname* should give + the fully qualified pathname of a .dll or .exe which contains message + definitions to hold in the log (if not specified, ``'win32service.pyd'`` is used + - this is installed with the Win32 extensions and contains some basic + placeholder message definitions. Note that use of these placeholders will make + your event logs big, as the entire message source is held in the log. If you + want slimmer logs, you have to pass in the name of your own .dll or .exe which + contains the message definitions you want to use in the event log). The + *logtype* is one of ``'Application'``, ``'System'`` or ``'Security'``, and + defaults to ``'Application'``. + + +.. method:: NTEventLogHandler.close() + + At this point, you can remove the application name from the registry as a source + of event log entries. However, if you do this, you will not be able to see the + events as you intended in the Event Log Viewer - it needs to be able to access + the registry to get the .dll name. The current version does not do this (in fact + it doesn't do anything). + + +.. method:: NTEventLogHandler.emit(record) + + Determines the message ID, event category and event type, and then logs the + message in the NT event log. + + +.. method:: NTEventLogHandler.getEventCategory(record) + + Returns the event category for the record. Override this if you want to specify + your own categories. This version returns 0. + + +.. method:: NTEventLogHandler.getEventType(record) + + Returns the event type for the record. Override this if you want to specify your + own types. This version does a mapping using the handler's typemap attribute, + which is set up in :meth:`__init__` to a dictionary which contains mappings for + :const:`DEBUG`, :const:`INFO`, :const:`WARNING`, :const:`ERROR` and + :const:`CRITICAL`. If you are using your own levels, you will either need to + override this method or place a suitable dictionary in the handler's *typemap* + attribute. + + +.. 
method:: NTEventLogHandler.getMessageID(record) + + Returns the message ID for the record. If you are using your own messages, you + could do this by having the *msg* passed to the logger being an ID rather than a + format string. Then, in here, you could use a dictionary lookup to get the + message ID. This version returns 1, which is the base message ID in + :file:`win32service.pyd`. + + +SMTPHandler +^^^^^^^^^^^ + +The :class:`SMTPHandler` class, located in the :mod:`logging.handlers` module, +supports sending logging messages to an email address via SMTP. + + +.. class:: SMTPHandler(mailhost, fromaddr, toaddrs, subject[, credentials]) + + Returns a new instance of the :class:`SMTPHandler` class. The instance is + initialized with the from and to addresses and subject line of the email. The + *toaddrs* should be a list of strings. To specify a non-standard SMTP port, use + the (host, port) tuple format for the *mailhost* argument. If you use a string, + the standard SMTP port is used. If your SMTP server requires authentication, you + can specify a (username, password) tuple for the *credentials* argument. + + .. versionchanged:: 2.6 + *credentials* was added. + + +.. method:: SMTPHandler.emit(record) + + Formats the record and sends it to the specified addressees. + + +.. method:: SMTPHandler.getSubject(record) + + If you want to specify a subject line which is record-dependent, override this + method. + + +MemoryHandler +^^^^^^^^^^^^^ + +The :class:`MemoryHandler` class, located in the :mod:`logging.handlers` module, +supports buffering of logging records in memory, periodically flushing them to a +:dfn:`target` handler. Flushing occurs whenever the buffer is full, or when an +event of a certain severity or greater is seen. + +:class:`MemoryHandler` is a subclass of the more general +:class:`BufferingHandler`, which is an abstract class. This buffers logging +records in memory. 
Whenever a record is added to the buffer, a check is made +by calling :meth:`shouldFlush` to see if the buffer should be flushed. If it +should, then :meth:`flush` is expected to do the flushing. + + +.. class:: BufferingHandler(capacity) + + Initializes the handler with a buffer of the specified capacity. + + +.. method:: BufferingHandler.emit(record) + + Appends the record to the buffer. If :meth:`shouldFlush` returns true, calls + :meth:`flush` to process the buffer. + + +.. method:: BufferingHandler.flush() + + You can override this to implement custom flushing behavior. This version just + empties the buffer. + + +.. method:: BufferingHandler.shouldFlush(record) + + Returns true if the buffer is up to capacity. This method can be overridden to + implement custom flushing strategies. + + +.. class:: MemoryHandler(capacity[, flushLevel [, target]]) + + Returns a new instance of the :class:`MemoryHandler` class. The instance is + initialized with a buffer size of *capacity*. If *flushLevel* is not specified, + :const:`ERROR` is used. If no *target* is specified, the target will need to be + set using :meth:`setTarget` before this handler does anything useful. + + +.. method:: MemoryHandler.close() + + Calls :meth:`flush`, sets the target to :const:`None` and clears the buffer. + + +.. method:: MemoryHandler.flush() + + For a :class:`MemoryHandler`, flushing means just sending the buffered records + to the target, if there is one. Override if you want different behavior. + + +.. method:: MemoryHandler.setTarget(target) + + Sets the target handler for this handler. + + +.. method:: MemoryHandler.shouldFlush(record) + + Checks for buffer full or a record at the *flushLevel* or higher. + + +HTTPHandler +^^^^^^^^^^^ + +The :class:`HTTPHandler` class, located in the :mod:`logging.handlers` module, +supports sending logging messages to a Web server, using either ``GET`` or +``POST`` semantics. + + +.. 
class:: HTTPHandler(host, url[, method]) + + Returns a new instance of the :class:`HTTPHandler` class. The instance is + initialized with a host address, url and HTTP method. The *host* can be of the + form ``host:port``, should you need to use a specific port number. If no + *method* is specified, ``GET`` is used. + + +.. method:: HTTPHandler.emit(record) + + Sends the record to the Web server as an URL-encoded dictionary. + + +Formatter Objects +----------------- + +:class:`Formatter`\ s have the following attributes and methods. They are +responsible for converting a :class:`LogRecord` to (usually) a string which can +be interpreted by either a human or an external system. The base +:class:`Formatter` allows a formatting string to be specified. If none is +supplied, the default value of ``'%(message)s'`` is used. + +A Formatter can be initialized with a format string which makes use of knowledge +of the :class:`LogRecord` attributes - such as the default value mentioned above +making use of the fact that the user's message and arguments are pre-formatted +into a :class:`LogRecord`'s *message* attribute. This format string contains +standard python %-style mapping keys. See section :ref:`string-formatting` +for more information on string formatting. + +Currently, the useful mapping keys in a :class:`LogRecord` are: + ++-------------------------+-----------------------------------------------+ +| Format | Description | ++=========================+===============================================+ +| ``%(name)s`` | Name of the logger (logging channel). | ++-------------------------+-----------------------------------------------+ +| ``%(levelno)s`` | Numeric logging level for the message | +| | (:const:`DEBUG`, :const:`INFO`, | +| | :const:`WARNING`, :const:`ERROR`, | +| | :const:`CRITICAL`). 
| ++-------------------------+-----------------------------------------------+ +| ``%(levelname)s`` | Text logging level for the message | +| | (``'DEBUG'``, ``'INFO'``, ``'WARNING'``, | +| | ``'ERROR'``, ``'CRITICAL'``). | ++-------------------------+-----------------------------------------------+ +| ``%(pathname)s`` | Full pathname of the source file where the | +| | logging call was issued (if available). | ++-------------------------+-----------------------------------------------+ +| ``%(filename)s`` | Filename portion of pathname. | ++-------------------------+-----------------------------------------------+ +| ``%(module)s`` | Module (name portion of filename). | ++-------------------------+-----------------------------------------------+ +| ``%(funcName)s`` | Name of function containing the logging call. | ++-------------------------+-----------------------------------------------+ +| ``%(lineno)d`` | Source line number where the logging call was | +| | issued (if available). | ++-------------------------+-----------------------------------------------+ +| ``%(created)f`` | Time when the :class:`LogRecord` was created | +| | (as returned by :func:`time.time`). | ++-------------------------+-----------------------------------------------+ +| ``%(relativeCreated)d`` | Time in milliseconds when the LogRecord was | +| | created, relative to the time the logging | +| | module was loaded. | ++-------------------------+-----------------------------------------------+ +| ``%(asctime)s`` | Human-readable time when the | +| | :class:`LogRecord` was created. By default | +| | this is of the form "2003-07-08 16:49:45,896" | +| | (the numbers after the comma are millisecond | +| | portion of the time). | ++-------------------------+-----------------------------------------------+ +| ``%(msecs)d`` | Millisecond portion of the time when the | +| | :class:`LogRecord` was created. 
| ++-------------------------+-----------------------------------------------+ +| ``%(thread)d`` | Thread ID (if available). | ++-------------------------+-----------------------------------------------+ +| ``%(threadName)s`` | Thread name (if available). | ++-------------------------+-----------------------------------------------+ +| ``%(process)d`` | Process ID (if available). | ++-------------------------+-----------------------------------------------+ +| ``%(message)s`` | The logged message, computed as ``msg % | +| | args``. | ++-------------------------+-----------------------------------------------+ + +.. versionchanged:: 2.5 + *funcName* was added. + + +.. class:: Formatter([fmt[, datefmt]]) + + Returns a new instance of the :class:`Formatter` class. The instance is + initialized with a format string for the message as a whole, as well as a format + string for the date/time portion of a message. If no *fmt* is specified, + ``'%(message)s'`` is used. If no *datefmt* is specified, the ISO8601 date format + is used. + + +.. method:: Formatter.format(record) + + The record's attribute dictionary is used as the operand to a string formatting + operation. Returns the resulting string. Before formatting the dictionary, a + couple of preparatory steps are carried out. The *message* attribute of the + record is computed using *msg* % *args*. If the formatting string contains + ``'(asctime)'``, :meth:`formatTime` is called to format the event time. If there + is exception information, it is formatted using :meth:`formatException` and + appended to the message. + + +.. method:: Formatter.formatTime(record[, datefmt]) + + This method should be called from :meth:`format` by a formatter which wants to + make use of a formatted time. 
This method can be overridden in formatters to + provide for any specific requirement, but the basic behavior is as follows: if + *datefmt* (a string) is specified, it is used with :func:`time.strftime` to + format the creation time of the record. Otherwise, the ISO8601 format is used. + The resulting string is returned. + + +.. method:: Formatter.formatException(exc_info) + + Formats the specified exception information (a standard exception tuple as + returned by :func:`sys.exc_info`) as a string. This default implementation just + uses :func:`traceback.print_exception`. The resulting string is returned. + + +Filter Objects +-------------- + +:class:`Filter`\ s can be used by :class:`Handler`\ s and :class:`Logger`\ s for +more sophisticated filtering than is provided by levels. The base filter class +only allows events which are below a certain point in the logger hierarchy. For +example, a filter initialized with "A.B" will allow events logged by loggers +"A.B", "A.B.C", "A.B.C.D", "A.B.D" etc. but not "A.BB", "B.A.B" etc. If +initialized with the empty string, all events are passed. + + +.. class:: Filter([name]) + + Returns an instance of the :class:`Filter` class. If *name* is specified, it + names a logger which, together with its children, will have its events allowed + through the filter. If no name is specified, allows every event. + + +.. method:: Filter.filter(record) + + Is the specified record to be logged? Returns zero for no, nonzero for yes. If + deemed appropriate, the record may be modified in-place by this method. + + +LogRecord Objects +----------------- + +:class:`LogRecord` instances are created every time something is logged. They +contain all the information pertinent to the event being logged. The main +information passed in is in msg and args, which are combined using msg % args to +create the message field of the record. 
The record also includes information +such as when the record was created, the source line where the logging call was +made, and any exception information to be logged. + + +.. class:: LogRecord(name, lvl, pathname, lineno, msg, args, exc_info [, func]) + + Returns an instance of :class:`LogRecord` initialized with interesting + information. The *name* is the logger name; *lvl* is the numeric level; + *pathname* is the absolute pathname of the source file in which the logging + call was made; *lineno* is the line number in that file where the logging + call is found; *msg* is the user-supplied message (a format string); *args* + is the tuple which, together with *msg*, makes up the user message; and + *exc_info* is the exception tuple obtained by calling :func:`sys.exc_info` + (or :const:`None`, if no exception information is available). The *func* is + the name of the function from which the logging call was made. If not + specified, it defaults to ``None``. + + .. versionchanged:: 2.5 + *func* was added. + + +.. method:: LogRecord.getMessage() + + Returns the message for this :class:`LogRecord` instance after merging any + user-supplied arguments with the message. + + +Thread Safety +------------- + +The logging module is intended to be thread-safe without any special work +needing to be done by its clients. It achieves this through using threading +locks; there is one lock to serialize access to the module's shared data, and +each handler also creates a lock to serialize access to its underlying I/O. + + +Configuration +------------- + + +.. _logging-config-api: + +Configuration functions +^^^^^^^^^^^^^^^^^^^^^^^ + +.. % + +The following functions configure the logging module. They are located in the +:mod:`logging.config` module. 
Their use is optional --- you can configure the +logging module using these functions or by making calls to the main API (defined +in :mod:`logging` itself) and defining handlers which are declared either in +:mod:`logging` or :mod:`logging.handlers`. + + +.. function:: fileConfig(fname[, defaults]) + + Reads the logging configuration from a ConfigParser-format file named *fname*. + This function can be called several times from an application, allowing an end + user the ability to select from various pre-canned configurations (if the + developer provides a mechanism to present the choices and load the chosen + configuration). Defaults to be passed to ConfigParser can be specified in the + *defaults* argument. + + +.. function:: listen([port]) + + Starts up a socket server on the specified port, and listens for new + configurations. If no port is specified, the module's default + :const:`DEFAULT_LOGGING_CONFIG_PORT` is used. Logging configurations will be + sent as a file suitable for processing by :func:`fileConfig`. Returns a + :class:`Thread` instance on which you can call :meth:`start` to start the + server, and which you can :meth:`join` when appropriate. To stop the server, + call :func:`stopListening`. To send a configuration to the socket, read in the + configuration file and send it to the socket as a string of bytes preceded by a + four-byte length packed in binary using struct.\ ``pack('>L', n)``. + + +.. function:: stopListening() + + Stops the listening server which was created with a call to :func:`listen`. This + is typically called before calling :meth:`join` on the return value from + :func:`listen`. + + +.. _logging-config-fileformat: + +Configuration file format +^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. % + +The configuration file format understood by :func:`fileConfig` is based on +ConfigParser functionality. 
The file must contain sections called ``[loggers]``, +``[handlers]`` and ``[formatters]`` which identify by name the entities of each +type which are defined in the file. For each such entity, there is a separate +section which identifies how that entity is configured. Thus, for a logger named +``log01`` in the ``[loggers]`` section, the relevant configuration details are +held in a section ``[logger_log01]``. Similarly, a handler called ``hand01`` in +the ``[handlers]`` section will have its configuration held in a section called +``[handler_hand01]``, while a formatter called ``form01`` in the +``[formatters]`` section will have its configuration specified in a section +called ``[formatter_form01]``. The root logger configuration must be specified +in a section called ``[logger_root]``. + +Examples of these sections in the file are given below. :: + + [loggers] + keys=root,log02,log03,log04,log05,log06,log07 + + [handlers] + keys=hand01,hand02,hand03,hand04,hand05,hand06,hand07,hand08,hand09 + + [formatters] + keys=form01,form02,form03,form04,form05,form06,form07,form08,form09 + +The root logger must specify a level and a list of handlers. An example of a +root logger section is given below. :: + + [logger_root] + level=NOTSET + handlers=hand01 + +The ``level`` entry can be one of ``DEBUG, INFO, WARNING, ERROR, CRITICAL`` or +``NOTSET``. For the root logger only, ``NOTSET`` means that all messages will be +logged. Level values are :func:`eval`\ uated in the context of the ``logging`` +package's namespace. + +The ``handlers`` entry is a comma-separated list of handler names. These names +must appear in the ``[handlers]`` section and have corresponding sections in +the configuration file. + +For loggers other than the root logger, some additional information is required. +This is illustrated by the following example. 
:: + + [logger_parser] + level=DEBUG + handlers=hand01 + propagate=1 + qualname=compiler.parser + +The ``level`` and ``handlers`` entries are interpreted as for the root logger, +except that if a non-root logger's level is specified as ``NOTSET``, the system +consults loggers higher up the hierarchy to determine the effective level of the +logger. The ``propagate`` entry is set to 1 to indicate that messages must +propagate to handlers higher up the logger hierarchy from this logger, or 0 to +indicate that messages are **not** propagated to handlers up the hierarchy. The +``qualname`` entry is the hierarchical channel name of the logger, that is to +say the name used by the application to get the logger. + +Sections which specify handler configuration are exemplified by the following. +:: + + [handler_hand01] + class=StreamHandler + level=NOTSET + formatter=form01 + args=(sys.stdout,) + +The ``class`` entry indicates the handler's class (as determined by :func:`eval` +in the ``logging`` package's namespace). The ``level`` is interpreted as for +loggers, and ``NOTSET`` is taken to mean "log everything". + +The ``formatter`` entry indicates the key name of the formatter for this +handler. If blank, a default formatter (``logging._defaultFormatter``) is used. +If a name is specified, it must appear in the ``[formatters]`` section and have +a corresponding section in the configuration file. + +The ``args`` entry, when :func:`eval`\ uated in the context of the ``logging`` +package's namespace, is the list of arguments to the constructor for the handler +class. Refer to the constructors for the relevant handlers, or to the examples +below, to see how typical entries are constructed. 
:: + + [handler_hand02] + class=FileHandler + level=DEBUG + formatter=form02 + args=('python.log', 'w') + + [handler_hand03] + class=handlers.SocketHandler + level=INFO + formatter=form03 + args=('localhost', handlers.DEFAULT_TCP_LOGGING_PORT) + + [handler_hand04] + class=handlers.DatagramHandler + level=WARN + formatter=form04 + args=('localhost', handlers.DEFAULT_UDP_LOGGING_PORT) + + [handler_hand05] + class=handlers.SysLogHandler + level=ERROR + formatter=form05 + args=(('localhost', handlers.SYSLOG_UDP_PORT), handlers.SysLogHandler.LOG_USER) + + [handler_hand06] + class=handlers.NTEventLogHandler + level=CRITICAL + formatter=form06 + args=('Python Application', '', 'Application') + + [handler_hand07] + class=handlers.SMTPHandler + level=WARN + formatter=form07 + args=('localhost', 'from@abc', ['user1@abc', 'user2@xyz'], 'Logger Subject') + + [handler_hand08] + class=handlers.MemoryHandler + level=NOTSET + formatter=form08 + target= + args=(10, ERROR) + + [handler_hand09] + class=handlers.HTTPHandler + level=NOTSET + formatter=form09 + args=('localhost:9022', '/log', 'GET') + +Sections which specify formatter configuration are typified by the following. :: + + [formatter_form01] + format=F1 %(asctime)s %(levelname)s %(message)s + datefmt= + class=logging.Formatter + +The ``format`` entry is the overall format string, and the ``datefmt`` entry is +the :func:`strftime`\ -compatible date/time format string. If empty, the package +substitutes ISO8601 format date/times, which is almost equivalent to specifying +the date format string ``"%Y-%m-%d %H:%M:%S"``. The ISO8601 format also specifies milliseconds, which +are appended to the result of using the above format string, with a comma +separator. An example time in ISO8601 format is ``2003-01-23 00:29:50,411``. + +The ``class`` entry is optional. It indicates the name of the formatter's class +(as a dotted module and class name). This option is useful for instantiating a +:class:`Formatter` subclass. 
Subclasses of :class:`Formatter` can present +exception tracebacks in an expanded or condensed format. + diff --git a/Doc/library/mac.rst b/Doc/library/mac.rst new file mode 100644 index 0000000..791eb81 --- /dev/null +++ b/Doc/library/mac.rst @@ -0,0 +1,23 @@ +.. _mac-specific-services: + +************************* +MacOS X specific services +************************* + +This chapter describes modules that are only available on the Mac OS X platform. + +See the chapters :ref:`mac-scripting` and :ref:`undoc-mac-modules` for more +modules, and the HOWTO :ref:`using-on-mac` for a general introduction to +Mac-specific Python programming. + + +.. toctree:: + + ic.rst + macos.rst + macostools.rst + easydialogs.rst + framework.rst + autogil.rst + carbon.rst + colorpicker.rst diff --git a/Doc/library/macos.rst b/Doc/library/macos.rst new file mode 100644 index 0000000..543f868 --- /dev/null +++ b/Doc/library/macos.rst @@ -0,0 +1,95 @@ + +:mod:`MacOS` --- Access to Mac OS interpreter features +====================================================== + +.. module:: MacOS + :platform: Mac + :synopsis: Access to Mac OS-specific interpreter features. + + +This module provides access to MacOS specific functionality in the Python +interpreter, such as how the interpreter eventloop functions and the like. Use +with care. + +Note the capitalization of the module name; this is a historical artifact. + + +.. data:: runtimemodel + + Always ``'macho'``, from Python 2.4 on. In earlier versions of Python the value + could also be ``'ppc'`` for the classic Mac OS 8 runtime model or ``'carbon'`` + for the Mac OS 9 runtime model. + + +.. data:: linkmodel + + The way the interpreter has been linked. As extension modules may be + incompatible between linking models, packages could use this information to give + more decent error messages. 
The value is one of ``'static'`` for a statically + linked Python, ``'framework'`` for Python in a Mac OS X framework, ``'shared'`` + for Python in a standard Unix shared library. Older Pythons could also have the + value ``'cfm'`` for Mac OS 9-compatible Python. + + +.. exception:: Error + + .. index:: module: macerrors + + This exception is raised on MacOS generated errors, either from functions in + this module or from other mac-specific modules like the toolbox interfaces. The + arguments are the integer error code (the :cdata:`OSErr` value) and a textual + description of the error code. Symbolic names for all known error codes are + defined in the standard module :mod:`macerrors`. + + +.. function:: GetErrorString(errno) + + Return the textual description of MacOS error code *errno*. + + +.. function:: DebugStr(message [, object]) + + On Mac OS X the string is simply printed to stderr (on older Mac OS systems more + elaborate functionality was available), but it provides a convenient location to + attach a breakpoint in a low-level debugger like :program:`gdb`. + + +.. function:: SysBeep() + + Ring the bell. + + +.. function:: GetTicks() + + Get the number of clock ticks (1/60th of a second) since system boot. + + +.. function:: GetCreatorAndType(file) + + Return the file creator and file type as two four-character strings. The *file* + parameter can be a pathname or an ``FSSpec`` or ``FSRef`` object. + + +.. function:: SetCreatorAndType(file, creator, type) + + Set the file creator and file type. The *file* parameter can be a pathname or an + ``FSSpec`` or ``FSRef`` object. *creator* and *type* must be four character + strings. + + +.. function:: openrf(name [, mode]) + + Open the resource fork of a file. Arguments are the same as for the built-in + function :func:`open`. The object returned has file-like semantics, but it is + not a Python file object, so there may be subtle differences. + + +.. 
function:: WMAvailable() + + Checks whether the current process has access to the window manager. The function + returns ``False`` if the window manager is not available, for instance when + running on Mac OS X Server or when logged in via ssh, or when the current + interpreter is not running from a full-blown application bundle. A script runs + from an application bundle either when it has been started with + :program:`pythonw` instead of :program:`python` or when running as an applet. + diff --git a/Doc/library/macosa.rst b/Doc/library/macosa.rst new file mode 100644 index 0000000..67475ed --- /dev/null +++ b/Doc/library/macosa.rst @@ -0,0 +1,92 @@ + +.. _mac-scripting: + +********************* +MacPython OSA Modules +********************* + +This chapter describes the current implementation of the Open Scripting +Architecture (OSA, also commonly referred to as AppleScript) for Python, allowing +you to control scriptable applications from your Python program, and with a +fairly pythonic interface. Development on this set of modules has stopped, and a +replacement is expected for Python 2.5. + +For a description of the various components of AppleScript and OSA, and to get +an understanding of the architecture and terminology, you should read Apple's +documentation. The "AppleScript Language Guide" explains the conceptual model +and the terminology, and documents the standard suite. The "Open Scripting +Architecture" document explains how to use OSA from an application programmer's +point of view. In the Apple Help Viewer these books are located in the Developer +Documentation, Core Technologies section. 
+ +As an example of scripting an application, the following piece of AppleScript +will get the name of the frontmost :program:`Finder` window and print it:: + + tell application "Finder" + get name of window 1 + end tell + +In Python, the following code fragment will do the same:: + + import Finder + + f = Finder.Finder() + print f.get(f.window(1).name) + +As distributed the Python library includes packages that implement the standard +suites, plus packages that interface to a small number of common applications. + +To send AppleEvents to an application you must first create the Python package +interfacing to the terminology of the application (what :program:`Script Editor` +calls the "Dictionary"). This can be done from within the :program:`PythonIDE` +or by running the :file:`gensuitemodule.py` module as a standalone program from +the command line. + +The generated output is a package with a number of modules, one for every suite +used in the program plus an :mod:`__init__` module to glue it all together. The +Python inheritance graph follows the AppleScript inheritance graph, so if a +program's dictionary specifies that it includes support for the Standard Suite, +but extends one or two verbs with extra arguments then the output suite will +contain a module :mod:`Standard_Suite` that imports and re-exports everything +from :mod:`StdSuites.Standard_Suite` but overrides the methods that have extra +functionality. The output of :mod:`gensuitemodule` is pretty readable, and +contains the documentation that was in the original AppleScript dictionary in +Python docstrings, so reading it is a good source of documentation. + +The output package implements a main class with the same name as the package +which contains all the AppleScript verbs as methods, with the direct object as +the first argument and all optional parameters as keyword arguments. AppleScript +classes are also implemented as Python classes, as are comparisons and all the +other thingies. 
+ +The main Python class implementing the verbs also allows access to the +properties and elements declared in the AppleScript class "application". In the +current release that is as far as the object orientation goes, so in the example +above we need to use ``f.get(f.window(1).name)`` instead of the more Pythonic +``f.window(1).name.get()``. + +If an AppleScript identifier is not a Python identifier the name is mangled +according to a small number of rules: + +* spaces are replaced with underscores + +* other non-alphanumeric characters are replaced with ``_xx_`` where ``xx`` is + the hexadecimal character value + +* any Python reserved word gets an underscore appended + +Python also has support for creating scriptable applications in Python, but The +following modules are relevant to MacPython AppleScript support: + +.. toctree:: + + gensuitemodule.rst + aetools.rst + aepack.rst + aetypes.rst + miniaeframe.rst + + +In addition, support modules have been pre-generated for :mod:`Finder`, +:mod:`Terminal`, :mod:`Explorer`, :mod:`Netscape`, :mod:`CodeWarrior`, +:mod:`SystemEvents` and :mod:`StdSuites`. diff --git a/Doc/library/macostools.rst b/Doc/library/macostools.rst new file mode 100644 index 0000000..275100e --- /dev/null +++ b/Doc/library/macostools.rst @@ -0,0 +1,115 @@ + +:mod:`macostools` --- Convenience routines for file manipulation +================================================================ + +.. module:: macostools + :platform: Mac + :synopsis: Convenience routines for file manipulation. + + +This module contains some convenience routines for file-manipulation on the +Macintosh. All file parameters can be specified as pathnames, :class:`FSRef` or +:class:`FSSpec` objects. This module expects a filesystem which supports forked +files, so it should not be used on UFS partitions. + +The :mod:`macostools` module defines the following functions: + + +.. function:: copy(src, dst[, createpath[, copytimes]]) + + Copy file *src* to *dst*. 
If *createpath* is non-zero the folders leading to + *dst* are created if necessary. The function copies the data and resource forks and + some finder information (creator, type, flags) and optionally the creation, + modification and backup times (default is to copy them). Custom icons, comments + and icon position are not copied. + + +.. function:: copytree(src, dst) + + Recursively copy a file tree from *src* to *dst*, creating folders as needed. + *src* and *dst* should be specified as pathnames. + + +.. function:: mkalias(src, dst) + + Create a finder alias *dst* pointing to *src*. + + +.. function:: touched(dst) + + Tell the finder that some bits of finder-information such as creator or type for + file *dst* have changed. The file can be specified by pathname or fsspec. This + call should tell the finder to redraw the file's icon. + + .. deprecated:: 2.6 + The function is a no-op on OS X. + + +.. data:: BUFSIZ + + The buffer size for ``copy``, default 1 megabyte. + +Note that the process of creating finder aliases is not specified in the Apple +documentation. Hence, aliases created with :func:`mkalias` could conceivably +have incompatible behaviour in some cases. + + +:mod:`findertools` --- The :program:`finder`'s Apple Events interface +===================================================================== + +.. module:: findertools + :platform: Mac + :synopsis: Wrappers around the finder's Apple Events interface. + + +.. index:: single: AppleEvents + +This module contains routines that give Python programs access to some +functionality provided by the finder. They are implemented as wrappers around +the AppleEvent interface to the finder. + +All file and folder parameters can be specified either as full pathnames, or as +:class:`FSRef` or :class:`FSSpec` objects. + +The :mod:`findertools` module defines the following functions: + + +.. function:: launch(file) + + Tell the finder to launch *file*. 
What launching means depends on the file: + applications are started, folders are opened and documents are opened in the + correct application. + + +.. function:: Print(file) + + Tell the finder to print a file. The behaviour is identical to selecting the + file and using the print command in the finder's file menu. + + +.. function:: copy(file, destdir) + + Tell the finder to copy a file or folder *file* to folder *destdir*. The + function returns an :class:`Alias` object pointing to the new file. + + +.. function:: move(file, destdir) + + Tell the finder to move a file or folder *file* to folder *destdir*. The + function returns an :class:`Alias` object pointing to the new file. + + +.. function:: sleep() + + Tell the finder to put the Macintosh to sleep, if your machine supports it. + + +.. function:: restart() + + Tell the finder to perform an orderly restart of the machine. + + +.. function:: shutdown() + + Tell the finder to perform an orderly shutdown of the machine. + diff --git a/Doc/library/macpath.rst b/Doc/library/macpath.rst new file mode 100644 index 0000000..66c54e5 --- /dev/null +++ b/Doc/library/macpath.rst @@ -0,0 +1,17 @@ + +:mod:`macpath` --- MacOS 9 path manipulation functions +====================================================== + +.. module:: macpath + :synopsis: MacOS 9 path manipulation functions. + + +This module is the Mac OS 9 (and earlier) implementation of the :mod:`os.path` +module. It can be used to manipulate old-style Macintosh pathnames on Mac OS X +(or any other platform). + +The following functions are available in this module: :func:`normcase`, +:func:`normpath`, :func:`isabs`, :func:`join`, :func:`split`, :func:`isdir`, +:func:`isfile`, :func:`walk`, :func:`exists`. For other functions available in +:mod:`os.path` dummy counterparts are available. 
+ diff --git a/Doc/library/mailbox.rst b/Doc/library/mailbox.rst new file mode 100644 index 0000000..ce8dc59 --- /dev/null +++ b/Doc/library/mailbox.rst @@ -0,0 +1,1679 @@ + +:mod:`mailbox` --- Manipulate mailboxes in various formats +========================================================== + +.. module:: mailbox + :synopsis: Manipulate mailboxes in various formats +.. moduleauthor:: Gregory K. Johnson <gkj@gregorykjohnson.com> +.. sectionauthor:: Gregory K. Johnson <gkj@gregorykjohnson.com> + + +This module defines two classes, :class:`Mailbox` and :class:`Message`, for +accessing and manipulating on-disk mailboxes and the messages they contain. +:class:`Mailbox` offers a dictionary-like mapping from keys to messages. +:class:`Message` extends the :mod:`email.Message` module's :class:`Message` +class with format-specific state and behavior. Supported mailbox formats are +Maildir, mbox, MH, Babyl, and MMDF. + + +.. seealso:: + + Module :mod:`email` + Represent and manipulate messages. + + +.. _mailbox-objects: + +:class:`Mailbox` objects +------------------------ + + +.. class:: Mailbox + + A mailbox, which may be inspected and modified. + +The :class:`Mailbox` class defines an interface and is not intended to be +instantiated. Instead, format-specific subclasses should inherit from +:class:`Mailbox` and your code should instantiate a particular subclass. + +The :class:`Mailbox` interface is dictionary-like, with small keys corresponding +to messages. Keys are issued by the :class:`Mailbox` instance with which they +will be used and are only meaningful to that :class:`Mailbox` instance. A key +continues to identify a message even if the corresponding message is modified, +such as by replacing it with another message. + +Messages may be added to a :class:`Mailbox` instance using the set-like method +:meth:`add` and removed using a ``del`` statement or the set-like methods +:meth:`remove` and :meth:`discard`. 
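The set-like calls described above can be sketched as follows. This is a minimal illustration, not part of the original commit: it assumes a current Python 3 ``mailbox`` module (the text above was written for the Python 2 era), and the file name ``demo.mbox`` and the message text are made up. The mailbox lives in a private temporary directory, so locking is left out here.

```python
import mailbox
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # "demo.mbox" is a hypothetical file name used only for this sketch.
    box = mailbox.mbox(os.path.join(tmp, "demo.mbox"))

    # add() stores a copy of the message and returns the key issued for it.
    key = box.add("From: alice@example.com\nSubject: hello\n\nJust a test.\n")
    count_after_add = len(box)

    # A del statement removes the message that the key refers to.
    del box[key]

    # discard() stays silent when the key no longer exists ...
    box.discard(key)

    # ... whereas remove() raises KeyError for the same stale key.
    try:
        box.remove(key)
        remove_raised = False
    except KeyError:
        remove_raised = True

    count_after_delete = len(box)
    box.close()
```

Using ``discard`` rather than ``remove`` is mostly a matter of whether a missing key should count as an error in the calling code.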
+ +:class:`Mailbox` interface semantics differ from dictionary semantics in some +noteworthy ways. Each time a message is requested, a new representation +(typically a :class:`Message` instance) is generated based upon the current +state of the mailbox. Similarly, when a message is added to a :class:`Mailbox` +instance, the provided message representation's contents are copied. In neither +case is a reference to the message representation kept by the :class:`Mailbox` +instance. + +The default :class:`Mailbox` iterator iterates over message representations, not +keys as the default dictionary iterator does. Moreover, modification of a +mailbox during iteration is safe and well-defined. Messages added to the mailbox +after an iterator is created will not be seen by the iterator. Messages removed +from the mailbox before the iterator yields them will be silently skipped, +though using a key from an iterator may result in a :exc:`KeyError` exception if +the corresponding message is subsequently removed. + +.. warning:: + + Be very cautious when modifying mailboxes that might be simultaneously changed + by some other process. The safest mailbox format to use for such tasks is + Maildir; try to avoid using single-file formats such as mbox for concurrent + writing. If you're modifying a mailbox, you *must* lock it by calling the + :meth:`lock` and :meth:`unlock` methods *before* reading any messages in the + file or making any changes by adding or deleting a message. Failing to lock the + mailbox runs the risk of losing messages or corrupting the entire mailbox. + +:class:`Mailbox` instances have the following methods: + + +.. method:: Mailbox.add(message) + + Add *message* to the mailbox and return the key that has been assigned to it. + + Parameter *message* may be a :class:`Message` instance, an + :class:`email.Message.Message` instance, a string, or a file-like object (which + should be open in text mode). 
If *message* is an instance of the appropriate + format-specific :class:`Message` subclass (e.g., if it's an :class:`mboxMessage` + instance and this is an :class:`mbox` instance), its format-specific information + is used. Otherwise, reasonable defaults for format-specific information are + used. + + +.. method:: Mailbox.remove(key) + Mailbox.__delitem__(key) + Mailbox.discard(key) + + Delete the message corresponding to *key* from the mailbox. + + If no such message exists, a :exc:`KeyError` exception is raised if the method + was called as :meth:`remove` or :meth:`__delitem__` but no exception is raised + if the method was called as :meth:`discard`. The behavior of :meth:`discard` may + be preferred if the underlying mailbox format supports concurrent modification + by other processes. + + +.. method:: Mailbox.__setitem__(key, message) + + Replace the message corresponding to *key* with *message*. Raise a + :exc:`KeyError` exception if no message already corresponds to *key*. + + As with :meth:`add`, parameter *message* may be a :class:`Message` instance, an + :class:`email.Message.Message` instance, a string, or a file-like object (which + should be open in text mode). If *message* is an instance of the appropriate + format-specific :class:`Message` subclass (e.g., if it's an :class:`mboxMessage` + instance and this is an :class:`mbox` instance), its format-specific information + is used. Otherwise, the format-specific information of the message that + currently corresponds to *key* is left unchanged. + + +.. method:: Mailbox.iterkeys() + Mailbox.keys() + + Return an iterator over all keys if called as :meth:`iterkeys` or return a list + of keys if called as :meth:`keys`. + + +.. method:: Mailbox.itervalues() + Mailbox.__iter__() + Mailbox.values() + + Return an iterator over representations of all messages if called as + :meth:`itervalues` or :meth:`__iter__` or return a list of such representations + if called as :meth:`values`. 
The messages are represented as instances of the + appropriate format-specific :class:`Message` subclass unless a custom message + factory was specified when the :class:`Mailbox` instance was initialized. + + .. note:: + + The behavior of :meth:`__iter__` is unlike that of dictionaries, which iterate + over keys. + + +.. method:: Mailbox.iteritems() + Mailbox.items() + + Return an iterator over (*key*, *message*) pairs, where *key* is a key and + *message* is a message representation, if called as :meth:`iteritems` or return + a list of such pairs if called as :meth:`items`. The messages are represented as + instances of the appropriate format-specific :class:`Message` subclass unless a + custom message factory was specified when the :class:`Mailbox` instance was + initialized. + + +.. method:: Mailbox.get(key[, default=None]) + Mailbox.__getitem__(key) + + Return a representation of the message corresponding to *key*. If no such + message exists, *default* is returned if the method was called as :meth:`get` + and a :exc:`KeyError` exception is raised if the method was called as + :meth:`__getitem__`. The message is represented as an instance of the + appropriate format-specific :class:`Message` subclass unless a custom message + factory was specified when the :class:`Mailbox` instance was initialized. + + +.. method:: Mailbox.get_message(key) + + Return a representation of the message corresponding to *key* as an instance of + the appropriate format-specific :class:`Message` subclass, or raise a + :exc:`KeyError` exception if no such message exists. + + +.. method:: Mailbox.get_string(key) + + Return a string representation of the message corresponding to *key*, or raise a + :exc:`KeyError` exception if no such message exists. + + +.. method:: Mailbox.get_file(key) + + Return a file-like representation of the message corresponding to *key*, or + raise a :exc:`KeyError` exception if no such message exists. The file-like + object behaves as if open in binary mode. 
This file should be closed once it is + no longer needed. + + .. note:: + + Unlike other representations of messages, file-like representations are not + necessarily independent of the :class:`Mailbox` instance that created them or of + the underlying mailbox. More specific documentation is provided by each + subclass. + + +.. method:: Mailbox.has_key(key) + Mailbox.__contains__(key) + + Return ``True`` if *key* corresponds to a message, ``False`` otherwise. + + +.. method:: Mailbox.__len__() + + Return a count of messages in the mailbox. + + +.. method:: Mailbox.clear() + + Delete all messages from the mailbox. + + +.. method:: Mailbox.pop(key[, default]) + + Return a representation of the message corresponding to *key* and delete the + message. If no such message exists, return *default* if it was supplied or else + raise a :exc:`KeyError` exception. The message is represented as an instance of + the appropriate format-specific :class:`Message` subclass unless a custom + message factory was specified when the :class:`Mailbox` instance was + initialized. + + +.. method:: Mailbox.popitem() + + Return an arbitrary (*key*, *message*) pair, where *key* is a key and *message* + is a message representation, and delete the corresponding message. If the + mailbox is empty, raise a :exc:`KeyError` exception. The message is represented + as an instance of the appropriate format-specific :class:`Message` subclass + unless a custom message factory was specified when the :class:`Mailbox` instance + was initialized. + + +.. method:: Mailbox.update(arg) + + Parameter *arg* should be a *key*-to-*message* mapping or an iterable of (*key*, + *message*) pairs. Updates the mailbox so that, for each given *key* and + *message*, the message corresponding to *key* is set to *message* as if by using + :meth:`__setitem__`. 
As with :meth:`__setitem__`, each *key* must already + correspond to a message in the mailbox or else a :exc:`KeyError` exception will + be raised, so in general it is incorrect for *arg* to be a :class:`Mailbox` + instance. + + .. note:: + + Unlike with dictionaries, keyword arguments are not supported. + + +.. method:: Mailbox.flush() + + Write any pending changes to the filesystem. For some :class:`Mailbox` + subclasses, changes are always written immediately and :meth:`flush` does + nothing, but you should still make a habit of calling this method. + + +.. method:: Mailbox.lock() + + Acquire an exclusive advisory lock on the mailbox so that other processes know + not to modify it. An :exc:`ExternalClashError` is raised if the lock is not + available. The particular locking mechanisms used depend upon the mailbox + format. You should *always* lock the mailbox before making any modifications + to its contents. + + +.. method:: Mailbox.unlock() + + Release the lock on the mailbox, if any. + + +.. method:: Mailbox.close() + + Flush the mailbox, unlock it if necessary, and close any open files. For some + :class:`Mailbox` subclasses, this method does nothing. + + +.. _mailbox-maildir: + +:class:`Maildir` +^^^^^^^^^^^^^^^^ + + +.. class:: Maildir(dirname[, factory=rfc822.Message[, create=True]]) + + A subclass of :class:`Mailbox` for mailboxes in Maildir format. Parameter + *factory* is a callable object that accepts a file-like message representation + (which behaves as if opened in binary mode) and returns a custom representation. + If *factory* is ``None``, :class:`MaildirMessage` is used as the default message + representation. If *create* is ``True``, the mailbox is created if it does not + exist. + + It is for historical reasons that *factory* defaults to :class:`rfc822.Message` + and that *dirname* is named as such rather than *path*. 
For a :class:`Maildir` + instance that behaves like instances of other :class:`Mailbox` subclasses, set + *factory* to ``None``. + +Maildir is a directory-based mailbox format invented for the qmail mail transfer +agent and now widely supported by other programs. Messages in a Maildir mailbox +are stored in separate files within a common directory structure. This design +allows Maildir mailboxes to be accessed and modified by multiple unrelated +programs without data corruption, so file locking is unnecessary. + +Maildir mailboxes contain three subdirectories, namely: :file:`tmp`, +:file:`new`, and :file:`cur`. Messages are created momentarily in the +:file:`tmp` subdirectory and then moved to the :file:`new` subdirectory to +finalize delivery. A mail user agent may subsequently move the message to the +:file:`cur` subdirectory and store information about the state of the message in +a special "info" section appended to its file name. + +Folders of the style introduced by the Courier mail transfer agent are also +supported. Any subdirectory of the main mailbox is considered a folder if +``'.'`` is the first character in its name. Folder names are represented by +:class:`Maildir` without the leading ``'.'``. Each folder is itself a Maildir +mailbox but should not contain other folders. Instead, a logical nesting is +indicated using ``'.'`` to delimit levels, e.g., "Archived.2005.07". + +.. note:: + + The Maildir specification requires the use of a colon (``':'``) in certain + message file names. However, some operating systems do not permit this character + in file names. If you wish to use a Maildir-like format on such an operating + system, you should specify another character to use instead. The exclamation + point (``'!'``) is a popular choice. For example:: + + import mailbox + mailbox.Maildir.colon = '!' + + The :attr:`colon` attribute may also be set on a per-instance basis. 
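A per-instance setting might look like the sketch below. It is only an illustration, written against a current Python 3 ``mailbox`` module rather than the version this text documents; the directory name ``Maildir`` and the message contents are invented for the example. The message is placed in :file:`cur` with the "seen" flag so that an "info" section, and therefore the separator character, actually appears in the file name.

```python
import mailbox
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # The mailbox directory must not exist yet so that create=True builds
    # the tmp/new/cur layout; the name "Maildir" is arbitrary.
    md = mailbox.Maildir(os.path.join(tmp, "Maildir"), factory=None, create=True)

    # Override the separator for this instance only; the class attribute
    # mailbox.Maildir.colon is left untouched.
    md.colon = "!"

    msg = mailbox.MaildirMessage("Subject: flags\n\nbody\n")
    msg.set_subdir("cur")   # only messages in cur carry an "info" section
    msg.set_flags("S")      # "S" marks the message as seen
    key = md.add(msg)

    # The stored file name ends in "!2,S" instead of ":2,S".
    stored_names = os.listdir(os.path.join(tmp, "Maildir", "cur"))

    # The same instance can still parse the flags back from the file name.
    flags_read_back = md.get_message(key).get_flags()
    md.close()
```

Any other program that reads the same directory has to agree on the replacement character, since a standard Maildir reader expects a colon.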
+ +:class:`Maildir` instances have all of the methods of :class:`Mailbox` in +addition to the following: + + +.. method:: Maildir.list_folders() + + Return a list of the names of all folders. + + +.. method:: Maildir.get_folder(folder) + + Return a :class:`Maildir` instance representing the folder whose name is + *folder*. A :exc:`NoSuchMailboxError` exception is raised if the folder does not + exist. + + +.. method:: Maildir.add_folder(folder) + + Create a folder whose name is *folder* and return a :class:`Maildir` instance + representing it. + + +.. method:: Maildir.remove_folder(folder) + + Delete the folder whose name is *folder*. If the folder contains any messages, a + :exc:`NotEmptyError` exception will be raised and the folder will not be + deleted. + + +.. method:: Maildir.clean() + + Delete temporary files from the mailbox that have not been accessed in the last + 36 hours. The Maildir specification says that mail-reading programs should do + this occasionally. + +Some :class:`Mailbox` methods implemented by :class:`Maildir` deserve special +remarks: + + +.. method:: Maildir.add(message) + Maildir.__setitem__(key, message) + Maildir.update(arg) + + .. warning:: + + These methods generate unique file names based upon the current process ID. When + using multiple threads, undetected name clashes may occur and cause corruption + of the mailbox unless threads are coordinated to avoid using these methods to + manipulate the same mailbox simultaneously. + + +.. method:: Maildir.flush() + + All changes to Maildir mailboxes are immediately applied, so this method does + nothing. + + +.. method:: Maildir.lock() + Maildir.unlock() + + Maildir mailboxes do not support (or require) locking, so these methods do + nothing. + + +.. method:: Maildir.close() + + :class:`Maildir` instances do not keep any open files and the underlying + mailboxes do not support locking, so this method does nothing. + + +.. 
method:: Maildir.get_file(key) + + Depending upon the host platform, it may not be possible to modify or remove the + underlying message while the returned file remains open. + + +.. seealso:: + + `maildir man page from qmail <http://www.qmail.org/man/man5/maildir.html>`_ + The original specification of the format. + + `Using maildir format <http://cr.yp.to/proto/maildir.html>`_ + Notes on Maildir by its inventor. Includes an updated name-creation scheme and + details on "info" semantics. + + `maildir man page from Courier <http://www.courier-mta.org/?maildir.html>`_ + Another specification of the format. Describes a common extension for supporting + folders. + + +.. _mailbox-mbox: + +:class:`mbox` +^^^^^^^^^^^^^ + + +.. class:: mbox(path[, factory=None[, create=True]]) + + A subclass of :class:`Mailbox` for mailboxes in mbox format. Parameter *factory* + is a callable object that accepts a file-like message representation (which + behaves as if opened in binary mode) and returns a custom representation. If + *factory* is ``None``, :class:`mboxMessage` is used as the default message + representation. If *create* is ``True``, the mailbox is created if it does not + exist. + +The mbox format is the classic format for storing mail on Unix systems. All +messages in an mbox mailbox are stored in a single file with the beginning of +each message indicated by a line whose first five characters are "From ". + +Several variations of the mbox format exist to address perceived shortcomings in +the original. In the interest of compatibility, :class:`mbox` implements the +original format, which is sometimes referred to as :dfn:`mboxo`. This means that +the :mailheader:`Content-Length` header, if present, is ignored and that any +occurrences of "From " at the beginning of a line in a message body are +transformed to ">From " when storing the message, although occurrences of ">From +" are not transformed to "From " when reading the message. 
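The one-way "From " quoting can be observed with a short sketch. It assumes a current Python 3 ``mailbox`` module, where ``get_bytes`` is available (it is not part of the interface documented here); the file name ``quoting.mbox`` and the message text are invented for the example.

```python
import mailbox
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # "quoting.mbox" is a hypothetical file name used only for this sketch.
    box = mailbox.mbox(os.path.join(tmp, "quoting.mbox"))

    # The second body line starts with "From ", which would otherwise look
    # like the start of a new message in the mbox file.
    key = box.add(
        "From: alice@example.com\n"
        "Subject: quoting\n"
        "\n"
        "First line\n"
        "From here on, the body continues.\n"
    )

    # Raw bytes as stored in the mailbox: the body line is now ">From ...".
    stored = box.get_bytes(key)

    # Reading the message back does not undo the quoting.
    payload = box.get_message(key).get_payload()
    box.close()
```

Because the transformation is not reversed on reading, a body line that was stored as ">From " comes back with the extra ">" still in place.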
+ +Some :class:`Mailbox` methods implemented by :class:`mbox` deserve special +remarks: + + +.. method:: mbox.get_file(key) + + Using the file after calling :meth:`flush` or :meth:`close` on the :class:`mbox` + instance may yield unpredictable results or raise an exception. + + +.. method:: mbox.lock() + mbox.unlock() + + Three locking mechanisms are used---dot locking and, if available, the + :cfunc:`flock` and :cfunc:`lockf` system calls. + + +.. seealso:: + + `mbox man page from qmail <http://www.qmail.org/man/man5/mbox.html>`_ + A specification of the format and its variations. + + `mbox man page from tin <http://www.tin.org/bin/man.cgi?section=5&topic=mbox>`_ + Another specification of the format, with details on locking. + + `Configuring Netscape Mail on Unix: Why The Content-Length Format is Bad <http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html>`_ + An argument for using the original mbox format rather than a variation. + + `"mbox" is a family of several mutually incompatible mailbox formats <http://homepages.tesco.net./~J.deBoynePollard/FGA/mail-mbox-formats.html>`_ + A history of mbox variations. + + +.. _mailbox-mh: + +:class:`MH` +^^^^^^^^^^^ + + +.. class:: MH(path[, factory=None[, create=True]]) + + A subclass of :class:`Mailbox` for mailboxes in MH format. Parameter *factory* + is a callable object that accepts a file-like message representation (which + behaves as if opened in binary mode) and returns a custom representation. If + *factory* is ``None``, :class:`MHMessage` is used as the default message + representation. If *create* is ``True``, the mailbox is created if it does not + exist. + +MH is a directory-based mailbox format invented for the MH Message Handling +System, a mail user agent. Each message in an MH mailbox resides in its own +file. An MH mailbox may contain other MH mailboxes (called :dfn:`folders`) in +addition to messages. Folders may be nested indefinitely. 
MH mailboxes also +support :dfn:`sequences`, which are named lists used to logically group messages +without moving them to sub-folders. Sequences are defined in a file called +:file:`.mh_sequences` in each folder. + +The :class:`MH` class manipulates MH mailboxes, but it does not attempt to +emulate all of :program:`mh`'s behaviors. In particular, it does not modify and +is not affected by the :file:`context` or :file:`.mh_profile` files that are +used by :program:`mh` to store its state and configuration. + +:class:`MH` instances have all of the methods of :class:`Mailbox` in addition to +the following: + + +.. method:: MH.list_folders() + + Return a list of the names of all folders. + + +.. method:: MH.get_folder(folder) + + Return an :class:`MH` instance representing the folder whose name is *folder*. A + :exc:`NoSuchMailboxError` exception is raised if the folder does not exist. + + +.. method:: MH.add_folder(folder) + + Create a folder whose name is *folder* and return an :class:`MH` instance + representing it. + + +.. method:: MH.remove_folder(folder) + + Delete the folder whose name is *folder*. If the folder contains any messages, a + :exc:`NotEmptyError` exception will be raised and the folder will not be + deleted. + + +.. method:: MH.get_sequences() + + Return a dictionary of sequence names mapped to key lists. If there are no + sequences, the empty dictionary is returned. + + +.. method:: MH.set_sequences(sequences) + + Re-define the sequences that exist in the mailbox based upon *sequences*, a + dictionary of names mapped to key lists, as returned by :meth:`get_sequences`. + + +.. method:: MH.pack() + + Rename messages in the mailbox as necessary to eliminate gaps in numbering. + Entries in the sequences list are updated correspondingly. + + .. note:: + + Already-issued keys are invalidated by this operation and should not be + subsequently used. + +Some :class:`Mailbox` methods implemented by :class:`MH` deserve special +remarks: + + +.. 
method:: MH.remove(key) + MH.__delitem__(key) + MH.discard(key) + + These methods immediately delete the message. The MH convention of marking a + message for deletion by prepending a comma to its name is not used. + + +.. method:: MH.lock() + MH.unlock() + + Three locking mechanisms are used---dot locking and, if available, the + :cfunc:`flock` and :cfunc:`lockf` system calls. For MH mailboxes, locking the + mailbox means locking the :file:`.mh_sequences` file and, only for the duration + of any operations that affect them, locking individual message files. + + +.. method:: MH.get_file(key) + + Depending upon the host platform, it may not be possible to remove the + underlying message while the returned file remains open. + + +.. method:: MH.flush() + + All changes to MH mailboxes are immediately applied, so this method does + nothing. + + +.. method:: MH.close() + + :class:`MH` instances do not keep any open files, so this method is equivalent + to :meth:`unlock`. + + +.. seealso:: + + `nmh - Message Handling System <http://www.nongnu.org/nmh/>`_ + Home page of :program:`nmh`, an updated version of the original :program:`mh`. + + `MH & nmh: Email for Users & Programmers <http://www.ics.uci.edu/~mh/book/>`_ + A GPL-licensed book on :program:`mh` and :program:`nmh`, with some information + on the mailbox format. + + +.. _mailbox-babyl: + +:class:`Babyl` +^^^^^^^^^^^^^^ + + +.. class:: Babyl(path[, factory=None[, create=True]]) + + A subclass of :class:`Mailbox` for mailboxes in Babyl format. Parameter + *factory* is a callable object that accepts a file-like message representation + (which behaves as if opened in binary mode) and returns a custom representation. + If *factory* is ``None``, :class:`BabylMessage` is used as the default message + representation. If *create* is ``True``, the mailbox is created if it does not + exist. + +Babyl is a single-file mailbox format used by the Rmail mail user agent included +with Emacs.
The beginning of a message is indicated by a line containing the two +characters Control-Underscore (``'\037'``) and Control-L (``'\014'``). The end +of a message is indicated by the start of the next message or, in the case of +the last message, a line containing a Control-Underscore (``'\037'``) +character. + +Messages in a Babyl mailbox have two sets of headers, original headers and +so-called visible headers. Visible headers are typically a subset of the +original headers that have been reformatted or abridged to be more +attractive. Each message in a Babyl mailbox also has an accompanying list of +:dfn:`labels`, or short strings that record extra information about the message, +and a list of all user-defined labels found in the mailbox is kept in the Babyl +options section. + +:class:`Babyl` instances have all of the methods of :class:`Mailbox` in addition +to the following: + + +.. method:: Babyl.get_labels() + + Return a list of the names of all user-defined labels used in the mailbox. + + .. note:: + + The actual messages are inspected to determine which labels exist in the mailbox + rather than consulting the list of labels in the Babyl options section, but the + Babyl section is updated whenever the mailbox is modified. + +Some :class:`Mailbox` methods implemented by :class:`Babyl` deserve special +remarks: + + +.. method:: Babyl.get_file(key) + + In Babyl mailboxes, the headers of a message are not stored contiguously with + the body of the message. To generate a file-like representation, the headers and + body are copied together into a :class:`StringIO` instance (from the + :mod:`StringIO` module), which has an API identical to that of a file. As a + result, the file-like object is truly independent of the underlying mailbox but + does not save memory compared to a string representation. + + +.. 
method:: Babyl.lock() + Babyl.unlock() + + Three locking mechanisms are used---dot locking and, if available, the + :cfunc:`flock` and :cfunc:`lockf` system calls. + + +.. seealso:: + + `Format of Version 5 Babyl Files <http://quimby.gnus.org/notes/BABYL>`_ + A specification of the Babyl format. + + `Reading Mail with Rmail <http://www.gnu.org/software/emacs/manual/html_node/Rmail.html>`_ + The Rmail manual, with some information on Babyl semantics. + + +.. _mailbox-mmdf: + +:class:`MMDF` +^^^^^^^^^^^^^ + + +.. class:: MMDF(path[, factory=None[, create=True]]) + + A subclass of :class:`Mailbox` for mailboxes in MMDF format. Parameter *factory* + is a callable object that accepts a file-like message representation (which + behaves as if opened in binary mode) and returns a custom representation. If + *factory* is ``None``, :class:`MMDFMessage` is used as the default message + representation. If *create* is ``True``, the mailbox is created if it does not + exist. + +MMDF is a single-file mailbox format invented for the Multichannel Memorandum +Distribution Facility, a mail transfer agent. Each message is in the same form +as an mbox message but is bracketed before and after by lines containing four +Control-A (``'\001'``) characters. As with the mbox format, the beginning of +each message is indicated by a line whose first five characters are "From ", but +additional occurrences of "From " are not transformed to ">From " when storing +messages because the extra message separator lines prevent mistaking such +occurrences for the starts of subsequent messages. + +Some :class:`Mailbox` methods implemented by :class:`MMDF` deserve special +remarks: + + +.. method:: MMDF.get_file(key) + + Using the file after calling :meth:`flush` or :meth:`close` on the :class:`MMDF` + instance may yield unpredictable results or raise an exception. + + +.. 
method:: MMDF.lock() + MMDF.unlock() + + Three locking mechanisms are used---dot locking and, if available, the + :cfunc:`flock` and :cfunc:`lockf` system calls. + + +.. seealso:: + + `mmdf man page from tin <http://www.tin.org/bin/man.cgi?section=5&topic=mmdf>`_ + A specification of MMDF format from the documentation of tin, a newsreader. + + `MMDF <http://en.wikipedia.org/wiki/MMDF>`_ + A Wikipedia article describing the Multichannel Memorandum Distribution + Facility. + + +.. _mailbox-message-objects: + +:class:`Message` objects +------------------------ + + +.. class:: Message([message]) + + A subclass of the :mod:`email.Message` module's :class:`Message`. Subclasses of + :class:`mailbox.Message` add mailbox-format-specific state and behavior. + + If *message* is omitted, the new instance is created in a default, empty state. + If *message* is an :class:`email.Message.Message` instance, its contents are + copied; furthermore, any format-specific information is converted insofar as + possible if *message* is a :class:`Message` instance. If *message* is a string + or a file, it should contain an :rfc:`2822`\ -compliant message, which is read + and parsed. + +The format-specific state and behaviors offered by subclasses vary, but in +general it is only the properties that are not specific to a particular mailbox +that are supported (although presumably the properties are specific to a +particular mailbox format). For example, file offsets for single-file mailbox +formats and file names for directory-based mailbox formats are not retained, +because they are only applicable to the original mailbox. But state such as +whether a message has been read by the user or marked as important is retained, +because it applies to the message itself. + +There is no requirement that :class:`Message` instances be used to represent +messages retrieved using :class:`Mailbox` instances. 
In some situations, the +time and memory required to generate :class:`Message` representations might not +not acceptable. For such situations, :class:`Mailbox` instances also offer +string and file-like representations, and a custom message factory may be +specified when a :class:`Mailbox` instance is initialized. + + +.. _mailbox-maildirmessage: + +:class:`MaildirMessage` +^^^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: MaildirMessage([message]) + + A message with Maildir-specific behaviors. Parameter *message* has the same + meaning as with the :class:`Message` constructor. + +Typically, a mail user agent application moves all of the messages in the +:file:`new` subdirectory to the :file:`cur` subdirectory after the first time +the user opens and closes the mailbox, recording that the messages are old +whether or not they've actually been read. Each message in :file:`cur` has an +"info" section added to its file name to store information about its state. +(Some mail readers may also add an "info" section to messages in :file:`new`.) +The "info" section may take one of two forms: it may contain "2," followed by a +list of standardized flags (e.g., "2,FR") or it may contain "1," followed by +so-called experimental information. 
Standard flags for Maildir messages are as +follows: + ++------+---------+--------------------------------+ +| Flag | Meaning | Explanation | ++======+=========+================================+ +| D | Draft | Under composition | ++------+---------+--------------------------------+ +| F | Flagged | Marked as important | ++------+---------+--------------------------------+ +| P | Passed | Forwarded, resent, or bounced | ++------+---------+--------------------------------+ +| R | Replied | Replied to | ++------+---------+--------------------------------+ +| S | Seen | Read | ++------+---------+--------------------------------+ +| T | Trashed | Marked for subsequent deletion | ++------+---------+--------------------------------+ + +:class:`MaildirMessage` instances offer the following methods: + + +.. method:: MaildirMessage.get_subdir() + + Return either "new" (if the message should be stored in the :file:`new` + subdirectory) or "cur" (if the message should be stored in the :file:`cur` + subdirectory). + + .. note:: + + A message is typically moved from :file:`new` to :file:`cur` after its mailbox + has been accessed, whether or not the message has been read. A message + ``msg`` has been read if ``"S" in msg.get_flags()`` is ``True``. + + +.. method:: MaildirMessage.set_subdir(subdir) + + Set the subdirectory the message should be stored in. Parameter *subdir* must be + either "new" or "cur". + + +.. method:: MaildirMessage.get_flags() + + Return a string specifying the flags that are currently set. If the message + complies with the standard Maildir format, the result is the concatenation in + alphabetical order of zero or one occurrence of each of ``'D'``, ``'F'``, + ``'P'``, ``'R'``, ``'S'``, and ``'T'``. The empty string is returned if no flags + are set or if "info" contains experimental semantics. + + +.. method:: MaildirMessage.set_flags(flags) + + Set the flags specified by *flags* and unset all others. + + +.. 
method:: MaildirMessage.add_flag(flag) + + Set the flag(s) specified by *flag* without changing other flags. To add more + than one flag at a time, *flag* may be a string of more than one character. The + current "info" is overwritten whether or not it contains experimental + information rather than flags. + + +.. method:: MaildirMessage.remove_flag(flag) + + Unset the flag(s) specified by *flag* without changing other flags. To remove + more than one flag at a time, *flag* may be a string of more than one character. + If "info" contains experimental information rather than flags, the current + "info" is not modified. + + +.. method:: MaildirMessage.get_date() + + Return the delivery date of the message as a floating-point number representing + seconds since the epoch. + + +.. method:: MaildirMessage.set_date(date) + + Set the delivery date of the message to *date*, a floating-point number + representing seconds since the epoch. + + +.. method:: MaildirMessage.get_info() + + Return a string containing the "info" for a message. This is useful for + accessing and modifying "info" that is experimental (i.e., not a list of flags). + + +.. method:: MaildirMessage.set_info(info) + + Set "info" to *info*, which should be a string.
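The flag and subdirectory methods above might be combined as in the following minimal sketch, which assumes a current Python 3 ``mailbox`` module and works on an in-memory message only (nothing is written to a mailbox). The variable names are illustrative.

```python
import mailbox

msg = mailbox.MaildirMessage()
subdir_before = msg.get_subdir()    # a freshly created message belongs in "new"

msg.add_flag("SR")                  # mark as seen and replied in one call
flags_after_add = msg.get_flags()   # flags are reported in alphabetical order
info_after_add = msg.get_info()     # the same flags behind the "2," prefix

msg.remove_flag("R")                # leaves only the S flag set
msg.set_subdir("cur")               # record that the message is no longer new

has_been_read = "S" in msg.get_flags()
```

The state only takes effect on disk once the message is added to, or stored back into, a :class:`Maildir` mailbox.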
+ +When a :class:`MaildirMessage` instance is created based upon an +:class:`mboxMessage` or :class:`MMDFMessage` instance, the :mailheader:`Status` +and :mailheader:`X-Status` headers are omitted and the following conversions +take place: + ++--------------------+----------------------------------------------+ +| Resulting state | :class:`mboxMessage` or :class:`MMDFMessage` | +| | state | ++====================+==============================================+ +| "cur" subdirectory | O flag | ++--------------------+----------------------------------------------+ +| F flag | F flag | ++--------------------+----------------------------------------------+ +| R flag | A flag | ++--------------------+----------------------------------------------+ +| S flag | R flag | ++--------------------+----------------------------------------------+ +| T flag | D flag | ++--------------------+----------------------------------------------+ + +When a :class:`MaildirMessage` instance is created based upon an +:class:`MHMessage` instance, the following conversions take place: + ++-------------------------------+--------------------------+ +| Resulting state | :class:`MHMessage` state | ++===============================+==========================+ +| "cur" subdirectory | "unseen" sequence | ++-------------------------------+--------------------------+ +| "cur" subdirectory and S flag | no "unseen" sequence | ++-------------------------------+--------------------------+ +| F flag | "flagged" sequence | ++-------------------------------+--------------------------+ +| R flag | "replied" sequence | ++-------------------------------+--------------------------+ + +When a :class:`MaildirMessage` instance is created based upon a +:class:`BabylMessage` instance, the following conversions take place: + ++-------------------------------+-------------------------------+ +| Resulting state | :class:`BabylMessage` state | ++===============================+===============================+ +| "cur" 
subdirectory | "unseen" label | ++-------------------------------+-------------------------------+ +| "cur" subdirectory and S flag | no "unseen" label | ++-------------------------------+-------------------------------+ +| P flag | "forwarded" or "resent" label | ++-------------------------------+-------------------------------+ +| R flag | "answered" label | ++-------------------------------+-------------------------------+ +| T flag | "deleted" label | ++-------------------------------+-------------------------------+ + + +.. _mailbox-mboxmessage: + +:class:`mboxMessage` +^^^^^^^^^^^^^^^^^^^^ + + +.. class:: mboxMessage([message]) + + A message with mbox-specific behaviors. Parameter *message* has the same meaning + as with the :class:`Message` constructor. + +Messages in an mbox mailbox are stored together in a single file. The sender's +envelope address and the time of delivery are typically stored in a line +beginning with "From " that is used to indicate the start of a message, though +there is considerable variation in the exact format of this data among mbox +implementations. Flags that indicate the state of the message, such as whether +it has been read or marked as important, are typically stored in +:mailheader:`Status` and :mailheader:`X-Status` headers. 
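As a rough illustration of how the flags end up in the two headers, the following sketch uses the flag and "From " line methods of :class:`mboxMessage` on an in-memory message. It assumes a current Python 3 ``mailbox`` module; the addresses are placeholders.

```python
import mailbox

# Hypothetical message text used only for this sketch.
msg = mailbox.mboxMessage(
    "From: author@example.org\nSubject: demo\n\nHello.\n"
)

# Passing True as the second argument appends the current time (via time.gmtime).
msg.set_from("sender@example.org", True)

msg.set_flags("RFA")            # read, flagged, answered
status = msg["Status"]          # holds the R and O flags
x_status = msg["X-Status"]      # holds the D, F, and A flags
all_flags = msg.get_flags()     # Status flags followed by X-Status flags
from_line = msg.get_from()      # envelope sender and time, without "From "
```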
+ +Conventional flags for mbox messages are as follows: + ++------+----------+--------------------------------+ +| Flag | Meaning | Explanation | ++======+==========+================================+ +| R | Read | Read | ++------+----------+--------------------------------+ +| O | Old | Previously detected by MUA | ++------+----------+--------------------------------+ +| D | Deleted | Marked for subsequent deletion | ++------+----------+--------------------------------+ +| F | Flagged | Marked as important | ++------+----------+--------------------------------+ +| A | Answered | Replied to | ++------+----------+--------------------------------+ + +The "R" and "O" flags are stored in the :mailheader:`Status` header, and the +"D", "F", and "A" flags are stored in the :mailheader:`X-Status` header. The +flags and headers typically appear in the order mentioned. + +:class:`mboxMessage` instances offer the following methods: + + +.. method:: mboxMessage.get_from() + + Return a string representing the "From " line that marks the start of the + message in an mbox mailbox. The leading "From " and the trailing newline are + excluded. + + +.. method:: mboxMessage.set_from(from_[, time_=None]) + + Set the "From " line to *from_*, which should be specified without a leading + "From " or trailing newline. For convenience, *time_* may be specified and will + be formatted appropriately and appended to *from_*. If *time_* is specified, it + should be a :class:`struct_time` instance, a tuple suitable for passing to + :meth:`time.strftime`, or ``True`` (to use :meth:`time.gmtime`). + + +.. method:: mboxMessage.get_flags() + + Return a string specifying the flags that are currently set. If the message + complies with the conventional format, the result is the concatenation in the + following order of zero or one occurrence of each of ``'R'``, ``'O'``, ``'D'``, + ``'F'``, and ``'A'``. + + +.. 
method:: mboxMessage.set_flags(flags) + + Set the flags specified by *flags* and unset all others. Parameter *flags* + should be the concatenation in any order of zero or more occurrences of each of + ``'R'``, ``'O'``, ``'D'``, ``'F'``, and ``'A'``. + + +.. method:: mboxMessage.add_flag(flag) + + Set the flag(s) specified by *flag* without changing other flags. To add more + than one flag at a time, *flag* may be a string of more than one character. + + +.. method:: mboxMessage.remove_flag(flag) + + Unset the flag(s) specified by *flag* without changing other flags. To remove + more than one flag at a time, *flag* may be a string of more than one character. + +When an :class:`mboxMessage` instance is created based upon a +:class:`MaildirMessage` instance, a "From " line is generated based upon the +:class:`MaildirMessage` instance's delivery date, and the following conversions +take place: + ++-----------------+-------------------------------+ +| Resulting state | :class:`MaildirMessage` state | ++=================+===============================+ +| R flag | S flag | ++-----------------+-------------------------------+ +| O flag | "cur" subdirectory | ++-----------------+-------------------------------+ +| D flag | T flag | ++-----------------+-------------------------------+ +| F flag | F flag | ++-----------------+-------------------------------+ +| A flag | R flag | ++-----------------+-------------------------------+ + +When an :class:`mboxMessage` instance is created based upon an +:class:`MHMessage` instance, the following conversions take place: + ++-------------------+--------------------------+ +| Resulting state | :class:`MHMessage` state | ++===================+==========================+ +| R flag and O flag | no "unseen" sequence | ++-------------------+--------------------------+ +| O flag | "unseen" sequence | ++-------------------+--------------------------+ +| F flag | "flagged" sequence | ++-------------------+--------------------------+ +| A flag 
| "replied" sequence | ++-------------------+--------------------------+ + +When an :class:`mboxMessage` instance is created based upon a +:class:`BabylMessage` instance, the following conversions take place: + ++-------------------+-----------------------------+ +| Resulting state | :class:`BabylMessage` state | ++===================+=============================+ +| R flag and O flag | no "unseen" label | ++-------------------+-----------------------------+ +| O flag | "unseen" label | ++-------------------+-----------------------------+ +| D flag | "deleted" label | ++-------------------+-----------------------------+ +| A flag | "answered" label | ++-------------------+-----------------------------+ + +When a :class:`Message` instance is created based upon an :class:`MMDFMessage` +instance, the "From " line is copied and all flags directly correspond: + ++-----------------+----------------------------+ +| Resulting state | :class:`MMDFMessage` state | ++=================+============================+ +| R flag | R flag | ++-----------------+----------------------------+ +| O flag | O flag | ++-----------------+----------------------------+ +| D flag | D flag | ++-----------------+----------------------------+ +| F flag | F flag | ++-----------------+----------------------------+ +| A flag | A flag | ++-----------------+----------------------------+ + + +.. _mailbox-mhmessage: + +:class:`MHMessage` +^^^^^^^^^^^^^^^^^^ + + +.. class:: MHMessage([message]) + + A message with MH-specific behaviors. Parameter *message* has the same meaning + as with the :class:`Message` constructor. + +MH messages do not support marks or flags in the traditional sense, but they do +support sequences, which are logical groupings of arbitrary messages. 
Some mail +reading programs (although not the standard :program:`mh` and :program:`nmh`) +use sequences in much the same way flags are used with other formats, as +follows: + ++----------+------------------------------------------+ +| Sequence | Explanation | ++==========+==========================================+ +| unseen | Not read, but previously detected by MUA | ++----------+------------------------------------------+ +| replied | Replied to | ++----------+------------------------------------------+ +| flagged | Marked as important | ++----------+------------------------------------------+ + +:class:`MHMessage` instances offer the following methods: + + +.. method:: MHMessage.get_sequences() + + Return a list of the names of sequences that include this message. + + +.. method:: MHMessage.set_sequences(sequences) + + Set the list of sequences that include this message. + + +.. method:: MHMessage.add_sequence(sequence) + + Add *sequence* to the list of sequences that include this message. + + +.. method:: MHMessage.remove_sequence(sequence) + + Remove *sequence* from the list of sequences that include this message. 
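A short sketch of the sequence methods above, assuming a current Python 3 ``mailbox`` module and an in-memory message that has not been stored in any mailbox. The variable names are illustrative.

```python
import mailbox

msg = mailbox.MHMessage()
msg.add_sequence("unseen")
msg.add_sequence("flagged")
sequences_before = msg.get_sequences()   # names in the order they were added

msg.remove_sequence("unseen")            # e.g. once the user has read the message
sequences_after = msg.get_sequences()
```

The sequences recorded on the message are applied to the mailbox's sequence definitions when the message is added to an :class:`MH` mailbox.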
+ +When an :class:`MHMessage` instance is created based upon a +:class:`MaildirMessage` instance, the following conversions take place: + ++--------------------+-------------------------------+ +| Resulting state | :class:`MaildirMessage` state | ++====================+===============================+ +| "unseen" sequence | no S flag | ++--------------------+-------------------------------+ +| "replied" sequence | R flag | ++--------------------+-------------------------------+ +| "flagged" sequence | F flag | ++--------------------+-------------------------------+ + +When an :class:`MHMessage` instance is created based upon an +:class:`mboxMessage` or :class:`MMDFMessage` instance, the :mailheader:`Status` +and :mailheader:`X-Status` headers are omitted and the following conversions +take place: + ++--------------------+----------------------------------------------+ +| Resulting state | :class:`mboxMessage` or :class:`MMDFMessage` | +| | state | ++====================+==============================================+ +| "unseen" sequence | no R flag | ++--------------------+----------------------------------------------+ +| "replied" sequence | A flag | ++--------------------+----------------------------------------------+ +| "flagged" sequence | F flag | ++--------------------+----------------------------------------------+ + +When an :class:`MHMessage` instance is created based upon a +:class:`BabylMessage` instance, the following conversions take place: + ++--------------------+-----------------------------+ +| Resulting state | :class:`BabylMessage` state | ++====================+=============================+ +| "unseen" sequence | "unseen" label | ++--------------------+-----------------------------+ +| "replied" sequence | "answered" label | ++--------------------+-----------------------------+ + + +.. _mailbox-babylmessage: + +:class:`BabylMessage` +^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: BabylMessage([message]) + + A message with Babyl-specific behaviors. 
Parameter *message* has the same + meaning as with the :class:`Message` constructor. + +Certain message labels, called :dfn:`attributes`, are defined by convention to +have special meanings. The attributes are as follows: + ++-----------+------------------------------------------+ +| Label | Explanation | ++===========+==========================================+ +| unseen | Not read, but previously detected by MUA | ++-----------+------------------------------------------+ +| deleted | Marked for subsequent deletion | ++-----------+------------------------------------------+ +| filed | Copied to another file or mailbox | ++-----------+------------------------------------------+ +| answered | Replied to | ++-----------+------------------------------------------+ +| forwarded | Forwarded | ++-----------+------------------------------------------+ +| edited | Modified by the user | ++-----------+------------------------------------------+ +| resent | Resent | ++-----------+------------------------------------------+ + +By default, Rmail displays only visible headers. The :class:`BabylMessage` +class, though, uses the original headers because they are more complete. Visible +headers may be accessed explicitly if desired. + +:class:`BabylMessage` instances offer the following methods: + + +.. method:: BabylMessage.get_labels() + + Return a list of labels on the message. + + +.. method:: BabylMessage.set_labels(labels) + + Set the list of labels on the message to *labels*. + + +.. method:: BabylMessage.add_label(label) + + Add *label* to the list of labels on the message. + + +.. method:: BabylMessage.remove_label(label) + + Remove *label* from the list of labels on the message. + + +.. method:: BabylMessage.get_visible() + + Return a :class:`Message` instance whose headers are the message's visible + headers and whose body is empty. + + +.. method:: BabylMessage.set_visible(visible) + + Set the message's visible headers to be the same as the headers in *visible*.
+ Parameter *visible* should be a :class:`Message` instance, an + :class:`email.Message.Message` instance, a string, or a file-like object (which + should be open in text mode). + + +.. method:: BabylMessage.update_visible() + + When a :class:`BabylMessage` instance's original headers are modified, the + visible headers are not automatically modified to correspond. This method + updates the visible headers as follows: each visible header with a corresponding + original header is set to the value of the original header, each visible header + without a corresponding original header is removed, and any of + :mailheader:`Date`, :mailheader:`From`, :mailheader:`Reply-To`, + :mailheader:`To`, :mailheader:`CC`, and :mailheader:`Subject` that are present + in the original headers but not the visible headers are added to the visible + headers. + +When a :class:`BabylMessage` instance is created based upon a +:class:`MaildirMessage` instance, the following conversions take place: + ++-------------------+-------------------------------+ +| Resulting state | :class:`MaildirMessage` state | ++===================+===============================+ +| "unseen" label | no S flag | ++-------------------+-------------------------------+ +| "deleted" label | T flag | ++-------------------+-------------------------------+ +| "answered" label | R flag | ++-------------------+-------------------------------+ +| "forwarded" label | P flag | ++-------------------+-------------------------------+ + +When a :class:`BabylMessage` instance is created based upon an +:class:`mboxMessage` or :class:`MMDFMessage` instance, the :mailheader:`Status` +and :mailheader:`X-Status` headers are omitted and the following conversions +take place: + ++------------------+----------------------------------------------+ +| Resulting state | :class:`mboxMessage` or :class:`MMDFMessage` | +| | state | ++==================+==============================================+ +| "unseen" label | no R flag | 
++------------------+----------------------------------------------+ +| "deleted" label | D flag | ++------------------+----------------------------------------------+ +| "answered" label | A flag | ++------------------+----------------------------------------------+ + +When a :class:`BabylMessage` instance is created based upon an +:class:`MHMessage` instance, the following conversions take place: + ++------------------+--------------------------+ +| Resulting state | :class:`MHMessage` state | ++==================+==========================+ +| "unseen" label | "unseen" sequence | ++------------------+--------------------------+ +| "answered" label | "replied" sequence | ++------------------+--------------------------+ + + +.. _mailbox-mmdfmessage: + +:class:`MMDFMessage` +^^^^^^^^^^^^^^^^^^^^ + + +.. class:: MMDFMessage([message]) + + A message with MMDF-specific behaviors. Parameter *message* has the same meaning + as with the :class:`Message` constructor. + +As with messages in an mbox mailbox, MMDF messages are stored with the sender's +address and the delivery date in an initial line beginning with "From ". +Likewise, flags that indicate the state of the message are typically stored in +:mailheader:`Status` and :mailheader:`X-Status` headers.
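As a rough illustration of how the flags end up in the two headers, the sketch below builds an :class:`MMDFMessage` from a string and inspects it. The addresses and message text are made up for the example, and the header values shown in comments reflect how current CPython releases store the flags, which is an assumption rather than something this section guarantees.

```python
import mailbox

# Build an MMDF message from a plain string; the address is hypothetical.
msg = mailbox.MMDFMessage("From: a@example.com\nSubject: hello\n\nbody\n")

# Flags may be passed in any order.
msg.set_flags("AFR")

status = msg["Status"]        # carries the R and O flags
x_status = msg["X-Status"]    # carries the D, F and A flags
flags = msg.get_flags()       # Status flags followed by X-Status flags

# The "From " line is given without the leading "From " or trailing newline.
msg.set_from("a@example.com")
from_line = msg.get_from()
```

Reading the flags back through ``get_flags()`` rather than the raw headers keeps calling code independent of which header a given flag lives in.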
+ +Conventional flags for MMDF messages are identical to those of mbox messages and +are as follows: + ++------+----------+--------------------------------+ +| Flag | Meaning | Explanation | ++======+==========+================================+ +| R | Read | Read | ++------+----------+--------------------------------+ +| O | Old | Previously detected by MUA | ++------+----------+--------------------------------+ +| D | Deleted | Marked for subsequent deletion | ++------+----------+--------------------------------+ +| F | Flagged | Marked as important | ++------+----------+--------------------------------+ +| A | Answered | Replied to | ++------+----------+--------------------------------+ + +The "R" and "O" flags are stored in the :mailheader:`Status` header, and the +"D", "F", and "A" flags are stored in the :mailheader:`X-Status` header. The +flags and headers typically appear in the order mentioned. + +:class:`MMDFMessage` instances offer the following methods, which are identical +to those offered by :class:`mboxMessage`: + + +.. method:: MMDFMessage.get_from() + + Return a string representing the "From " line that marks the start of the + message in an mbox mailbox. The leading "From " and the trailing newline are + excluded. + + +.. method:: MMDFMessage.set_from(from_[, time_=None]) + + Set the "From " line to *from_*, which should be specified without a leading + "From " or trailing newline. For convenience, *time_* may be specified and will + be formatted appropriately and appended to *from_*. If *time_* is specified, it + should be a :class:`struct_time` instance, a tuple suitable for passing to + :meth:`time.strftime`, or ``True`` (to use :meth:`time.gmtime`). + + +.. method:: MMDFMessage.get_flags() + + Return a string specifying the flags that are currently set.
If the message + complies with the conventional format, the result is the concatenation in the + following order of zero or one occurrence of each of ``'R'``, ``'O'``, ``'D'``, + ``'F'``, and ``'A'``. + + +.. method:: MMDFMessage.set_flags(flags) + + Set the flags specified by *flags* and unset all others. Parameter *flags* + should be the concatenation in any order of zero or more occurrences of each of + ``'R'``, ``'O'``, ``'D'``, ``'F'``, and ``'A'``. + + +.. method:: MMDFMessage.add_flag(flag) + + Set the flag(s) specified by *flag* without changing other flags. To add more + than one flag at a time, *flag* may be a string of more than one character. + + +.. method:: MMDFMessage.remove_flag(flag) + + Unset the flag(s) specified by *flag* without changing other flags. To remove + more than one flag at a time, *flag* may be a string of more than one character. + +When an :class:`MMDFMessage` instance is created based upon a +:class:`MaildirMessage` instance, a "From " line is generated based upon the +:class:`MaildirMessage` instance's delivery date, and the following conversions +take place: + ++-----------------+-------------------------------+ +| Resulting state | :class:`MaildirMessage` state | ++=================+===============================+ +| R flag | S flag | ++-----------------+-------------------------------+ +| O flag | "cur" subdirectory | ++-----------------+-------------------------------+ +| D flag | T flag | ++-----------------+-------------------------------+ +| F flag | F flag | ++-----------------+-------------------------------+ +| A flag | R flag | ++-----------------+-------------------------------+ + +When an :class:`MMDFMessage` instance is created based upon an +:class:`MHMessage` instance, the following conversions take place: + ++-------------------+--------------------------+ +| Resulting state | :class:`MHMessage` state | ++===================+==========================+ +| R flag and O flag | no "unseen" sequence |
++-------------------+--------------------------+ +| O flag | "unseen" sequence | ++-------------------+--------------------------+ +| F flag | "flagged" sequence | ++-------------------+--------------------------+ +| A flag | "replied" sequence | ++-------------------+--------------------------+ + +When an :class:`MMDFMessage` instance is created based upon a +:class:`BabylMessage` instance, the following conversions take place: + ++-------------------+-----------------------------+ +| Resulting state | :class:`BabylMessage` state | ++===================+=============================+ +| R flag and O flag | no "unseen" label | ++-------------------+-----------------------------+ +| O flag | "unseen" label | ++-------------------+-----------------------------+ +| D flag | "deleted" label | ++-------------------+-----------------------------+ +| A flag | "answered" label | ++-------------------+-----------------------------+ + +When an :class:`MMDFMessage` instance is created based upon an +:class:`mboxMessage` instance, the "From " line is copied and all flags directly +correspond: + ++-----------------+----------------------------+ +| Resulting state | :class:`mboxMessage` state | ++=================+============================+ +| R flag | R flag | ++-----------------+----------------------------+ +| O flag | O flag | ++-----------------+----------------------------+ +| D flag | D flag | ++-----------------+----------------------------+ +| F flag | F flag | ++-----------------+----------------------------+ +| A flag | A flag | ++-----------------+----------------------------+ + + +Exceptions +---------- + +The following exception classes are defined in the :mod:`mailbox` module: + + +.. class:: Error() + + The base class for all other module-specific exceptions. + + +.. 
class:: NoSuchMailboxError() + + Raised when a mailbox is expected but is not found, such as when instantiating a + :class:`Mailbox` subclass with a path that does not exist (and with the *create* + parameter set to ``False``), or when opening a folder that does not exist. + + +.. class:: NotEmptyError() + + Raised when a mailbox is not empty but is expected to be, such as when deleting + a folder that contains messages. + + +.. class:: ExternalClashError() + + Raised when some mailbox-related condition beyond the control of the program + causes it to be unable to proceed, such as when failing to acquire a lock that + another program already holds, or when a uniquely-generated file name + already exists. + + +.. class:: FormatError() + + Raised when the data in a file cannot be parsed, such as when an :class:`MH` + instance attempts to read a corrupted :file:`.mh_sequences` file. + + +.. _mailbox-deprecated: + +Deprecated classes and methods +------------------------------ + +Older versions of the :mod:`mailbox` module do not support modification of +mailboxes, such as adding or removing messages, and do not provide classes to +represent format-specific message properties. For backward compatibility, the +older mailbox classes are still available, but the newer classes should be used +in preference to them. + +Older mailbox objects support only iteration and provide a single public method: + + +.. method:: oldmailbox.next() + + Return the next message in the mailbox, created with the optional *factory* + argument passed into the mailbox object's constructor. By default this is an + :class:`rfc822.Message` object (see the :mod:`rfc822` module). Depending on the + mailbox implementation the *fp* attribute of this object may be a true file + object or a class instance simulating a file object, taking care of things like + message boundaries if multiple mail messages are contained in a single file, + etc.
If no more messages are available, this method returns ``None``. + +Most of the older mailbox classes have names that differ from the current +mailbox class names, except for :class:`Maildir`. For this reason, the new +:class:`Maildir` class defines a :meth:`next` method and its constructor differs +slightly from those of the other new mailbox classes. + +The older mailbox classes whose names are not the same as their newer +counterparts are as follows: + + +.. class:: UnixMailbox(fp[, factory]) + + Access to a classic Unix-style mailbox, where all messages are contained in a + single file and separated by ``From`` (a.k.a. ``From_``) lines. The file object + *fp* points to the mailbox file. The optional *factory* parameter is a callable + that should create new message objects. *factory* is called with one argument, + *fp*, by the :meth:`next` method of the mailbox object. The default is the + :class:`rfc822.Message` class (see the :mod:`rfc822` module -- and the note + below). + + .. note:: + + For reasons of this module's internal implementation, you will probably want to + open the *fp* object in binary mode. This is especially important on Windows. + + For maximum portability, messages in a Unix-style mailbox are separated by any + line that begins exactly with the string ``'From '`` (note the trailing space) + if preceded by exactly two newlines. Because of the wide range of variations in + practice, nothing else on the ``From_`` line should be considered. However, the + current implementation doesn't check for the leading two newlines. This is + usually fine for most applications. + + The :class:`UnixMailbox` class implements a more strict version of ``From_`` + line checking, using a regular expression that usually correctly matches + ``From_`` delimiters. It considers messages to be separated by ``From + name time`` lines. For maximum portability, use the + :class:`PortableUnixMailbox` class instead.
This class is identical to + :class:`UnixMailbox` except that individual messages are separated by only + ``From`` lines. + + For more information, see `Configuring Netscape Mail on Unix: Why the + Content-Length Format is Bad + <http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html>`_. + + +.. class:: PortableUnixMailbox(fp[, factory]) + + A less-strict version of :class:`UnixMailbox`, which considers only the ``From`` + at the beginning of the line separating messages. The "*name* *time*" portion + of the From line is ignored, to protect against some variations that are + observed in practice. This works since lines in the message which begin with + ``'From '`` are quoted by mail handling software at delivery-time. + + +.. class:: MmdfMailbox(fp[, factory]) + + Access an MMDF-style mailbox, where all messages are contained in a single file + and separated by lines consisting of 4 control-A characters. The file object + *fp* points to the mailbox file. Optional *factory* is as with the + :class:`UnixMailbox` class. + + +.. class:: MHMailbox(dirname[, factory]) + + Access an MH mailbox, a directory with each message in a separate file with a + numeric name. The name of the mailbox directory is passed in *dirname*. + *factory* is as with the :class:`UnixMailbox` class. + + +.. class:: BabylMailbox(fp[, factory]) + + Access a Babyl mailbox, which is similar to an MMDF mailbox. In Babyl format, + each message has two sets of headers, the *original* headers and the *visible* + headers. The original headers appear before a line containing only ``'*** EOOH + ***'`` (End-Of-Original-Headers) and the visible headers appear after the + ``EOOH`` line. Babyl-compliant mail readers will show you only the visible + headers, and :class:`BabylMailbox` objects will return messages containing only + the visible headers. You'll have to do your own parsing of the mailbox file to + get at the original headers. 
Mail messages start with the EOOH line and end + with a line containing only ``'\037\014'``. *factory* is as with the + :class:`UnixMailbox` class. + +If you wish to use the older mailbox classes with the :mod:`email` module rather +than the deprecated :mod:`rfc822` module, you can do so as follows:: + + import email + import email.Errors + import mailbox + + def msgfactory(fp): + try: + return email.message_from_file(fp) + except email.Errors.MessageParseError: + # Don't return None since that will + # stop the mailbox iterator + return '' + + mbox = mailbox.UnixMailbox(fp, msgfactory) + +Alternatively, if you know your mailbox contains only well-formed MIME messages, +you can simplify this to:: + + import email + import mailbox + + mbox = mailbox.UnixMailbox(fp, email.message_from_file) + + +.. _mailbox-examples: + +Examples +-------- + +A simple example of printing the subjects of all messages in a mailbox that seem +interesting:: + + import mailbox + for message in mailbox.mbox('~/mbox'): + subject = message['subject'] # Could possibly be None. 
+ if subject and 'python' in subject.lower(): + print subject + +To copy all mail from a Babyl mailbox to an MH mailbox, converting all of the +format-specific information that can be converted:: + + import mailbox + destination = mailbox.MH('~/Mail') + destination.lock() + for message in mailbox.Babyl('~/RMAIL'): + destination.add(mailbox.MHMessage(message)) + destination.flush() + destination.unlock() + +This example sorts mail from several mailing lists into different mailboxes, +being careful to avoid mail corruption due to concurrent modification by other +programs, mail loss due to interruption of the program, or premature termination +due to malformed messages in the mailbox:: + + import mailbox + import email.Errors + + list_names = ('python-list', 'python-dev', 'python-bugs') + + boxes = dict((name, mailbox.mbox('~/email/%s' % name)) for name in list_names) + inbox = mailbox.Maildir('~/Maildir', factory=None) + + for key in inbox.iterkeys(): + try: + message = inbox[key] + except email.Errors.MessageParseError: + continue # The message is malformed. Just leave it. + + for name in list_names: + list_id = message['list-id'] + if list_id and name in list_id: + # Get mailbox to use + box = boxes[name] + + # Write copy to disk before removing original. + # If there's a crash, you might duplicate a message, but + # that's better than losing a message completely. + box.lock() + box.add(message) + box.flush() + box.unlock() + + # Remove original message + inbox.lock() + inbox.discard(key) + inbox.flush() + inbox.unlock() + break # Found destination, so stop looking. + + for box in boxes.itervalues(): + box.close() + diff --git a/Doc/library/mailcap.rst b/Doc/library/mailcap.rst new file mode 100644 index 0000000..8dcb1ec --- /dev/null +++ b/Doc/library/mailcap.rst @@ -0,0 +1,74 @@ +:mod:`mailcap` --- Mailcap file handling +======================================== + +.. module:: mailcap + :synopsis: Mailcap file handling.
+ + + +Mailcap files are used to configure how MIME-aware applications such as mail +readers and Web browsers react to files with different MIME types. (The name +"mailcap" is derived from the phrase "mail capability".) For example, a mailcap +file might contain a line like ``video/mpeg; xmpeg %s``. Then, if the user +encounters an email message or Web document with the MIME type +:mimetype:`video/mpeg`, ``%s`` will be replaced by a filename (usually one +belonging to a temporary file) and the :program:`xmpeg` program can be +automatically started to view the file. + +The mailcap format is documented in :rfc:`1524`, "A User Agent Configuration +Mechanism For Multimedia Mail Format Information," but is not an Internet +standard. However, mailcap files are supported on most Unix systems. + + +.. function:: findmatch(caps, MIMEtype[, key[, filename[, plist]]]) + + Return a 2-tuple; the first element is a string containing the command line to + be executed (which can be passed to :func:`os.system`), and the second element + is the mailcap entry for a given MIME type. If no matching MIME type can be + found, ``(None, None)`` is returned. + + *key* is the name of the field desired, which represents the type of activity to + be performed; the default value is 'view', since in the most common case you + simply want to view the body of the MIME-typed data. Other possible values + might be 'compose' and 'edit', if you wanted to create a new body of the given + MIME type or alter the existing body data. See :rfc:`1524` for a complete list + of these fields. + + *filename* is the filename to be substituted for ``%s`` in the command line; the + default value is ``'/dev/null'`` which is almost certainly not what you want, so + usually you'll override it by specifying a filename. + + *plist* can be a list containing named parameters; the default value is simply + an empty list. 
Each entry in the list must be a string containing the parameter + name, an equals sign (``'='``), and the parameter's value. Mailcap entries can + contain named parameters like ``%{foo}``, which will be replaced by the value + of the parameter named 'foo'. For example, if the command line ``showpartial + %{id} %{number} %{total}`` was in a mailcap file, and *plist* was set to + ``['id=1', 'number=2', 'total=3']``, the resulting command line would be + ``'showpartial 1 2 3'``. + + In a mailcap file, the "test" field can optionally be specified to test some + external condition (such as the machine architecture, or the window system in + use) to determine whether or not the mailcap line applies. :func:`findmatch` + will automatically check such conditions and skip the entry if the check fails. + + +.. function:: getcaps() + + Returns a dictionary mapping MIME types to a list of mailcap file entries. This + dictionary must be passed to the :func:`findmatch` function. An entry is stored + as a list of dictionaries, but it shouldn't be necessary to know the details of + this representation. + + The information is derived from all of the mailcap files found on the system. + Settings in the user's mailcap file :file:`$HOME/.mailcap` will override + settings in the system mailcap files :file:`/etc/mailcap`, + :file:`/usr/etc/mailcap`, and :file:`/usr/local/etc/mailcap`. + +An example usage:: + + >>> import mailcap + >>> d=mailcap.getcaps() + >>> mailcap.findmatch(d, 'video/mpeg', filename='/tmp/tmp1223') + ('xmpeg /tmp/tmp1223', {'view': 'xmpeg %s'}) + diff --git a/Doc/library/markup.rst b/Doc/library/markup.rst new file mode 100644 index 0000000..dd0dd8f --- /dev/null +++ b/Doc/library/markup.rst @@ -0,0 +1,44 @@ + +.. _markup: + +********************************** +Structured Markup Processing Tools +********************************** + +Python supports a variety of modules to work with various forms of structured +data markup. 
This includes modules to work with the Standard Generalized Markup +Language (SGML) and the Hypertext Markup Language (HTML), and several interfaces +for working with the Extensible Markup Language (XML). + +It is important to note that modules in the :mod:`xml` package require that +there be at least one SAX-compliant XML parser available. Starting with Python +2.3, the Expat parser is included with Python, so the :mod:`xml.parsers.expat` +module will always be available. You may still want to be aware of the `PyXML +add-on package <http://pyxml.sourceforge.net/>`_; that package provides an +extended set of XML libraries for Python. + +The documentation for the :mod:`xml.dom` and :mod:`xml.sax` packages is the +definition of the Python bindings for the DOM and SAX interfaces. + + +.. toctree:: + + htmlparser.rst + sgmllib.rst + htmllib.rst + pyexpat.rst + xml.dom.rst + xml.dom.minidom.rst + xml.dom.pulldom.rst + xml.sax.rst + xml.sax.handler.rst + xml.sax.utils.rst + xml.sax.reader.rst + xml.etree.elementtree.rst + +.. seealso:: + + `Python/XML Libraries <http://pyxml.sourceforge.net/>`_ + Home page for the PyXML package, containing an extension of :mod:`xml` package + bundled with Python. + diff --git a/Doc/library/marshal.rst b/Doc/library/marshal.rst new file mode 100644 index 0000000..010ebc3 --- /dev/null +++ b/Doc/library/marshal.rst @@ -0,0 +1,127 @@ + +:mod:`marshal` --- Internal Python object serialization +======================================================= + +.. module:: marshal + :synopsis: Convert Python objects to streams of bytes and back (with different + constraints). + + +This module contains functions that can read and write Python values in a binary +format. The format is specific to Python, but independent of machine +architecture issues (e.g., you can write a Python value to a file on a PC, +transport the file to a Sun, and read it back there).
Details of the format are +undocumented on purpose; it may change between Python versions (although it +rarely does). [#]_ + +.. index:: + module: pickle + module: shelve + object: code + +This is not a general "persistence" module. For general persistence and +transfer of Python objects through RPC calls, see the modules :mod:`pickle` and +:mod:`shelve`. The :mod:`marshal` module exists mainly to support reading and +writing the "pseudo-compiled" code for Python modules of :file:`.pyc` files. +Therefore, the Python maintainers reserve the right to modify the marshal format +in backward incompatible ways should the need arise. If you're serializing and +de-serializing Python objects, use the :mod:`pickle` module instead. + +.. warning:: + + The :mod:`marshal` module is not intended to be secure against erroneous or + maliciously constructed data. Never unmarshal data received from an + untrusted or unauthenticated source. + +Not all Python object types are supported; in general, only objects whose value +is independent from a particular invocation of Python can be written and read by +this module. The following types are supported: ``None``, integers, long +integers, floating point numbers, strings, Unicode objects, tuples, lists, +dictionaries, and code objects, where it should be understood that tuples, lists +and dictionaries are only supported as long as the values contained therein are +themselves supported; and recursive lists and dictionaries should not be written +(they will cause infinite loops). + +**Caveat:** On machines where C's ``long int`` type has more than 32 bits (such +as the DEC Alpha), it is possible to create plain Python integers that are +longer than 32 bits. If such an integer is marshaled and read back in on a +machine where C's ``long int`` type has only 32 bits, a Python long integer +object is returned instead. While of a different type, the numeric value is the +same. (This behavior is new in Python 2.2. 
In earlier versions, all but the +least-significant 32 bits of the value were lost, and a warning message was +printed.) + +There are functions that read/write files as well as functions operating on +strings. + +The module defines these functions: + + +.. function:: dump(value, file[, version]) + + Write the value on the open file. The value must be a supported type. The + file must be an open file object such as ``sys.stdout`` or returned by + :func:`open` or :func:`os.popen`. It must be opened in binary mode (``'wb'`` + or ``'w+b'``). + + If the value has (or contains an object that has) an unsupported type, a + :exc:`ValueError` exception is raised --- but garbage data will also be written + to the file. The object will not be properly read back by :func:`load`. + + .. versionadded:: 2.4 + The *version* argument indicates the data format that ``dump`` should use + (see below). + + +.. function:: load(file) + + Read one value from the open file and return it. If no valid value is read + (e.g. because the data has a different Python version's incompatible marshal + format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. The + file must be an open file object opened in binary mode (``'rb'`` or + ``'r+b'``). + + .. warning:: + + If an object containing an unsupported type was marshalled with :func:`dump`, + :func:`load` will substitute ``None`` for the unmarshallable type. + + +.. function:: dumps(value[, version]) + + Return the string that would be written to a file by ``dump(value, file)``. The + value must be a supported type. Raise a :exc:`ValueError` exception if value + has (or contains an object that has) an unsupported type. + + .. versionadded:: 2.4 + The *version* argument indicates the data format that ``dumps`` should use + (see below). + + +.. function:: loads(string) + + Convert the string to a value. If no valid value is found, raise + :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. 
Extra characters in the + string are ignored. + + +In addition, the following constants are defined: + +.. data:: version + + Indicates the format that the module uses. Version 0 is the historical format, + version 1 (added in Python 2.4) shares interned strings and version 2 (added in + Python 2.5) uses a binary format for floating point numbers. The current version + is 2. + + .. versionadded:: 2.4 + + +.. rubric:: Footnotes + +.. [#] The name of this module stems from a bit of terminology used by the designers of + Modula-3 (amongst others), who use the term "marshalling" for shipping of data + around in a self-contained form. Strictly speaking, "to marshal" means to + convert some data from internal to external form (in an RPC buffer for instance) + and "unmarshalling" for the reverse process. + diff --git a/Doc/library/math.rst b/Doc/library/math.rst new file mode 100644 index 0000000..17c75d3 --- /dev/null +++ b/Doc/library/math.rst @@ -0,0 +1,227 @@ + +:mod:`math` --- Mathematical functions +====================================== + +.. module:: math + :synopsis: Mathematical functions (sin() etc.). + + +This module is always available. It provides access to the mathematical +functions defined by the C standard. + +These functions cannot be used with complex numbers; use the functions of the +same name from the :mod:`cmath` module if you require support for complex +numbers. The distinction between functions which support complex numbers and +those which don't is made since most users do not want to learn quite as much +mathematics as required to understand complex numbers. Receiving an exception +instead of a complex result allows earlier detection of the unexpected complex +number used as a parameter, so that the programmer can determine how and why it +was generated in the first place. + +The following functions are provided by this module. Except when explicitly +noted otherwise, all return values are floats. 
+ +Number-theoretic and representation functions: + + +.. function:: ceil(x) + + Return the ceiling of *x* as a float, the smallest integer value greater than or + equal to *x*. + + +.. function:: fabs(x) + + Return the absolute value of *x*. + + +.. function:: floor(x) + + Return the floor of *x* as a float, the largest integer value less than or equal + to *x*. + + +.. function:: fmod(x, y) + + Return ``fmod(x, y)``, as defined by the platform C library. Note that the + Python expression ``x % y`` may not return the same result. The intent of the C + standard is that ``fmod(x, y)`` be exactly (mathematically; to infinite + precision) equal to ``x - n*y`` for some integer *n* such that the result has + the same sign as *x* and magnitude less than ``abs(y)``. Python's ``x % y`` + returns a result with the sign of *y* instead, and may not be exactly computable + for float arguments. For example, ``fmod(-1e-100, 1e100)`` is ``-1e-100``, but + the result of Python's ``-1e-100 % 1e100`` is ``1e100-1e-100``, which cannot be + represented exactly as a float, and rounds to the surprising ``1e100``. For + this reason, function :func:`fmod` is generally preferred when working with + floats, while Python's ``x % y`` is preferred when working with integers. + + +.. function:: frexp(x) + + Return the mantissa and exponent of *x* as the pair ``(m, e)``. *m* is a float + and *e* is an integer such that ``x == m * 2**e`` exactly. If *x* is zero, + returns ``(0.0, 0)``, otherwise ``0.5 <= abs(m) < 1``. This is used to "pick + apart" the internal representation of a float in a portable way. + + +.. function:: ldexp(x, i) + + Return ``x * (2**i)``. This is essentially the inverse of function + :func:`frexp`. + + +.. function:: modf(x) + + Return the fractional and integer parts of *x*. Both results carry the sign of + *x*, and both are floats. 
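The differences described above are easier to see with concrete numbers. The following sketch is only an illustration; the input values are chosen for the example (partly taken from the :func:`fmod` text above) and the variable names are arbitrary.

```python
import math

# fmod keeps the sign of x, while the % operator takes the sign of y.
fmod_result = math.fmod(-1e-100, 1e100)   # exactly -1e-100
mod_result = -1e-100 % 1e100              # 1e100 - 1e-100 rounds to 1e100

# frexp splits a float into mantissa and exponent; ldexp reverses the split.
m, e = math.frexp(8.0)                    # 8.0 == 0.5 * 2**4
rebuilt = math.ldexp(m, e)

# modf returns (fractional part, integer part); both carry the sign of x.
frac, whole = math.modf(-3.25)
```

Both :func:`frexp` and :func:`modf` hand back a pair, so tuple unpacking as shown is the usual way to consume them.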
+ +Note that :func:`frexp` and :func:`modf` have a different call/return pattern +than their C equivalents: they take a single argument and return a pair of +values, rather than returning their second return value through an 'output +parameter' (there is no such thing in Python). + +For the :func:`ceil`, :func:`floor`, and :func:`modf` functions, note that *all* +floating-point numbers of sufficiently large magnitude are exact integers. +Python floats typically carry no more than 53 bits of precision (the same as the +platform C double type), in which case any float *x* with ``abs(x) >= 2**52`` +necessarily has no fractional bits. + +Power and logarithmic functions: + + +.. function:: exp(x) + + Return ``e**x``. + + +.. function:: log(x[, base]) + + Return the logarithm of *x* to the given *base*. If the *base* is not specified, + return the natural logarithm of *x* (that is, the logarithm to base *e*). + + .. versionchanged:: 2.3 + *base* argument added. + + +.. function:: log10(x) + + Return the base-10 logarithm of *x*. + + +.. function:: pow(x, y) + + Return ``x**y``. + + +.. function:: sqrt(x) + + Return the square root of *x*. + +Trigonometric functions: + + +.. function:: acos(x) + + Return the arc cosine of *x*, in radians. + + +.. function:: asin(x) + + Return the arc sine of *x*, in radians. + + +.. function:: atan(x) + + Return the arc tangent of *x*, in radians. + + +.. function:: atan2(y, x) + + Return ``atan(y / x)``, in radians. The result is between ``-pi`` and ``pi``. + The vector in the plane from the origin to point ``(x, y)`` makes this angle + with the positive X axis. The point of :func:`atan2` is that the signs of both + inputs are known to it, so it can compute the correct quadrant for the angle. + For example, ``atan(1)`` and ``atan2(1, 1)`` are both ``pi/4``, but ``atan2(-1, + -1)`` is ``-3*pi/4``. + + +.. function:: cos(x) + + Return the cosine of *x* radians. + + +.. 
function:: hypot(x, y) + + Return the Euclidean norm, ``sqrt(x*x + y*y)``. This is the length of the vector + from the origin to point ``(x, y)``. + + +.. function:: sin(x) + + Return the sine of *x* radians. + + +.. function:: tan(x) + + Return the tangent of *x* radians. + +Angular conversion: + + +.. function:: degrees(x) + + Converts angle *x* from radians to degrees. + + +.. function:: radians(x) + + Converts angle *x* from degrees to radians. + +Hyperbolic functions: + + +.. function:: cosh(x) + + Return the hyperbolic cosine of *x*. + + +.. function:: sinh(x) + + Return the hyperbolic sine of *x*. + + +.. function:: tanh(x) + + Return the hyperbolic tangent of *x*. + +The module also defines two mathematical constants: + + +.. data:: pi + + The mathematical constant *pi*. + + +.. data:: e + + The mathematical constant *e*. + +.. note:: + + The :mod:`math` module consists mostly of thin wrappers around the platform C + math library functions. Behavior in exceptional cases is loosely specified + by the C standards, and Python inherits much of its math-function + error-reporting behavior from the platform C implementation. As a result, + the specific exceptions raised in error cases (and even whether some + arguments are considered to be exceptional at all) are not defined in any + useful cross-platform or cross-release way. For example, whether + ``math.log(0)`` returns ``-Inf`` or raises :exc:`ValueError` or + :exc:`OverflowError` isn't defined, and in cases where ``math.log(0)`` raises + :exc:`OverflowError`, ``math.log(0L)`` may raise :exc:`ValueError` instead. + + +.. seealso:: + + Module :mod:`cmath` + Complex number versions of many of these functions. + diff --git a/Doc/library/mhlib.rst b/Doc/library/mhlib.rst new file mode 100644 index 0000000..15d2b05 --- /dev/null +++ b/Doc/library/mhlib.rst @@ -0,0 +1,205 @@ + +:mod:`mhlib` --- Access to MH mailboxes +======================================= + +.. 
module:: mhlib + :synopsis: Manipulate MH mailboxes from Python. + + +.. % LaTeX'ized from the comments in the module by Skip Montanaro +.. % <skip@mojam.com>. + +The :mod:`mhlib` module provides a Python interface to MH folders and their +contents. + +The module contains three basic classes, :class:`MH`, which represents a +particular collection of folders, :class:`Folder`, which represents a single +folder, and :class:`Message`, which represents a single message. + + +.. class:: MH([path[, profile]]) + + :class:`MH` represents a collection of MH folders. + + +.. class:: Folder(mh, name) + + The :class:`Folder` class represents a single folder and its messages. + + +.. class:: Message(folder, number[, name]) + + :class:`Message` objects represent individual messages in a folder. The Message + class is derived from :class:`mimetools.Message`. + + +.. _mh-objects: + +MH Objects +---------- + +:class:`MH` instances have the following methods: + + +.. method:: MH.error(format[, ...]) + + Print an error message -- can be overridden. + + +.. method:: MH.getprofile(key) + + Return a profile entry (``None`` if not set). + + +.. method:: MH.getpath() + + Return the mailbox pathname. + + +.. method:: MH.getcontext() + + Return the current folder name. + + +.. method:: MH.setcontext(name) + + Set the current folder name. + + +.. method:: MH.listfolders() + + Return a list of top-level folders. + + +.. method:: MH.listallfolders() + + Return a list of all folders. + + +.. method:: MH.listsubfolders(name) + + Return a list of direct subfolders of the given folder. + + +.. method:: MH.listallsubfolders(name) + + Return a list of all subfolders of the given folder. + + +.. method:: MH.makefolder(name) + + Create a new folder. + + +.. method:: MH.deletefolder(name) + + Delete a folder -- must have no subfolders. + + +.. method:: MH.openfolder(name) + + Return a new open folder object. + + +.. 
_mh-folder-objects: + +Folder Objects +-------------- + +:class:`Folder` instances represent open folders and have the following methods: + + +.. method:: Folder.error(format[, ...]) + + Print an error message -- can be overridden. + + +.. method:: Folder.getfullname() + + Return the folder's full pathname. + + +.. method:: Folder.getsequencesfilename() + + Return the full pathname of the folder's sequences file. + + +.. method:: Folder.getmessagefilename(n) + + Return the full pathname of message *n* of the folder. + + +.. method:: Folder.listmessages() + + Return a list of messages in the folder (as numbers). + + +.. method:: Folder.getcurrent() + + Return the current message number. + + +.. method:: Folder.setcurrent(n) + + Set the current message number to *n*. + + +.. method:: Folder.parsesequence(seq) + + Parse msgs syntax into a list of messages. + + +.. method:: Folder.getlast() + + Get last message, or ``0`` if no messages are in the folder. + + +.. method:: Folder.setlast(n) + + Set last message (internal use only). + + +.. method:: Folder.getsequences() + + Return dictionary of sequences in folder. The sequence names are used as keys, + and the values are the lists of message numbers in the sequences. + + +.. method:: Folder.putsequences(dict) + + Write the dictionary of sequences (name: list) back to the folder. + + +.. method:: Folder.removemessages(list) + + Remove messages in list from folder. + + +.. method:: Folder.refilemessages(list, tofolder) + + Move messages in list to other folder. + + +.. method:: Folder.movemessage(n, tofolder, ton) + + Move one message to a given destination in another folder. + + +.. method:: Folder.copymessage(n, tofolder, ton) + + Copy one message to a given destination in another folder. + + +.. _mh-message-objects: + +Message Objects +--------------- + +The :class:`Message` class adds one method to those of +:class:`mimetools.Message`: + + +.. 
method:: Message.openmessage(n) + + Return a new open message object (costs a file descriptor). + diff --git a/Doc/library/mimetools.rst b/Doc/library/mimetools.rst new file mode 100644 index 0000000..603bec6 --- /dev/null +++ b/Doc/library/mimetools.rst @@ -0,0 +1,130 @@ + +:mod:`mimetools` --- Tools for parsing MIME messages +==================================================== + +.. module:: mimetools + :synopsis: Tools for parsing MIME-style message bodies. + + +.. deprecated:: 2.3 + The :mod:`email` package should be used in preference to the :mod:`mimetools` + module. This module is present only to maintain backward compatibility. + +.. index:: module: rfc822 + +This module defines a subclass of the :mod:`rfc822` module's :class:`Message` +class and a number of utility functions that are useful for the manipulation of +MIME multipart or encoded messages. + +It defines the following items: + + +.. class:: Message(fp[, seekable]) + + Return a new instance of the :class:`Message` class. This is a subclass of the + :class:`rfc822.Message` class, with some additional methods (see below). The + *seekable* argument has the same meaning as for :class:`rfc822.Message`. + + +.. function:: choose_boundary() + + Return a unique string that has a high likelihood of being usable as a part + boundary. The string has the form ``'hostipaddr.uid.pid.timestamp.random'``. + + +.. function:: decode(input, output, encoding) + + Read data encoded using the allowed MIME *encoding* from open file object + *input* and write the decoded data to open file object *output*. Valid values + for *encoding* include ``'base64'``, ``'quoted-printable'``, ``'uuencode'``, + ``'x-uuencode'``, ``'uue'``, ``'x-uue'``, ``'7bit'``, and ``'8bit'``. Decoding + messages encoded in ``'7bit'`` or ``'8bit'`` has no effect. The input is simply + copied to the output. + + +.. 
function:: encode(input, output, encoding) + + Read data from open file object *input* and write it encoded using the allowed + MIME *encoding* to open file object *output*. Valid values for *encoding* are + the same as for :meth:`decode`. + + +.. function:: copyliteral(input, output) + + Read lines from open file *input* until EOF and write them to open file + *output*. + + +.. function:: copybinary(input, output) + + Read blocks until EOF from open file *input* and write them to open file + *output*. The block size is currently fixed at 8192. + + +.. seealso:: + + Module :mod:`email` + Comprehensive email handling package; supersedes the :mod:`mimetools` module. + + Module :mod:`rfc822` + Provides the base class for :class:`mimetools.Message`. + + Module :mod:`multifile` + Support for reading files which contain distinct parts, such as MIME data. + + http://www.cs.uu.nl/wais/html/na-dir/mail/mime-faq/.html + The MIME Frequently Asked Questions document. For an overview of MIME, see the + answer to question 1.1 in Part 1 of this document. + + +.. _mimetools-message-objects: + +Additional Methods of Message Objects +------------------------------------- + +The :class:`Message` class defines the following methods in addition to the +:class:`rfc822.Message` methods: + + +.. method:: Message.getplist() + + Return the parameter list of the :mailheader:`Content-Type` header. This is a + list of strings. For parameters of the form ``key=value``, *key* is converted + to lower case but *value* is not. For example, if the message contains the + header ``Content-type: text/html; spam=1; Spam=2; Spam`` then :meth:`getplist` + will return the Python list ``['spam=1', 'spam=2', 'Spam']``. + + +.. method:: Message.getparam(name) + + Return the *value* of the first parameter (as returned by :meth:`getplist`) of + the form ``name=value`` for the given *name*. If *value* is surrounded by + quotes of the form '``<``...\ ``>``' or '``"``...\ ``"``', these are removed. + + +.. 
method:: Message.getencoding() + + Return the encoding specified in the :mailheader:`Content-Transfer-Encoding` + message header. If no such header exists, return ``'7bit'``. The encoding is + converted to lower case. + + +.. method:: Message.gettype() + + Return the message type (of the form ``type/subtype``) as specified in the + :mailheader:`Content-Type` header. If no such header exists, return + ``'text/plain'``. The type is converted to lower case. + + +.. method:: Message.getmaintype() + + Return the main type as specified in the :mailheader:`Content-Type` header. If + no such header exists, return ``'text'``. The main type is converted to lower + case. + + +.. method:: Message.getsubtype() + + Return the subtype as specified in the :mailheader:`Content-Type` header. If no + such header exists, return ``'plain'``. The subtype is converted to lower case. + diff --git a/Doc/library/mimetypes.rst b/Doc/library/mimetypes.rst new file mode 100644 index 0000000..fd5e12d --- /dev/null +++ b/Doc/library/mimetypes.rst @@ -0,0 +1,232 @@ + +:mod:`mimetypes` --- Map filenames to MIME types +================================================ + +.. module:: mimetypes + :synopsis: Mapping of filename extensions to MIME types. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. index:: pair: MIME; content type + +The :mod:`mimetypes` module converts between a filename or URL and the MIME type +associated with the filename extension. Conversions are provided from filename +to MIME type and from MIME type to filename extension; encodings are not +supported for the latter conversion. + +The module provides one class and a number of convenience functions. The +functions are the normal interface to this module, but some applications may be +interested in the class as well. + +The functions described below provide the primary interface for this module. 
If +the module has not been initialized, they will call :func:`init` if they rely on +the information :func:`init` sets up. + + +.. function:: guess_type(filename[, strict]) + + .. index:: pair: MIME; headers + + Guess the type of a file based on its filename or URL, given by *filename*. The + return value is a tuple ``(type, encoding)`` where *type* is ``None`` if the + type can't be guessed (missing or unknown suffix) or a string of the form + ``'type/subtype'``, usable for a MIME :mailheader:`content-type` header. + + *encoding* is ``None`` for no encoding or the name of the program used to encode + (e.g. :program:`compress` or :program:`gzip`). The encoding is suitable for use + as a :mailheader:`Content-Encoding` header, *not* as a + :mailheader:`Content-Transfer-Encoding` header. The mappings are table driven. + Encoding suffixes are case sensitive; type suffixes are first tried case + sensitively, then case insensitively. + + Optional *strict* is a flag specifying whether the list of known MIME types + is limited to only the official types `registered with IANA + <http://www.isi.edu/in-notes/iana/assignments/media-types>`_. + When *strict* is true (the default), only the IANA types are supported; when + *strict* is false, some additional non-standard but commonly used MIME types + are also recognized. + + +.. function:: guess_all_extensions(type[, strict]) + + Guess the extensions for a file based on its MIME type, given by *type*. The + return value is a list of strings giving all possible filename extensions, + including the leading dot (``'.'``). The extensions are not guaranteed to have + been associated with any particular data stream, but would be mapped to the MIME + type *type* by :func:`guess_type`. + + Optional *strict* has the same meaning as with the :func:`guess_type` function. + + +.. function:: guess_extension(type[, strict]) + + Guess the extension for a file based on its MIME type, given by *type*. 
The + return value is a string giving a filename extension, including the leading dot + (``'.'``). The extension is not guaranteed to have been associated with any + particular data stream, but would be mapped to the MIME type *type* by + :func:`guess_type`. If no extension can be guessed for *type*, ``None`` is + returned. + + Optional *strict* has the same meaning as with the :func:`guess_type` function. + +Some additional functions and data items are available for controlling the +behavior of the module. + + +.. function:: init([files]) + + Initialize the internal data structures. If given, *files* must be a sequence + of file names which should be used to augment the default type map. If omitted, + the file names to use are taken from :const:`knownfiles`. Each file named in + *files* or :const:`knownfiles` takes precedence over those named before it. + Calling :func:`init` repeatedly is allowed. + + +.. function:: read_mime_types(filename) + + Load the type map given in the file *filename*, if it exists. The type map is + returned as a dictionary mapping filename extensions, including the leading dot + (``'.'``), to strings of the form ``'type/subtype'``. If the file *filename* + does not exist or cannot be read, ``None`` is returned. + + +.. function:: add_type(type, ext[, strict]) + + Add a mapping from the mimetype *type* to the extension *ext*. When the + extension is already known, the new type will replace the old one. When the type + is already known the extension will be added to the list of known extensions. + + When *strict* is true, the mapping will be added to the official MIME types, otherwise to + the non-standard ones. + + +.. data:: inited + + Flag indicating whether or not the global data structures have been initialized. + This is set to true by :func:`init`. + + +.. data:: knownfiles + + .. index:: single: file; mime.types + + List of type map file names commonly installed. 
These files are typically named + :file:`mime.types` and are installed in different locations by different + packages. + + +.. data:: suffix_map + + Dictionary mapping suffixes to suffixes. This is used to allow recognition of + encoded files for which the encoding and the type are indicated by the same + extension. For example, the :file:`.tgz` extension is mapped to :file:`.tar.gz` + to allow the encoding and type to be recognized separately. + + +.. data:: encodings_map + + Dictionary mapping filename extensions to encoding types. + + +.. data:: types_map + + Dictionary mapping filename extensions to MIME types. + + +.. data:: common_types + + Dictionary mapping filename extensions to non-standard, but commonly found MIME + types. + +The :class:`MimeTypes` class may be useful for applications which may want more +than one MIME-type database: + + +.. class:: MimeTypes([filenames]) + + This class represents a MIME-types database. By default, it provides access to + the same database as the rest of this module. The initial database is a copy of + that provided by the module, and may be extended by loading additional + :file:`mime.types`\ -style files into the database using the :meth:`read` or + :meth:`readfp` methods. The mapping dictionaries may also be cleared before + loading additional data if the default data is not desired. + + The optional *filenames* parameter can be used to cause additional files to be + loaded "on top" of the default database. + + .. versionadded:: 2.2 + +An example usage of the module:: + + >>> import mimetypes + >>> mimetypes.init() + >>> mimetypes.knownfiles + ['/etc/mime.types', '/etc/httpd/mime.types', ... ] + >>> mimetypes.suffix_map['.tgz'] + '.tar.gz' + >>> mimetypes.encodings_map['.gz'] + 'gzip' + >>> mimetypes.types_map['.tgz'] + 'application/x-tar-gz' + + +.. _mimetypes-objects: + +MimeTypes Objects +----------------- + +:class:`MimeTypes` instances provide an interface which is very like that of the +:mod:`mimetypes` module. 
+ + +.. attribute:: MimeTypes.suffix_map + + Dictionary mapping suffixes to suffixes. This is used to allow recognition of + encoded files for which the encoding and the type are indicated by the same + extension. For example, the :file:`.tgz` extension is mapped to :file:`.tar.gz` + to allow the encoding and type to be recognized separately. This is initially a + copy of the global ``suffix_map`` defined in the module. + + +.. attribute:: MimeTypes.encodings_map + + Dictionary mapping filename extensions to encoding types. This is initially a + copy of the global ``encodings_map`` defined in the module. + + +.. attribute:: MimeTypes.types_map + + Dictionary mapping filename extensions to MIME types. This is initially a copy + of the global ``types_map`` defined in the module. + + +.. attribute:: MimeTypes.common_types + + Dictionary mapping filename extensions to non-standard, but commonly found MIME + types. This is initially a copy of the global ``common_types`` defined in the + module. + + +.. method:: MimeTypes.guess_extension(type[, strict]) + + Similar to the :func:`guess_extension` function, using the tables stored as part + of the object. + + +.. method:: MimeTypes.guess_type(url[, strict]) + + Similar to the :func:`guess_type` function, using the tables stored as part of + the object. + + +.. method:: MimeTypes.read(path) + + Load MIME information from a file named *path*. This uses :meth:`readfp` to + parse the file. + + +.. method:: MimeTypes.readfp(file) + + Load MIME type information from an open file. The file must have the format of + the standard :file:`mime.types` files. + diff --git a/Doc/library/miniaeframe.rst b/Doc/library/miniaeframe.rst new file mode 100644 index 0000000..5bf1b07 --- /dev/null +++ b/Doc/library/miniaeframe.rst @@ -0,0 +1,68 @@ + +:mod:`MiniAEFrame` --- Open Scripting Architecture server support +================================================================= + +.. 
module:: MiniAEFrame + :platform: Mac + :synopsis: Support to act as an Open Scripting Architecture (OSA) server ("Apple Events"). + + +.. index:: + single: Open Scripting Architecture + single: AppleEvents + module: FrameWork + +The module :mod:`MiniAEFrame` provides a framework for an application that can +function as an Open Scripting Architecture (OSA) server, i.e. receive and +process AppleEvents. It can be used in conjunction with :mod:`FrameWork` or +standalone. As an example, it is used in :program:`PythonCGISlave`. + +The :mod:`MiniAEFrame` module defines the following classes: + + +.. class:: AEServer() + + A class that handles AppleEvent dispatch. Your application should subclass this + class together with either :class:`MiniApplication` or + :class:`FrameWork.Application`. Your :meth:`__init__` method should call the + :meth:`__init__` method for both classes. + + +.. class:: MiniApplication() + + A class that is more or less compatible with :class:`FrameWork.Application` but + with less functionality. Its event loop supports the apple menu, command-dot and + AppleEvents; other events are passed on to the Python interpreter and/or Sioux. + Useful if your application wants to use :class:`AEServer` but does not provide + its own windows, etc. + + +.. _aeserver-objects: + +AEServer Objects +---------------- + + +.. method:: AEServer.installaehandler(classe, type, callback) + + Installs an AppleEvent handler. *classe* and *type* are the four-character OSA + Class and Type designators, ``'****'`` wildcards are allowed. When a matching + AppleEvent is received the parameters are decoded and your callback is invoked. + + +.. method:: AEServer.callback(_object, **kwargs) + + Your callback is called with the OSA Direct Object as first positional + parameter. The other parameters are passed as keyword arguments, with the + 4-character designator as name. 
Three extra keyword parameters are passed: + ``_class`` and ``_type`` are the Class and Type designators and ``_attributes`` + is a dictionary with the AppleEvent attributes. + + The return value of your method is packed with :func:`aetools.packevent` and + sent as reply. + +Note that there are some serious problems with the current design. AppleEvents +which have non-identifier 4-character designators for arguments are not +implementable, and it is not possible to return an error to the originator. This +will be addressed in a future release. + diff --git a/Doc/library/misc.rst b/Doc/library/misc.rst new file mode 100644 index 0000000..ee22561 --- /dev/null +++ b/Doc/library/misc.rst @@ -0,0 +1,14 @@ + +.. _misc: + +********************** +Miscellaneous Services +********************** + +The modules described in this chapter provide miscellaneous services that are +available in all Python versions. Here's an overview: + + +.. toctree:: + + formatter.rst diff --git a/Doc/library/mm.rst b/Doc/library/mm.rst new file mode 100644 index 0000000..a7fbbec --- /dev/null +++ b/Doc/library/mm.rst @@ -0,0 +1,23 @@ + +.. _mmedia: + +******************* +Multimedia Services +******************* + +The modules described in this chapter implement various algorithms or interfaces +that are mainly useful for multimedia applications. They are available at the +discretion of the installation. Here's an overview: + + +.. toctree:: + + audioop.rst + aifc.rst + sunau.rst + wave.rst + chunk.rst + colorsys.rst + imghdr.rst + sndhdr.rst + ossaudiodev.rst diff --git a/Doc/library/mmap.rst b/Doc/library/mmap.rst new file mode 100644 index 0000000..abe5b7b --- /dev/null +++ b/Doc/library/mmap.rst @@ -0,0 +1,173 @@ + +:mod:`mmap` --- Memory-mapped file support +========================================== + +.. module:: mmap + :synopsis: Interface to memory-mapped files for Unix and Windows. + + +Memory-mapped file objects behave like both strings and like file objects. 
+Unlike normal string objects, however, these are mutable. You can use mmap +objects in most places where strings are expected; for example, you can use the +:mod:`re` module to search through a memory-mapped file. Since they're mutable, +you can change a single character by doing ``obj[index] = 'a'``, or change a +substring by assigning to a slice: ``obj[i1:i2] = '...'``. You can also read +and write data starting at the current file position, and :meth:`seek` through +the file to different positions. + +A memory-mapped file is created by the :func:`mmap` function, which is different +on Unix and on Windows. In either case you must provide a file descriptor for a +file opened for update. If you wish to map an existing Python file object, use +its :meth:`fileno` method to obtain the correct value for the *fileno* +parameter. Otherwise, you can open the file using the :func:`os.open` function, +which returns a file descriptor directly (the file still needs to be closed when +done). + +For both the Unix and Windows versions of the function, *access* may be +specified as an optional keyword parameter. *access* accepts one of three +values: :const:`ACCESS_READ`, :const:`ACCESS_WRITE`, or :const:`ACCESS_COPY` to +specify readonly, write-through or copy-on-write memory respectively. *access* +can be used on both Unix and Windows. If *access* is not specified, Windows +mmap returns a write-through mapping. The initial memory values for all three +access types are taken from the specified file. Assignment to an +:const:`ACCESS_READ` memory map raises a :exc:`TypeError` exception. Assignment +to an :const:`ACCESS_WRITE` memory map affects both memory and the underlying +file. Assignment to an :const:`ACCESS_COPY` memory map affects memory but does +not update the underlying file. + +.. versionchanged:: 2.5 + To map anonymous memory, -1 should be passed as the fileno along with the + length. + + +.. 
function:: mmap(fileno, length[, tagname[, access]]) + + **(Windows version)** Maps *length* bytes from the file specified by the file + handle *fileno*, and returns a mmap object. If *length* is larger than the + current size of the file, the file is extended to contain *length* bytes. If + *length* is ``0``, the maximum length of the map is the current size of the + file, except that if the file is empty Windows raises an exception (you cannot + create an empty mapping on Windows). + + *tagname*, if specified and not ``None``, is a string giving a tag name for the + mapping. Windows allows you to have many different mappings against the same + file. If you specify the name of an existing tag, that tag is opened, otherwise + a new tag of this name is created. If this parameter is omitted or ``None``, + the mapping is created without a name. Avoiding the use of the tag parameter + will assist in keeping your code portable between Unix and Windows. + + +.. function:: mmap(fileno, length[, flags[, prot[, access]]]) + :noindex: + + **(Unix version)** Maps *length* bytes from the file specified by the file + descriptor *fileno*, and returns a mmap object. If *length* is ``0``, the + maximum length of the map will be the current size of the file when :func:`mmap` + is called. + + *flags* specifies the nature of the mapping. :const:`MAP_PRIVATE` creates a + private copy-on-write mapping, so changes to the contents of the mmap object + will be private to this process, and :const:`MAP_SHARED` creates a mapping + that's shared with all other processes mapping the same areas of the file. The + default value is :const:`MAP_SHARED`. + + *prot*, if specified, gives the desired memory protection; the two most useful + values are :const:`PROT_READ` and :const:`PROT_WRITE`, to specify that the pages + may be read or written. *prot* defaults to :const:`PROT_READ \| PROT_WRITE`. + + *access* may be specified in lieu of *flags* and *prot* as an optional keyword + parameter. 
It is an error to specify both *flags*, *prot* and *access*. See + the description of *access* above for information on how to use this parameter. + +Memory-mapped file objects support the following methods: + + +.. method:: mmap.close() + + Close the file. Subsequent calls to other methods of the object will result in + an exception being raised. + + +.. method:: mmap.find(string[, start]) + + Returns the lowest index in the object where the substring *string* is found. + Returns ``-1`` on failure. *start* is the index at which the search begins, and + defaults to zero. + + +.. method:: mmap.flush([offset, size]) + + Flushes changes made to the in-memory copy of a file back to disk. Without use + of this call there is no guarantee that changes are written back before the + object is destroyed. If *offset* and *size* are specified, only changes to the + given range of bytes will be flushed to disk; otherwise, the whole extent of the + mapping is flushed. + + +.. method:: mmap.move(dest, src, count) + + Copy the *count* bytes starting at offset *src* to the destination index *dest*. + If the mmap was created with :const:`ACCESS_READ`, then calls to move will throw + a :exc:`TypeError` exception. + + +.. method:: mmap.read(num) + + Return a string containing up to *num* bytes starting from the current file + position; the file position is updated to point after the bytes that were + returned. + + +.. method:: mmap.read_byte() + + Returns a string of length 1 containing the character at the current file + position, and advances the file position by 1. + + +.. method:: mmap.readline() + + Returns a single line, starting at the current file position and up to the next + newline. + + +.. method:: mmap.resize(newsize) + + Resizes the map and the underlying file, if any. If the mmap was created with + :const:`ACCESS_READ` or :const:`ACCESS_COPY`, resizing the map will throw a + :exc:`TypeError` exception. + + +.. 
method:: mmap.seek(pos[, whence]) + + Set the file's current position. *whence* argument is optional and defaults to + ``os.SEEK_SET`` or ``0`` (absolute file positioning); other values are + ``os.SEEK_CUR`` or ``1`` (seek relative to the current position) and + ``os.SEEK_END`` or ``2`` (seek relative to the file's end). + + +.. method:: mmap.size() + + Return the length of the file, which can be larger than the size of the + memory-mapped area. + + +.. method:: mmap.tell() + + Returns the current position of the file pointer. + + +.. method:: mmap.write(string) + + Write the bytes in *string* into memory at the current position of the file + pointer; the file position is updated to point after the bytes that were + written. If the mmap was created with :const:`ACCESS_READ`, then writing to it + will throw a :exc:`TypeError` exception. + + +.. method:: mmap.write_byte(byte) + + Write the single-character string *byte* into memory at the current position of + the file pointer; the file position is advanced by ``1``. If the mmap was + created with :const:`ACCESS_READ`, then writing to it will throw a + :exc:`TypeError` exception. + diff --git a/Doc/library/modulefinder.rst b/Doc/library/modulefinder.rst new file mode 100644 index 0000000..334bd5d --- /dev/null +++ b/Doc/library/modulefinder.rst @@ -0,0 +1,52 @@ + +:mod:`modulefinder` --- Find modules used by a script +===================================================== + +.. sectionauthor:: A.M. Kuchling <amk@amk.ca> + + +.. module:: modulefinder + :synopsis: Find modules used by a script. + + +.. versionadded:: 2.3 + +This module provides a :class:`ModuleFinder` class that can be used to determine +the set of modules imported by a script. ``modulefinder.py`` can also be run as +a script, giving the filename of a Python script as its argument, after which a +report of the imported modules will be printed. + + +.. 
function:: AddPackagePath(pkg_name, path) + + Record that the package named *pkg_name* can be found in the specified *path*. + + +.. function:: ReplacePackage(oldname, newname) + + Allows specifying that the module named *oldname* is in fact the package named + *newname*. The most common usage would be to handle how the :mod:`_xmlplus` + package replaces the :mod:`xml` package. + + +.. class:: ModuleFinder([path=None, debug=0, excludes=[], replace_paths=[]]) + + This class provides :meth:`run_script` and :meth:`report` methods to determine + the set of modules imported by a script. *path* can be a list of directories to + search for modules; if not specified, ``sys.path`` is used. *debug* sets the + debugging level; higher values make the class print debugging messages about + what it's doing. *excludes* is a list of module names to exclude from the + analysis. *replace_paths* is a list of ``(oldpath, newpath)`` tuples that will + be replaced in module paths. + + +.. method:: ModuleFinder.report() + + Print a report to standard output that lists the modules imported by the script + and their paths, as well as modules that are missing or seem to be missing. + + +.. method:: ModuleFinder.run_script(pathname) + + Analyze the contents of the *pathname* file, which must contain Python code. + diff --git a/Doc/library/modules.rst b/Doc/library/modules.rst new file mode 100644 index 0000000..2590a3a --- /dev/null +++ b/Doc/library/modules.rst @@ -0,0 +1,20 @@ + +.. _modules: + +***************** +Importing Modules +***************** + +The modules described in this chapter provide new ways to import other Python +modules and hooks for customizing the import process. + +The full list of modules described in this chapter is: + + +.. 
toctree:: + + imp.rst + zipimport.rst + pkgutil.rst + modulefinder.rst + runpy.rst diff --git a/Doc/library/msilib.rst b/Doc/library/msilib.rst new file mode 100644 index 0000000..6c7955a --- /dev/null +++ b/Doc/library/msilib.rst @@ -0,0 +1,537 @@ + +:mod:`msilib` --- Read and write Microsoft Installer files +========================================================== + +.. module:: msilib + :platform: Windows + :synopsis: Creation of Microsoft Installer files, and CAB files. +.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. index:: single: msi + +.. versionadded:: 2.5 + +The :mod:`msilib` module supports the creation of Microsoft Installer (``.msi``) files. +Because these files often contain an embedded "cabinet" file (``.cab``), it also +exposes an API to create CAB files. Support for reading ``.cab`` files is +currently not implemented; read support for the ``.msi`` database is possible. + +This package aims to provide complete access to all tables in an ``.msi`` file; +therefore, it is a fairly low-level API. Two primary applications of this +package are the :mod:`distutils` command ``bdist_msi``, and the creation of the +Python installer package itself (although that currently uses a different +version of ``msilib``). + +The package contents can be roughly split into four parts: low-level CAB +routines, low-level MSI routines, higher-level MSI routines, and standard table +structures. + + +.. function:: FCICreate(cabname, files) + + Create a new CAB file named *cabname*. *files* must be a list of tuples, each + containing the name of the file on disk, and the name of the file inside the CAB + file. + + The files are added to the CAB file in the order they appear in the list. All + files are added into a single CAB file, using the MSZIP compression algorithm. + + Callbacks to Python for the various steps of MSI creation are currently not + exposed. + + +.. 
function:: UUIDCreate() + + Return the string representation of a new unique identifier. This wraps the + Windows API functions :cfunc:`UuidCreate` and :cfunc:`UuidToString`. + + +.. function:: OpenDatabase(path, persist) + + Return a new database object by calling MsiOpenDatabase. *path* is the file + name of the MSI file; *persist* can be one of the constants + ``MSIDBOPEN_CREATEDIRECT``, ``MSIDBOPEN_CREATE``, ``MSIDBOPEN_DIRECT``, + ``MSIDBOPEN_READONLY``, or ``MSIDBOPEN_TRANSACT``, and may include the flag + ``MSIDBOPEN_PATCHFILE``. See the Microsoft documentation for the meaning of + these flags; depending on the flags, an existing database is opened, or a new + one created. + + +.. function:: CreateRecord(count) + + Return a new record object by calling :cfunc:`MSICreateRecord`. *count* is the + number of fields of the record. + + +.. function:: init_database(name, schema, ProductName, ProductCode, ProductVersion, Manufacturer) + + Create and return a new database *name*, initialize it with *schema*, and set + the properties *ProductName*, *ProductCode*, *ProductVersion*, and + *Manufacturer*. + + *schema* must be a module object containing ``tables`` and + ``_Validation_records`` attributes; typically, :mod:`msilib.schema` should be + used. + + The database will contain just the schema and the validation records when this + function returns. + + +.. function:: add_data(database, records) + + Add all *records* to *database*. *records* should be a list of tuples, each one + containing all fields of a record according to the schema of the table. For + optional fields, ``None`` can be passed. + + Field values can be int or long numbers, strings, or instances of the Binary + class. + + +.. class:: Binary(filename) + + Represents entries in the Binary table; inserting such an object using + :func:`add_data` reads the file named *filename* into the table. + + +.. function:: add_tables(database, module) + + Add all table content from *module* to *database*. 
*module* must contain an + attribute *tables* listing all tables for which content should be added, and one + attribute per table that has the actual content. + + This is typically used to install the sequence tables. + + +.. function:: add_stream(database, name, path) + + Add the file *path* into the ``_Stream`` table of *database*, with the stream + name *name*. + + +.. function:: gen_uuid() + + Return a new UUID, in the format that MSI typically requires (i.e. in curly + braces, and with all hexdigits in upper-case). + + +.. seealso:: + + `FCICreateFile <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/devnotes/winprog/fcicreate.asp>`_ + `UuidCreate <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/rpc/rpc/uuidcreate.asp>`_ + `UuidToString <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/rpc/rpc/uuidtostring.asp>`_ + +.. _database-objects: + +Database Objects +---------------- + + +.. method:: Database.OpenView(sql) + + Return a view object, by calling :cfunc:`MSIDatabaseOpenView`. *sql* is the SQL + statement to execute. + + +.. method:: Database.Commit() + + Commit the changes pending in the current transaction, by calling + :cfunc:`MSIDatabaseCommit`. + + +.. method:: Database.GetSummaryInformation(count) + + Return a new summary information object, by calling + :cfunc:`MsiGetSummaryInformation`. *count* is the maximum number of updated + values. + + +.. seealso:: + + `MSIOpenView <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiopenview.asp>`_ + `MSIDatabaseCommit <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msidatabasecommit.asp>`_ + `MSIGetSummaryInformation <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msigetsummaryinformation.asp>`_ + +.. _view-objects: + +View Objects +------------ + + +.. method:: View.Execute([params=None]) + + Execute the SQL query of the view, through :cfunc:`MSIViewExecute`. 
*params* is + an optional record describing actual values of the parameter tokens in the + query. + + +.. method:: View.GetColumnInfo(kind) + + Return a record describing the columns of the view, through calling + :cfunc:`MsiViewGetColumnInfo`. *kind* can be either ``MSICOLINFO_NAMES`` or + ``MSICOLINFO_TYPES``. + + +.. method:: View.Fetch() + + Return a result record of the query, through calling :cfunc:`MsiViewFetch`. + + +.. method:: View.Modify(kind, data) + + Modify the view, by calling :cfunc:`MsiViewModify`. *kind* can be one of + ``MSIMODIFY_SEEK``, ``MSIMODIFY_REFRESH``, ``MSIMODIFY_INSERT``, + ``MSIMODIFY_UPDATE``, ``MSIMODIFY_ASSIGN``, ``MSIMODIFY_REPLACE``, + ``MSIMODIFY_MERGE``, ``MSIMODIFY_DELETE``, ``MSIMODIFY_INSERT_TEMPORARY``, + ``MSIMODIFY_VALIDATE``, ``MSIMODIFY_VALIDATE_NEW``, + ``MSIMODIFY_VALIDATE_FIELD``, or ``MSIMODIFY_VALIDATE_DELETE``. + + *data* must be a record describing the new data. + + +.. method:: View.Close() + + Close the view, through :cfunc:`MsiViewClose`. + + +.. seealso:: + + `MsiViewExecute <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewexecute.asp>`_ + `MSIViewGetColumnInfo <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewgetcolumninfo.asp>`_ + `MsiViewFetch <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewfetch.asp>`_ + `MsiViewModify <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewmodify.asp>`_ + `MsiViewClose <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msiviewclose.asp>`_ + +.. _summary-objects: + +Summary Information Objects +--------------------------- + + +.. method:: SummaryInformation.GetProperty(field) + + Return a property of the summary, through :cfunc:`MsiSummaryInfoGetProperty`. 
+ *field* is the name of the property, and can be one of the constants + ``PID_CODEPAGE``, ``PID_TITLE``, ``PID_SUBJECT``, ``PID_AUTHOR``, + ``PID_KEYWORDS``, ``PID_COMMENTS``, ``PID_TEMPLATE``, ``PID_LASTAUTHOR``, + ``PID_REVNUMBER``, ``PID_LASTPRINTED``, ``PID_CREATE_DTM``, + ``PID_LASTSAVE_DTM``, ``PID_PAGECOUNT``, ``PID_WORDCOUNT``, ``PID_CHARCOUNT``, + ``PID_APPNAME``, or ``PID_SECURITY``. + + +.. method:: SummaryInformation.GetPropertyCount() + + Return the number of summary properties, through + :cfunc:`MsiSummaryInfoGetPropertyCount`. + + +.. method:: SummaryInformation.SetProperty(field, value) + + Set a property through :cfunc:`MsiSummaryInfoSetProperty`. *field* can have the + same values as in :meth:`GetProperty`, *value* is the new value of the property. + Possible value types are integer and string. + + +.. method:: SummaryInformation.Persist() + + Write the modified properties to the summary information stream, using + :cfunc:`MsiSummaryInfoPersist`. + + +.. seealso:: + + `MsiSummaryInfoGetProperty <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfogetproperty.asp>`_ + `MsiSummaryInfoGetPropertyCount <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfogetpropertycount.asp>`_ + `MsiSummaryInfoSetProperty <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfosetproperty.asp>`_ + `MsiSummaryInfoPersist <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msisummaryinfopersist.asp>`_ + +.. _record-objects: + +Record Objects +-------------- + + +.. method:: Record.GetFieldCount() + + Return the number of fields of the record, through + :cfunc:`MsiRecordGetFieldCount`. + + +.. method:: Record.SetString(field, value) + + Set *field* to *value* through :cfunc:`MsiRecordSetString`. *field* must be an + integer; *value* a string. + + +.. 
method:: Record.SetStream(field, value) + + Set *field* to the contents of the file named *value*, through + :cfunc:`MsiRecordSetStream`. *field* must be an integer; *value* a string. + + +.. method:: Record.SetInteger(field, value) + + Set *field* to *value* through :cfunc:`MsiRecordSetInteger`. Both *field* and + *value* must be an integer. + + +.. method:: Record.ClearData() + + Set all fields of the record to 0, through :cfunc:`MsiRecordClearData`. + + +.. seealso:: + + `MsiRecordGetFieldCount <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordgetfieldcount.asp>`_ + `MsiRecordSetString <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordsetstring.asp>`_ + `MsiRecordSetStream <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordsetstream.asp>`_ + `MsiRecordSetInteger <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordsetinteger.asp>`_ + `MsiRecordClear <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/msirecordclear.asp>`_ + +.. _msi-errors: + +Errors +------ + +All wrappers around MSI functions raise :exc:`MsiError`; the string inside the +exception will contain more detail. + + +.. _cab: + +CAB Objects +----------- + + +.. class:: CAB(name) + + The class :class:`CAB` represents a CAB file. During MSI construction, files + will be added simultaneously to the ``Files`` table, and to a CAB file. Then, + when all files have been added, the CAB file can be written, then added to the + MSI file. + + *name* is the name of the CAB file in the MSI file. + + +.. method:: CAB.append(full, file, logical) + + Add the file with the pathname *full* to the CAB file, under the name *logical*. + If there is already a file named *logical*, a new file name is created. + + Return the index of the file in the CAB file, and the new name of the file + inside the CAB file. + + +.. 
method:: CAB.commit(database) + + Generate a CAB file, add it as a stream to the MSI file, put it into the + ``Media`` table, and remove the generated file from the disk. + + +.. _msi-directory: + +Directory Objects +----------------- + + +.. class:: Directory(database, cab, basedir, physical, logical, default, component, [componentflags]) + + Create a new directory in the Directory table. There is a current component at + each point in time for the directory, which is either explicitly created through + :meth:`start_component`, or implicitly when files are added for the first time. + Files are added into the current component, and into the cab file. To create a + directory, a base directory object needs to be specified (can be ``None``), the + path to the physical directory, and a logical directory name. *default* + specifies the DefaultDir slot in the directory table. *componentflags* specifies + the default flags that new components get. + + +.. method:: Directory.start_component([component[, feature[, flags[, keyfile[, uuid]]]]]) + + Add an entry to the Component table, and make this component the current + component for this directory. If no component name is given, the directory name + is used. If no *feature* is given, the current feature is used. If no *flags* + are given, the directory's default flags are used. If no *keyfile* is given, the + KeyPath is left null in the Component table. + + +.. method:: Directory.add_file(file[, src[, version[, language]]]) + + Add a file to the current component of the directory, starting a new one if + there is no current component. By default, the file name in the source and the + file table will be identical. If the *src* file is specified, it is interpreted + relative to the current directory. Optionally, a *version* and a *language* can + be specified for the entry in the File table. + + +.. method:: Directory.glob(pattern[, exclude]) + + Add a list of files to the current component as specified in the glob pattern. 
+ Individual files can be excluded in the *exclude* list. + + +.. method:: Directory.remove_pyc() + + Remove ``.pyc``/``.pyo`` files on uninstall. + + +.. seealso:: + + `Directory Table <http://msdn.microsoft.com/library/en-us/msi/setup/directory_table.asp>`_ + `File Table <http://msdn.microsoft.com/library/en-us/msi/setup/file_table.asp>`_ + `Component Table <http://msdn.microsoft.com/library/en-us/msi/setup/component_table.asp>`_ + `FeatureComponents Table <http://msdn.microsoft.com/library/en-us/msi/setup/featurecomponents_table.asp>`_ + +.. _features: + +Features +-------- + + +.. class:: Feature(database, id, title, desc, display[, level=1[, parent[, directory[, attributes=0]]]]) + + Add a new record to the ``Feature`` table, using the values *id*, *parent.id*, + *title*, *desc*, *display*, *level*, *directory*, and *attributes*. The + resulting feature object can be passed to the :meth:`start_component` method of + :class:`Directory`. + + +.. method:: Feature.set_current() + + Make this feature the current feature of :mod:`msilib`. New components are + automatically added to the default feature, unless a feature is explicitly + specified. + + +.. seealso:: + + `Feature Table <http://msdn.microsoft.com/library/en-us/msi/setup/feature_table.asp>`_ + +.. _msi-gui: + +GUI classes +----------- + +:mod:`msilib` provides several classes that wrap the GUI tables in an MSI +database. However, no standard user interface is provided; use :mod:`bdist_msi` +to create MSI files with a user-interface for installing Python packages. + + +.. class:: Control(dlg, name) + + Base class of the dialog controls. *dlg* is the dialog object the control + belongs to, and *name* is the control's name. + + +.. method:: Control.event(event, argument[, condition=1[, ordering]]) + + Make an entry into the ``ControlEvent`` table for this control. + + +.. method:: Control.mapping(event, attribute) + + Make an entry into the ``EventMapping`` table for this control. + + +.. 
method:: Control.condition(action, condition) + + Make an entry into the ``ControlCondition`` table for this control. + + +.. class:: RadioButtonGroup(dlg, name, property) + + Create a radio button control named *name*. *property* is the installer property + that gets set when a radio button is selected. + + +.. method:: RadioButtonGroup.add(name, x, y, width, height, text [, value]) + + Add a radio button named *name* to the group, at the coordinates *x*, *y*, + *width*, *height*, and with the label *text*. If *value* is omitted, it defaults + to *name*. + + +.. class:: Dialog(db, name, x, y, w, h, attr, title, first, default, cancel) + + Return a new :class:`Dialog` object. An entry in the ``Dialog`` table is made, + with the specified coordinates, dialog attributes, title, name of the first, + default, and cancel controls. + + +.. method:: Dialog.control(name, type, x, y, width, height, attributes, property, text, control_next, help) + + Return a new :class:`Control` object. An entry in the ``Control`` table is made + with the specified parameters. + + This is a generic method; for specific types, specialized methods are provided. + + +.. method:: Dialog.text(name, x, y, width, height, attributes, text) + + Add and return a ``Text`` control. + + +.. method:: Dialog.bitmap(name, x, y, width, height, text) + + Add and return a ``Bitmap`` control. + + +.. method:: Dialog.line(name, x, y, width, height) + + Add and return a ``Line`` control. + + +.. method:: Dialog.pushbutton(name, x, y, width, height, attributes, text, next_control) + + Add and return a ``PushButton`` control. + + +.. method:: Dialog.radiogroup(name, x, y, width, height, attributes, property, text, next_control) + + Add and return a ``RadioButtonGroup`` control. + + +.. method:: Dialog.checkbox(name, x, y, width, height, attributes, property, text, next_control) + + Add and return a ``CheckBox`` control. + + +.. 
seealso:: + + `Dialog Table <http://msdn.microsoft.com/library/en-us/msi/setup/dialog_table.asp>`_ + `Control Table <http://msdn.microsoft.com/library/en-us/msi/setup/control_table.asp>`_ + `Control Types <http://msdn.microsoft.com/library/en-us/msi/setup/controls.asp>`_ + `ControlCondition Table <http://msdn.microsoft.com/library/en-us/msi/setup/controlcondition_table.asp>`_ + `ControlEvent Table <http://msdn.microsoft.com/library/en-us/msi/setup/controlevent_table.asp>`_ + `EventMapping Table <http://msdn.microsoft.com/library/en-us/msi/setup/eventmapping_table.asp>`_ + `RadioButton Table <http://msdn.microsoft.com/library/en-us/msi/setup/radiobutton_table.asp>`_ + +.. _msi-tables: + +Precomputed tables +------------------ + +:mod:`msilib` provides a few subpackages that contain only schema and table +definitions. Currently, these definitions are based on MSI version 2.0. + + +.. data:: schema + + This is the standard MSI schema for MSI 2.0, with the *tables* variable + providing a list of table definitions, and *_Validation_records* providing the + data for MSI validation. + + +.. data:: sequence + + This module contains table contents for the standard sequence tables: + *AdminExecuteSequence*, *AdminUISequence*, *AdvtExecuteSequence*, + *InstallExecuteSequence*, and *InstallUISequence*. + + +.. data:: text + + This module contains definitions for the UIText and ActionText tables, for the + standard installer actions. + diff --git a/Doc/library/msvcrt.rst b/Doc/library/msvcrt.rst new file mode 100644 index 0000000..d43bb4c --- /dev/null +++ b/Doc/library/msvcrt.rst @@ -0,0 +1,126 @@ + +:mod:`msvcrt` -- Useful routines from the MS VC++ runtime +========================================================= + +.. module:: msvcrt + :platform: Windows + :synopsis: Miscellaneous useful routines from the MS VC++ runtime. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +These functions provide access to some useful capabilities on Windows platforms. 
+Some higher-level modules use these functions to build the Windows +implementations of their services. For example, the :mod:`getpass` module uses +this in the implementation of the :func:`getpass` function. + +Further documentation on these functions can be found in the Platform API +documentation. + + +.. _msvcrt-files: + +File Operations +--------------- + + +.. function:: locking(fd, mode, nbytes) + + Lock part of a file based on file descriptor *fd* from the C runtime. Raises + :exc:`IOError` on failure. The locked region of the file extends from the + current file position for *nbytes* bytes, and may continue beyond the end of the + file. *mode* must be one of the :const:`LK_\*` constants listed below. Multiple + regions in a file may be locked at the same time, but may not overlap. Adjacent + regions are not merged; they must be unlocked individually. + + +.. data:: LK_LOCK + LK_RLCK + + Locks the specified bytes. If the bytes cannot be locked, the program + immediately tries again after 1 second. If, after 10 attempts, the bytes cannot + be locked, :exc:`IOError` is raised. + + +.. data:: LK_NBLCK + LK_NBRLCK + + Locks the specified bytes. If the bytes cannot be locked, :exc:`IOError` is + raised. + + +.. data:: LK_UNLCK + + Unlocks the specified bytes, which must have been previously locked. + + +.. function:: setmode(fd, flags) + + Set the line-end translation mode for the file descriptor *fd*. To set it to + text mode, *flags* should be :const:`os.O_TEXT`; for binary, it should be + :const:`os.O_BINARY`. + + +.. function:: open_osfhandle(handle, flags) + + Create a C runtime file descriptor from the file handle *handle*. The *flags* + parameter should be a bit-wise OR of :const:`os.O_APPEND`, :const:`os.O_RDONLY`, + and :const:`os.O_TEXT`. The returned file descriptor may be used as a parameter + to :func:`os.fdopen` to create a file object. + + +.. function:: get_osfhandle(fd) + + Return the file handle for the file descriptor *fd*. 
Raises :exc:`IOError` if + *fd* is not recognized. + + +.. _msvcrt-console: + +Console I/O +----------- + + +.. function:: kbhit() + + Return true if a keypress is waiting to be read. + + +.. function:: getch() + + Read a keypress and return the resulting character. Nothing is echoed to the + console. This call will block if a keypress is not already available, but will + not wait for :kbd:`Enter` to be pressed. If the pressed key was a special + function key, this will return ``'\000'`` or ``'\xe0'``; the next call will + return the keycode. The :kbd:`Control-C` keypress cannot be read with this + function. + + +.. function:: getche() + + Similar to :func:`getch`, but the keypress will be echoed if it represents a + printable character. + + +.. function:: putch(char) + + Print the character *char* to the console without buffering. + + +.. function:: ungetch(char) + + Cause the character *char* to be "pushed back" into the console buffer; it will + be the next character read by :func:`getch` or :func:`getche`. + + +.. _msvcrt-other: + +Other Functions +--------------- + + +.. function:: heapmin() + + Force the :cfunc:`malloc` heap to clean itself up and return unused blocks to + the operating system. This only works on Windows NT. On failure, this raises + :exc:`IOError`. + diff --git a/Doc/library/multifile.rst b/Doc/library/multifile.rst new file mode 100644 index 0000000..c36ccb7 --- /dev/null +++ b/Doc/library/multifile.rst @@ -0,0 +1,190 @@ + +:mod:`multifile` --- Support for files containing distinct parts +================================================================ + +.. module:: multifile + :synopsis: Support for reading files which contain distinct parts, such as some MIME data. +.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com> + + +.. deprecated:: 2.5 + The :mod:`email` package should be used in preference to the :mod:`multifile` + module. This module is present only to maintain backward compatibility. 
+ +The :class:`MultiFile` object enables you to treat sections of a text file as +file-like input objects, with ``''`` being returned by :meth:`readline` when a +given delimiter pattern is encountered. The defaults of this class are designed +to make it useful for parsing MIME multipart messages, but by subclassing it and +overriding methods it can be easily adapted for more general use. + + +.. class:: MultiFile(fp[, seekable]) + + Create a multi-file. You must instantiate this class with an input object + argument for the :class:`MultiFile` instance to get lines from, such as a file + object returned by :func:`open`. + + :class:`MultiFile` only ever looks at the input object's :meth:`readline`, + :meth:`seek` and :meth:`tell` methods, and the latter two are only needed if you + want random access to the individual MIME parts. To use :class:`MultiFile` on a + non-seekable stream object, set the optional *seekable* argument to false; this + will prevent using the input object's :meth:`seek` and :meth:`tell` methods. + +It will be useful to know that in :class:`MultiFile`'s view of the world, text +is composed of three kinds of lines: data, section-dividers, and end-markers. +MultiFile is designed to support parsing of messages that may have multiple +nested message parts, each with its own pattern for section-divider and +end-marker lines. + + +.. seealso:: + + Module :mod:`email` + Comprehensive email handling package; supersedes the :mod:`multifile` module. + + +.. _multifile-objects: + +MultiFile Objects +----------------- + +A :class:`MultiFile` instance has the following methods: + + +.. method:: MultiFile.readline(str) + + Read a line. If the line is data (not a section-divider or end-marker or real + EOF) return it. If the line matches the most-recently-stacked boundary, return + ``''`` and set ``self.last`` to 1 or 0 according as the match is or is not an + end-marker. If the line matches any other stacked boundary, raise an error. 
On + encountering end-of-file on the underlying stream object, the method raises + :exc:`Error` unless all boundaries have been popped. + + +.. method:: MultiFile.readlines(str) + + Return all lines remaining in this part as a list of strings. + + +.. method:: MultiFile.read() + + Read all lines, up to the next section. Return them as a single (multiline) + string. Note that this doesn't take a size argument! + + +.. method:: MultiFile.seek(pos[, whence]) + + Seek. Seek indices are relative to the start of the current section. The *pos* + and *whence* arguments are interpreted as for a file seek. + + +.. method:: MultiFile.tell() + + Return the file position relative to the start of the current section. + + +.. method:: MultiFile.next() + + Skip lines to the next section (that is, read lines until a section-divider or + end-marker has been consumed). Return true if there is such a section, false if + an end-marker is seen. Re-enable the most-recently-pushed boundary. + + +.. method:: MultiFile.is_data(str) + + Return true if *str* is data and false if it might be a section boundary. As + written, it tests for a prefix other than ``'-``\ ``-'`` at start of line (which + all MIME boundaries have) but it is declared so it can be overridden in derived + classes. + + Note that this test is intended as a fast guard for the real boundary + tests; if it always returns false it will merely slow processing, not cause it + to fail. + + +.. method:: MultiFile.push(str) + + Push a boundary string. When a decorated version of this boundary is found as + an input line, it will be interpreted as a section-divider or end-marker + (depending on the decoration, see :rfc:`2045`). All subsequent reads will + return the empty string to indicate end-of-file, until a call to :meth:`pop` + removes the boundary or a :meth:`next` call reenables it. + + It is possible to push more than one boundary.
Encountering the + most-recently-pushed boundary will return EOF; encountering any other + boundary will raise an error. + + +.. method:: MultiFile.pop() + + Pop a section boundary. This boundary will no longer be interpreted as EOF. + + +.. method:: MultiFile.section_divider(str) + + Turn a boundary into a section-divider line. By default, this method + prepends ``'--'`` (which MIME section boundaries have) but it is declared so + it can be overridden in derived classes. This method need not append LF or + CR-LF, as comparison with the result ignores trailing whitespace. + + +.. method:: MultiFile.end_marker(str) + + Turn a boundary string into an end-marker line. By default, this method + prepends ``'--'`` and appends ``'--'`` (like a MIME-multipart end-of-message + marker) but it is declared so it can be overridden in derived classes. This + method need not append LF or CR-LF, as comparison with the result ignores + trailing whitespace. + +Finally, :class:`MultiFile` instances have two public instance variables: + + +.. attribute:: MultiFile.level + + Nesting depth of the current part. + + +.. attribute:: MultiFile.last + + True if the last end-of-file was for an end-of-message marker. + + +.. _multifile-example: + +:class:`MultiFile` Example +-------------------------- + +.. 
sectionauthor:: Skip Montanaro <skip@mojam.com> + + +:: + + import mimetools + import multifile + import StringIO + + def extract_mime_part_matching(stream, mimetype): + """Return the first element in a multipart MIME message on stream + matching mimetype.""" + + msg = mimetools.Message(stream) + msgtype = msg.gettype() + params = msg.getplist() + + data = StringIO.StringIO() + if msgtype[:10] == "multipart/": + + file = multifile.MultiFile(stream) + file.push(msg.getparam("boundary")) + while file.next(): + submsg = mimetools.Message(file) + try: + data = StringIO.StringIO() + mimetools.decode(file, data, submsg.getencoding()) + except ValueError: + continue + if submsg.gettype() == mimetype: + break + file.pop() + return data.getvalue() + diff --git a/Doc/library/mutex.rst b/Doc/library/mutex.rst new file mode 100644 index 0000000..523692f --- /dev/null +++ b/Doc/library/mutex.rst @@ -0,0 +1,62 @@ + +:mod:`mutex` --- Mutual exclusion support +========================================= + +.. module:: mutex + :synopsis: Lock and queue for mutual exclusion. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`mutex` module defines a class that allows mutual-exclusion via +acquiring and releasing locks. It does not require (or imply) threading or +multi-tasking, though it could be useful for those purposes. + +The :mod:`mutex` module defines the following class: + + +.. class:: mutex() + + Create a new (unlocked) mutex. + + A mutex has two pieces of state --- a "locked" bit and a queue. When the mutex + is not locked, the queue is empty. Otherwise, the queue contains zero or more + ``(function, argument)`` pairs representing functions (or methods) waiting to + acquire the lock. When the mutex is unlocked while the queue is not empty, the + first queue entry is removed and its ``function(argument)`` pair called, + implying it now has the lock. 
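The locked-bit-plus-queue behaviour described above can be sketched in a few lines of plain Python. This is an illustrative re-implementation under stated assumptions, not the module's actual source: the class name ``SketchMutex`` and the attribute names ``locked`` and ``queue`` are hypothetical, and only the method names (``test``, ``testandset``, ``lock``, ``unlock``) follow this documentation.

```python
from collections import deque


class SketchMutex:
    """Hypothetical stand-in that mimics the documented mutex behaviour."""

    def __init__(self):
        self.locked = False   # the "locked" bit
        self.queue = deque()  # pending (function, argument) pairs

    def test(self):
        """Report whether the mutex is currently locked."""
        return self.locked

    def testandset(self):
        """Grab the lock if it is free; return whether we got it."""
        if self.locked:
            return False
        self.locked = True
        return True

    def lock(self, function, argument):
        """Call function(argument) now if the lock is free, else queue it."""
        if self.testandset():
            function(argument)
        else:
            self.queue.append((function, argument))

    def unlock(self):
        """Hand the lock to the next queued call, or release it."""
        if self.queue:
            function, argument = self.queue.popleft()
            function(argument)  # the callee now holds the lock
        else:
            self.locked = False


calls = []
m = SketchMutex()
m.lock(calls.append, "first")   # lock is free: runs at once, mutex stays locked
m.lock(calls.append, "second")  # mutex is locked: the call is queued
m.unlock()                      # runs the queued call; mutex remains locked
m.unlock()                      # queue is empty: the mutex is released
```

In this sketch no threads are involved: a queued call runs only when some holder calls ``unlock()``, which is why the callback-style ``lock`` interface takes a function and an argument.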
+ + Of course, no multi-threading is implied -- hence the funny interface for + :meth:`lock`, where a function is called once the lock is acquired. + + +.. _mutex-objects: + +Mutex Objects +------------- + +:class:`mutex` objects have the following methods: + + +.. method:: mutex.test() + + Check whether the mutex is locked. + + +.. method:: mutex.testandset() + + "Atomic" test-and-set: grab the lock if it is not set and return ``True``; + otherwise, return ``False``. + + +.. method:: mutex.lock(function, argument) + + Execute ``function(argument)``, unless the mutex is locked. If it is + locked, place the function and argument on the queue. See :meth:`unlock` for + an explanation of when ``function(argument)`` is executed in that case. + + +.. method:: mutex.unlock() + + Unlock the mutex if the queue is empty; otherwise execute the first element in the + queue. + diff --git a/Doc/library/netdata.rst b/Doc/library/netdata.rst new file mode 100644 index 0000000..add01d2 --- /dev/null +++ b/Doc/library/netdata.rst @@ -0,0 +1,26 @@ + +.. _netdata: + +********************** +Internet Data Handling +********************** + +This chapter describes modules which support handling data formats commonly used +on the Internet. + + +.. toctree:: + + email.rst + mailcap.rst + mailbox.rst + mhlib.rst + mimetools.rst + mimetypes.rst + multifile.rst + rfc822.rst + base64.rst + binhex.rst + binascii.rst + quopri.rst + uu.rst diff --git a/Doc/library/netrc.rst b/Doc/library/netrc.rst new file mode 100644 index 0000000..bf3d92e --- /dev/null +++ b/Doc/library/netrc.rst @@ -0,0 +1,78 @@ + +:mod:`netrc` --- netrc file processing +====================================== + +.. module:: netrc + :synopsis: Loading of .netrc files. +.. moduleauthor:: Eric S. Raymond <esr@snark.thyrsus.com> +.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com> + + +.. % Note the \protect needed for \file... ;-( + +.. 
versionadded:: 1.5.2 + +The :class:`netrc` class parses and encapsulates the netrc file format used by +the Unix :program:`ftp` program and other FTP clients. + + +.. class:: netrc([file]) + + A :class:`netrc` instance or subclass instance encapsulates data from a netrc + file. The initialization argument, if present, specifies the file to parse. If + no argument is given, the file :file:`.netrc` in the user's home directory will + be read. Parse errors will raise :exc:`NetrcParseError` with diagnostic + information including the file name, line number, and terminating token. + + +.. exception:: NetrcParseError + + Exception raised by the :class:`netrc` class when syntactical errors are + encountered in source text. Instances of this exception provide three + interesting attributes: :attr:`msg` is a textual explanation of the error, + :attr:`filename` is the name of the source file, and :attr:`lineno` gives the + line number on which the error was found. + + +.. _netrc-objects: + +netrc Objects +------------- + +A :class:`netrc` instance has the following methods: + + +.. method:: netrc.authenticators(host) + + Return a 3-tuple ``(login, account, password)`` of authenticators for *host*. + If the netrc file did not contain an entry for the given host, return the tuple + associated with the 'default' entry. If neither matching host nor default entry + is available, return ``None``. + + +.. method:: netrc.__repr__() + + Dump the class data as a string in the format of a netrc file. (This discards + comments and may reorder the entries.) + +Instances of :class:`netrc` have public instance variables: + + +.. attribute:: netrc.hosts + + Dictionary mapping host names to ``(login, account, password)`` tuples. The + 'default' entry, if any, is represented as a pseudo-host by that name. + + +.. attribute:: netrc.macros + + Dictionary mapping macro names to string lists. + +.. note:: + + Passwords are limited to a subset of the ASCII character set. 
Versions of + this module prior to 2.3 were extremely limited. Starting with 2.3, all + ASCII punctuation is allowed in passwords. However, note that whitespace and + non-printable characters are not allowed in passwords. This is a limitation + of the way the .netrc file is parsed and may be removed in the future. + diff --git a/Doc/library/new.rst b/Doc/library/new.rst new file mode 100644 index 0000000..852fb58 --- /dev/null +++ b/Doc/library/new.rst @@ -0,0 +1,53 @@ + +:mod:`new` --- Creation of runtime internal objects +=================================================== + +.. module:: new + :synopsis: Interface to the creation of runtime implementation objects. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`new` module allows an interface to the interpreter object creation +functions. This is for use primarily in marshal-type functions, when a new +object needs to be created "magically" and not by using the regular creation +functions. This module provides a low-level interface to the interpreter, so +care must be exercised when using this module. It is possible to supply +non-sensical arguments which crash the interpreter when the object is used. + +The :mod:`new` module defines the following functions: + + +.. function:: instancemethod(function, instance, class) + + This function will return a method object, bound to *instance*, or unbound if + *instance* is ``None``. *function* must be callable. + + +.. function:: function(code, globals[, name[, argdefs[, closure]]]) + + Returns a (Python) function with the given code and globals. If *name* is given, + it must be a string or ``None``. If it is a string, the function will have the + given name, otherwise the function name will be taken from ``code.co_name``. If + *argdefs* is given, it must be a tuple and will be used to determine the default + values of parameters. 
If *closure* is given, it must be ``None`` or a tuple of + cell objects containing objects to bind to the names in ``code.co_freevars``. + + +.. function:: code(argcount, nlocals, stacksize, flags, codestring, constants, names, varnames, filename, name, firstlineno, lnotab) + + This function is an interface to the :cfunc:`PyCode_New` C function. + + .. % XXX This is still undocumented!!!!!!!!!!! + + +.. function:: module(name[, doc]) + + This function returns a new module object with name *name*. *name* must be a + string. The optional *doc* argument can have any type. + + +.. function:: classobj(name, baseclasses, dict) + + This function returns a new class object, with name *name*, derived from + *baseclasses* (which should be a tuple of classes) and with namespace *dict*. + diff --git a/Doc/library/nis.rst b/Doc/library/nis.rst new file mode 100644 index 0000000..77684bf --- /dev/null +++ b/Doc/library/nis.rst @@ -0,0 +1,68 @@ + +:mod:`nis` --- Interface to Sun's NIS (Yellow Pages) +==================================================== + +.. module:: nis + :platform: Unix + :synopsis: Interface to Sun's NIS (Yellow Pages) library. +.. moduleauthor:: Fred Gansevles <Fred.Gansevles@cs.utwente.nl> +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`nis` module gives a thin wrapper around the NIS library, useful for +central administration of several hosts. + +Because NIS exists only on Unix systems, this module is only available for Unix. + +The :mod:`nis` module defines the following functions: + + +.. function:: match(key, mapname[, domain=default_domain]) + + Return the match for *key* in map *mapname*, or raise an error + (:exc:`nis.error`) if there is none. Both should be strings, *key* is 8-bit + clean. Return value is an arbitrary array of bytes (may contain ``NULL`` and + other joys). + + Note that *mapname* is first checked if it is an alias to another name. + + .. 
versionchanged:: 2.5 + The *domain* argument allows overriding the NIS domain used for the lookup. If + unspecified, lookup is in the default NIS domain. + + +.. function:: cat(mapname[, domain=default_domain]) + + Return a dictionary mapping *key* to *value* such that ``match(key, + mapname)==value``. Note that both keys and values of the dictionary are + arbitrary arrays of bytes. + + Note that *mapname* is first checked to see whether it is an alias for another name. + + .. versionchanged:: 2.5 + The *domain* argument allows overriding the NIS domain used for the lookup. If + unspecified, lookup is in the default NIS domain. + + +.. function:: maps([domain=default_domain]) + + Return a list of all valid maps. + + .. versionchanged:: 2.5 + The *domain* argument allows overriding the NIS domain used for the lookup. If + unspecified, lookup is in the default NIS domain. + + +.. function:: get_default_domain() + + Return the system default NIS domain. + + .. versionadded:: 2.5 + +The :mod:`nis` module defines the following exception: + + +.. exception:: error + + An error raised when a NIS function returns an error code. + diff --git a/Doc/library/nntplib.rst b/Doc/library/nntplib.rst new file mode 100644 index 0000000..5bc947e --- /dev/null +++ b/Doc/library/nntplib.rst @@ -0,0 +1,350 @@ + +:mod:`nntplib` --- NNTP protocol client +======================================= + +.. module:: nntplib + :synopsis: NNTP protocol client (requires sockets). + + +.. index:: + pair: NNTP; protocol + single: Network News Transfer Protocol + +This module defines the class :class:`NNTP` which implements the client side of +the NNTP protocol. It can be used to implement a news reader or poster, or +automated news processors. For more information on NNTP (Network News Transfer +Protocol), see Internet :rfc:`977`. + +Here are two small examples of how it can be used. 
To list some statistics +about a newsgroup and print the subjects of the last 10 articles:: + + >>> s = NNTP('news.cwi.nl') + >>> resp, count, first, last, name = s.group('comp.lang.python') + >>> print 'Group', name, 'has', count, 'articles, range', first, 'to', last + Group comp.lang.python has 59 articles, range 3742 to 3803 + >>> resp, subs = s.xhdr('subject', first + '-' + last) + >>> for id, sub in subs[-10:]: print id, sub + ... + 3792 Re: Removing elements from a list while iterating... + 3793 Re: Who likes Info files? + 3794 Emacs and doc strings + 3795 a few questions about the Mac implementation + 3796 Re: executable python scripts + 3797 Re: executable python scripts + 3798 Re: a few questions about the Mac implementation + 3799 Re: PROPOSAL: A Generic Python Object Interface for Python C Modules + 3802 Re: executable python scripts + 3803 Re: \POSIX{} wait and SIGCHLD + >>> s.quit() + '205 news.cwi.nl closing connection. Goodbye.' + +To post an article from a file (this assumes that the article has valid +headers):: + + >>> s = NNTP('news.cwi.nl') + >>> f = open('/tmp/article') + >>> s.post(f) + '240 Article posted successfully.' + >>> s.quit() + '205 news.cwi.nl closing connection. Goodbye.' + +The module itself defines the following items: + + +.. class:: NNTP(host[, port [, user[, password [, readermode] [, usenetrc]]]]) + + Return a new instance of the :class:`NNTP` class, representing a connection + to the NNTP server running on host *host*, listening at port *port*. The + default *port* is 119. If the optional *user* and *password* are provided, + or if suitable credentials are present in :file:`/.netrc` and the optional + flag *usenetrc* is true (the default), the ``AUTHINFO USER`` and ``AUTHINFO + PASS`` commands are used to identify and authenticate the user to the server. + If the optional flag *readermode* is true, then a ``mode reader`` command is + sent before authentication is performed. 
Reader mode is sometimes necessary + if you are connecting to an NNTP server on the local machine and intend to + call reader-specific commands, such as ``group``. If you get unexpected + :exc:`NNTPPermanentError`\ s, you might need to set *readermode*. + *readermode* defaults to ``None``. *usenetrc* defaults to ``True``. + + .. versionchanged:: 2.4 + *usenetrc* argument added. + + +.. exception:: NNTPError + + Derived from the standard exception :exc:`Exception`, this is the base class for + all exceptions raised by the :mod:`nntplib` module. + + +.. exception:: NNTPReplyError + + Exception raised when an unexpected reply is received from the server. For + backwards compatibility, the exception ``error_reply`` is equivalent to this + class. + + +.. exception:: NNTPTemporaryError + + Exception raised when an error code in the range 400--499 is received. For + backwards compatibility, the exception ``error_temp`` is equivalent to this + class. + + +.. exception:: NNTPPermanentError + + Exception raised when an error code in the range 500--599 is received. For + backwards compatibility, the exception ``error_perm`` is equivalent to this + class. + + +.. exception:: NNTPProtocolError + + Exception raised when a reply is received from the server that does not begin + with a digit in the range 1--5. For backwards compatibility, the exception + ``error_proto`` is equivalent to this class. + + +.. exception:: NNTPDataError + + Exception raised when there is some error in the response data. For backwards + compatibility, the exception ``error_data`` is equivalent to this class. + + +.. _nntp-objects: + +NNTP Objects +------------ + +NNTP instances have the following methods. The *response* that is returned as +the first item in the return tuple of almost all methods is the server's +response: a string beginning with a three-digit code. If the server's response +indicates an error, the method raises one of the above exceptions. + + +.. 
method:: NNTP.getwelcome() + + Return the welcome message sent by the server in reply to the initial + connection. (This message sometimes contains disclaimers or help information + that may be relevant to the user.) + + +.. method:: NNTP.set_debuglevel(level) + + Set the instance's debugging level. This controls the amount of debugging + output printed. The default, ``0``, produces no debugging output. A value of + ``1`` produces a moderate amount of debugging output, generally a single line + per request or response. A value of ``2`` or higher produces the maximum amount + of debugging output, logging each line sent and received on the connection + (including message text). + + +.. method:: NNTP.newgroups(date, time, [file]) + + Send a ``NEWGROUPS`` command. The *date* argument should be a string of the + form ``'yymmdd'`` indicating the date, and *time* should be a string of the form + ``'hhmmss'`` indicating the time. Return a pair ``(response, groups)`` where + *groups* is a list of group names that are new since the given date and time. If + the *file* parameter is supplied, then the output of the ``NEWGROUPS`` command + is stored in a file. If *file* is a string, then the method will open a file + object with that name, write to it then close it. If *file* is a file object, + then it will start calling :meth:`write` on it to store the lines of the command + output. If *file* is supplied, then the returned *list* is an empty list. + + +.. method:: NNTP.newnews(group, date, time, [file]) + + Send a ``NEWNEWS`` command. Here, *group* is a group name or ``'*'``, and + *date* and *time* have the same meaning as for :meth:`newgroups`. Return a pair + ``(response, articles)`` where *articles* is a list of message ids. If the + *file* parameter is supplied, then the output of the ``NEWNEWS`` command is + stored in a file. If *file* is a string, then the method will open a file + object with that name, write to it then close it. 
If *file* is a file object, + then it will start calling :meth:`write` on it to store the lines of the command + output. If *file* is supplied, then the returned *list* is an empty list. + + +.. method:: NNTP.list([file]) + + Send a ``LIST`` command. Return a pair ``(response, list)`` where *list* is a + list of tuples. Each tuple has the form ``(group, last, first, flag)``, where + *group* is a group name, *last* and *first* are the last and first article + numbers (as strings), and *flag* is ``'y'`` if posting is allowed, ``'n'`` if + not, and ``'m'`` if the newsgroup is moderated. (Note the ordering: *last*, + *first*.) If the *file* parameter is supplied, then the output of the ``LIST`` + command is stored in a file. If *file* is a string, then the method will open + a file object with that name, write to it then close it. If *file* is a file + object, then it will start calling :meth:`write` on it to store the lines of the + command output. If *file* is supplied, then the returned *list* is an empty + list. + + +.. method:: NNTP.descriptions(grouppattern) + + Send a ``LIST NEWSGROUPS`` command, where *grouppattern* is a wildmat string as + specified in RFC2980 (it's essentially the same as DOS or UNIX shell wildcard + strings). Return a pair ``(response, list)``, where *list* is a list of tuples + containing ``(name, title)``. + + .. versionadded:: 2.4 + + +.. method:: NNTP.description(group) + + Get a description for a single group *group*. If more than one group matches + (if 'group' is a real wildmat string), return the first match. If no group + matches, return an empty string. + + This elides the response code from the server. If the response code is needed, + use :meth:`descriptions`. + + .. versionadded:: 2.4 + + +.. method:: NNTP.group(name) + + Send a ``GROUP`` command, where *name* is the group name. 
Return a tuple + ``(response, count, first, last, name)`` where *count* is the (estimated) number + of articles in the group, *first* is the first article number in the group, + *last* is the last article number in the group, and *name* is the group name. + The numbers are returned as strings. + + +.. method:: NNTP.help([file]) + + Send a ``HELP`` command. Return a pair ``(response, list)`` where *list* is a + list of help strings. If the *file* parameter is supplied, then the output of + the ``HELP`` command is stored in a file. If *file* is a string, then the + method will open a file object with that name, write to it then close it. If + *file* is a file object, then it will start calling :meth:`write` on it to store + the lines of the command output. If *file* is supplied, then the returned *list* + is an empty list. + + +.. method:: NNTP.stat(id) + + Send a ``STAT`` command, where *id* is the message id (enclosed in ``'<'`` and + ``'>'``) or an article number (as a string). Return a triple ``(response, + number, id)`` where *number* is the article number (as a string) and *id* is the + message id (enclosed in ``'<'`` and ``'>'``). + + +.. method:: NNTP.next() + + Send a ``NEXT`` command. Return as for :meth:`stat`. + + +.. method:: NNTP.last() + + Send a ``LAST`` command. Return as for :meth:`stat`. + + +.. method:: NNTP.head(id) + + Send a ``HEAD`` command, where *id* has the same meaning as for :meth:`stat`. + Return a tuple ``(response, number, id, list)`` where the first three are the + same as for :meth:`stat`, and *list* is a list of the article's headers (an + uninterpreted list of lines, without trailing newlines). + + +.. method:: NNTP.body(id,[file]) + + Send a ``BODY`` command, where *id* has the same meaning as for :meth:`stat`. + If the *file* parameter is supplied, then the body is stored in a file. If + *file* is a string, then the method will open a file object with that name, + write to it then close it. 
If *file* is a file object, then it will start + calling :meth:`write` on it to store the lines of the body. Return as for + :meth:`head`. If *file* is supplied, then the returned *list* is an empty list. + + +.. method:: NNTP.article(id) + + Send an ``ARTICLE`` command, where *id* has the same meaning as for + :meth:`stat`. Return as for :meth:`head`. + + +.. method:: NNTP.slave() + + Send a ``SLAVE`` command. Return the server's *response*. + + +.. method:: NNTP.xhdr(header, string, [file]) + + Send an ``XHDR`` command. This command is not defined in the RFC but is a + common extension. The *header* argument is a header keyword, e.g. + ``'subject'``. The *string* argument should have the form ``'first-last'`` + where *first* and *last* are the first and last article numbers to search. + Return a pair ``(response, list)``, where *list* is a list of pairs ``(id, + text)``, where *id* is an article number (as a string) and *text* is the text of + the requested header for that article. If the *file* parameter is supplied, then + the output of the ``XHDR`` command is stored in a file. If *file* is a string, + then the method will open a file object with that name, write to it then close + it. If *file* is a file object, then it will start calling :meth:`write` on it + to store the lines of the command output. If *file* is supplied, then the + returned *list* is an empty list. + + +.. method:: NNTP.post(file) + + Post an article using the ``POST`` command. The *file* argument is an open file + object which is read until EOF using its :meth:`readline` method. It should be + a well-formed news article, including the required headers. The :meth:`post` + method automatically escapes lines beginning with ``.``. + + +.. method:: NNTP.ihave(id, file) + + Send an ``IHAVE`` command. *id* is a message id (enclosed in ``'<'`` and + ``'>'``). If the response is not an error, treat *file* exactly as for the + :meth:`post` method. + + +.. 
method:: NNTP.date() + + Return a triple ``(response, date, time)``, containing the current date and time + in a form suitable for the :meth:`newnews` and :meth:`newgroups` methods. This + is an optional NNTP extension, and may not be supported by all servers. + + +.. method:: NNTP.xgtitle(name, [file]) + + Process an ``XGTITLE`` command, returning a pair ``(response, list)``, where + *list* is a list of tuples containing ``(name, title)``. If the *file* parameter + is supplied, then the output of the ``XGTITLE`` command is stored in a file. + If *file* is a string, then the method will open a file object with that name, + write to it then close it. If *file* is a file object, then it will start + calling :meth:`write` on it to store the lines of the command output. If *file* + is supplied, then the returned *list* is an empty list. This is an optional NNTP + extension, and may not be supported by all servers. + + .. % XXX huh? Should that be name, description? + + RFC2980 says "It is suggested that this extension be deprecated". Use + :meth:`descriptions` or :meth:`description` instead. + + +.. method:: NNTP.xover(start, end, [file]) + + Return a pair ``(resp, list)``. *list* is a list of tuples, one for each + article in the range delimited by the *start* and *end* article numbers. Each + tuple is of the form ``(article number, subject, poster, date, id, references, + size, lines)``. If the *file* parameter is supplied, then the output of the + ``XOVER`` command is stored in a file. If *file* is a string, then the method + will open a file object with that name, write to it then close it. If *file* + is a file object, then it will start calling :meth:`write` on it to store the + lines of the command output. If *file* is supplied, then the returned *list* is + an empty list. This is an optional NNTP extension, and may not be supported by + all servers. + + +.. 
method:: NNTP.xpath(id) + + Return a pair ``(resp, path)``, where *path* is the directory path to the + article with message ID *id*. This is an optional NNTP extension, and may not + be supported by all servers. + + +.. method:: NNTP.quit() + + Send a ``QUIT`` command and close the connection. Once this method has been + called, no other methods of the NNTP object should be called. + diff --git a/Doc/library/numeric.rst b/Doc/library/numeric.rst new file mode 100644 index 0000000..0d9d59f --- /dev/null +++ b/Doc/library/numeric.rst @@ -0,0 +1,25 @@ + +.. _numeric: + +******************************** +Numeric and Mathematical Modules +******************************** + +The modules described in this chapter provide numeric and math-related functions +and data types. The :mod:`math` and :mod:`cmath` contain various mathematical +functions for floating-point and complex numbers. For users more interested in +decimal accuracy than in speed, the :mod:`decimal` module supports exact +representations of decimal numbers. + +The following modules are documented in this chapter: + + +.. toctree:: + + math.rst + cmath.rst + decimal.rst + random.rst + itertools.rst + functools.rst + operator.rst diff --git a/Doc/library/objects.rst b/Doc/library/objects.rst new file mode 100644 index 0000000..c6cc9e4 --- /dev/null +++ b/Doc/library/objects.rst @@ -0,0 +1,32 @@ + +.. _builtin: + +**************** +Built-in Objects +**************** + +.. index:: + pair: built-in; types + pair: built-in; exceptions + pair: built-in; functions + pair: built-in; constants + single: symbol table + +Names for built-in exceptions and functions and a number of constants are found +in a separate symbol table. This table is searched last when the interpreter +looks up the meaning of a name, so local and global user-defined names can +override built-in names. Built-in types are described together here for easy +reference. 
[#]_ + +The tables in this chapter document the priorities of operators by listing them +in order of ascending priority (within a table) and grouping operators that have +the same priority in the same box. Binary operators of the same priority group +from left to right. (Unary operators group from right to left, but there you +have no real choice.) See :ref:`operator-summary` for the complete picture on +operator priorities. + +.. rubric:: Footnotes + +.. [#] Most descriptions sorely lack explanations of the exceptions that may be raised + --- this will be fixed in a future version of this manual. + diff --git a/Doc/library/operator.rst b/Doc/library/operator.rst new file mode 100644 index 0000000..4e85569 --- /dev/null +++ b/Doc/library/operator.rst @@ -0,0 +1,612 @@ +:mod:`operator` --- Standard operators as functions +=================================================== + +.. module:: operator + :synopsis: Functions corresponding to the standard operators. +.. sectionauthor:: Skip Montanaro <skip@automatrix.com> + + + +The :mod:`operator` module exports a set of functions implemented in C +corresponding to the intrinsic operators of Python. For example, +``operator.add(x, y)`` is equivalent to the expression ``x+y``. The function +names are those used for special class methods; variants without leading and +trailing ``__`` are also provided for convenience. + +The functions fall into categories that perform object comparisons, logical +operations, mathematical operations, sequence operations, and abstract type +tests. + +The object comparison functions are useful for all objects, and are named after +the rich comparison operators they support: + + +.. function:: lt(a, b) + le(a, b) + eq(a, b) + ne(a, b) + ge(a, b) + gt(a, b) + __lt__(a, b) + __le__(a, b) + __eq__(a, b) + __ne__(a, b) + __ge__(a, b) + __gt__(a, b) + + Perform "rich comparisons" between *a* and *b*. 
Specifically, ``lt(a, b)`` is + equivalent to ``a < b``, ``le(a, b)`` is equivalent to ``a <= b``, ``eq(a, + b)`` is equivalent to ``a == b``, ``ne(a, b)`` is equivalent to ``a != b``, + ``gt(a, b)`` is equivalent to ``a > b`` and ``ge(a, b)`` is equivalent to ``a + >= b``. Note that unlike the built-in :func:`cmp`, these functions can + return any value, which may or may not be interpretable as a Boolean value. + See :ref:`comparisons` for more information about rich comparisons. + + .. versionadded:: 2.2 + +The logical operations are also generally applicable to all objects, and support +truth tests, identity tests, and boolean operations: + + +.. function:: not_(o) + __not__(o) + + Return the outcome of :keyword:`not` *o*. (Note that there is no + :meth:`__not__` method for object instances; only the interpreter core defines + this operation. The result is affected by the :meth:`__bool__` and + :meth:`__len__` methods.) + + +.. function:: truth(o) + + Return :const:`True` if *o* is true, and :const:`False` otherwise. This is + equivalent to using the :class:`bool` constructor. + + +.. function:: is_(a, b) + + Return ``a is b``. Tests object identity. + + .. versionadded:: 2.3 + + +.. function:: is_not(a, b) + + Return ``a is not b``. Tests object identity. + + .. versionadded:: 2.3 + +The mathematical and bitwise operations are the most numerous: + + +.. function:: abs(o) + __abs__(o) + + Return the absolute value of *o*. + + +.. function:: add(a, b) + __add__(a, b) + + Return ``a + b``, for *a* and *b* numbers. + + +.. function:: and_(a, b) + __and__(a, b) + + Return the bitwise and of *a* and *b*. + + +.. function:: div(a, b) + __div__(a, b) + + Return ``a / b`` when ``__future__.division`` is not in effect. This is + also known as "classic" division. + + +.. function:: floordiv(a, b) + __floordiv__(a, b) + + Return ``a // b``. + + .. versionadded:: 2.2 + + +.. 
function:: inv(o) + invert(o) + __inv__(o) + __invert__(o) + + Return the bitwise inverse of the number *o*. This is equivalent to ``~o``. + + .. versionadded:: 2.0 + The names :func:`invert` and :func:`__invert__`. + + +.. function:: lshift(a, b) + __lshift__(a, b) + + Return *a* shifted left by *b*. + + +.. function:: mod(a, b) + __mod__(a, b) + + Return ``a % b``. + + +.. function:: mul(a, b) + __mul__(a, b) + + Return ``a * b``, for *a* and *b* numbers. + + +.. function:: neg(o) + __neg__(o) + + Return *o* negated. + + +.. function:: or_(a, b) + __or__(a, b) + + Return the bitwise or of *a* and *b*. + + +.. function:: pos(o) + __pos__(o) + + Return *o* positive. + + +.. function:: pow(a, b) + __pow__(a, b) + + Return ``a ** b``, for *a* and *b* numbers. + + .. versionadded:: 2.3 + + +.. function:: rshift(a, b) + __rshift__(a, b) + + Return *a* shifted right by *b*. + + +.. function:: sub(a, b) + __sub__(a, b) + + Return ``a - b``. + + +.. function:: truediv(a, b) + __truediv__(a, b) + + Return ``a / b`` when ``__future__.division`` is in effect. This is also + known as "true" division. + + .. versionadded:: 2.2 + + +.. function:: xor(a, b) + __xor__(a, b) + + Return the bitwise exclusive or of *a* and *b*. + + +.. function:: index(a) + __index__(a) + + Return *a* converted to an integer. Equivalent to ``a.__index__()``. + + .. versionadded:: 2.5 + + +Operations which work with sequences include: + +.. function:: concat(a, b) + __concat__(a, b) + + Return ``a + b`` for *a* and *b* sequences. + + +.. function:: contains(a, b) + __contains__(a, b) + + Return the outcome of the test ``b in a``. Note the reversed operands. + + .. versionadded:: 2.0 + The name :func:`__contains__`. + + +.. function:: countOf(a, b) + + Return the number of occurrences of *b* in *a*. + + +.. function:: delitem(a, b) + __delitem__(a, b) + + Remove the value of *a* at index *b*. + + +.. 
function:: delslice(a, b, c) + __delslice__(a, b, c) + + Delete the slice of *a* from index *b* to index *c-1*. + + +.. function:: getitem(a, b) + __getitem__(a, b) + + Return the value of *a* at index *b*. + + +.. function:: getslice(a, b, c) + __getslice__(a, b, c) + + Return the slice of *a* from index *b* to index *c-1*. + + +.. function:: indexOf(a, b) + + Return the index of the first occurrence of *b* in *a*. + + +.. function:: repeat(a, b) + __repeat__(a, b) + + Return ``a * b`` where *a* is a sequence and *b* is an integer. + + +.. function:: sequenceIncludes(...) + + .. deprecated:: 2.0 + Use :func:`contains` instead. + + Alias for :func:`contains`. + + +.. function:: setitem(a, b, c) + __setitem__(a, b, c) + + Set the value of *a* at index *b* to *c*. + + +.. function:: setslice(a, b, c, v) + __setslice__(a, b, c, v) + + Set the slice of *a* from index *b* to index *c-1* to the sequence *v*. + +Many operations have an "in-place" version. The following functions provide +more primitive access to in-place operators than the usual syntax does; for +example, the statement ``x += y`` is equivalent to ``x = operator.iadd(x, y)``. +Another way to put it is to say that ``z = operator.iadd(x, y)`` is equivalent +to the compound statement ``z = x; z += y``. + + +.. function:: iadd(a, b) + __iadd__(a, b) + + ``a = iadd(a, b)`` is equivalent to ``a += b``. + + .. versionadded:: 2.5 + + +.. function:: iand(a, b) + __iand__(a, b) + + ``a = iand(a, b)`` is equivalent to ``a &= b``. + + .. versionadded:: 2.5 + + +.. function:: iconcat(a, b) + __iconcat__(a, b) + + ``a = iconcat(a, b)`` is equivalent to ``a += b`` for *a* and *b* sequences. + + .. versionadded:: 2.5 + + +.. function:: idiv(a, b) + __idiv__(a, b) + + ``a = idiv(a, b)`` is equivalent to ``a /= b`` when ``__future__.division`` is + not in effect. + + .. versionadded:: 2.5 + + +.. function:: ifloordiv(a, b) + __ifloordiv__(a, b) + + ``a = ifloordiv(a, b)`` is equivalent to ``a //= b``. + + .. 
versionadded:: 2.5 + + +.. function:: ilshift(a, b) + __ilshift__(a, b) + + ``a = ilshift(a, b)`` is equivalent to ``a <``\ ``<= b``. + + .. versionadded:: 2.5 + + +.. function:: imod(a, b) + __imod__(a, b) + + ``a = imod(a, b)`` is equivalent to ``a %= b``. + + .. versionadded:: 2.5 + + +.. function:: imul(a, b) + __imul__(a, b) + + ``a = imul(a, b)`` is equivalent to ``a *= b``. + + .. versionadded:: 2.5 + + +.. function:: ior(a, b) + __ior__(a, b) + + ``a = ior(a, b)`` is equivalent to ``a |= b``. + + .. versionadded:: 2.5 + + +.. function:: ipow(a, b) + __ipow__(a, b) + + ``a = ipow(a, b)`` is equivalent to ``a **= b``. + + .. versionadded:: 2.5 + + +.. function:: irepeat(a, b) + __irepeat__(a, b) + + ``a = irepeat(a, b)`` is equivalent to ``a *= b`` where *a* is a sequence and + *b* is an integer. + + .. versionadded:: 2.5 + + +.. function:: irshift(a, b) + __irshift__(a, b) + + ``a = irshift(a, b)`` is equivalent to ``a >>= b``. + + .. versionadded:: 2.5 + + +.. function:: isub(a, b) + __isub__(a, b) + + ``a = isub(a, b)`` is equivalent to ``a -= b``. + + .. versionadded:: 2.5 + + +.. function:: itruediv(a, b) + __itruediv__(a, b) + + ``a = itruediv(a, b)`` is equivalent to ``a /= b`` when ``__future__.division`` + is in effect. + + .. versionadded:: 2.5 + + +.. function:: ixor(a, b) + __ixor__(a, b) + + ``a = ixor(a, b)`` is equivalent to ``a ^= b``. + + .. versionadded:: 2.5 + + +The :mod:`operator` module also defines a few predicates to test the type of +objects. + +.. note:: + + Be careful not to misinterpret the results of these functions; only + :func:`isCallable` has any measure of reliability with instance objects. + For example:: + + >>> class C: + ... pass + ... + >>> import operator + >>> o = C() + >>> operator.isMappingType(o) + True + + +.. function:: isCallable(o) + + .. deprecated:: 2.0 + Use the :func:`callable` built-in function instead. + + Returns true if the object *o* can be called like a function, otherwise it + returns false. 
True is returned for functions, bound and unbound methods, class + objects, and instance objects which support the :meth:`__call__` method. + + +.. function:: isMappingType(o) + + Returns true if the object *o* supports the mapping interface. This is true for + dictionaries and all instance objects defining :meth:`__getitem__`. + + .. warning:: + + There is no reliable way to test if an instance supports the complete mapping + protocol since the interface itself is ill-defined. This makes this test less + useful than it otherwise might be. + + +.. function:: isNumberType(o) + + Returns true if the object *o* represents a number. This is true for all + numeric types implemented in C. + + .. warning:: + + There is no reliable way to test if an instance supports the complete numeric + interface since the interface itself is ill-defined. This makes this test less + useful than it otherwise might be. + + +.. function:: isSequenceType(o) + + Returns true if the object *o* supports the sequence protocol. This returns true + for all objects which define sequence methods in C, and for all instance objects + defining :meth:`__getitem__`. + + .. warning:: + + There is no reliable way to test if an instance supports the complete sequence + interface since the interface itself is ill-defined. This makes this test less + useful than it otherwise might be. + +Example: Build a dictionary that maps the ordinals from ``0`` to ``255`` to +their character equivalents. :: + + >>> import operator + >>> d = {} + >>> keys = range(256) + >>> vals = map(chr, keys) + >>> map(operator.setitem, [d]*len(keys), keys, vals) + +.. XXX: find a better, readable, example + +The :mod:`operator` module also defines tools for generalized attribute and item +lookups. These are useful for making fast field extractors as arguments for +:func:`map`, :func:`sorted`, :meth:`itertools.groupby`, or other functions that +expect a function argument. + + +.. 
function:: attrgetter(attr[, args...]) + + Return a callable object that fetches *attr* from its operand. If more than one + attribute is requested, returns a tuple of attributes. After + ``f = attrgetter('name')``, the call ``f(b)`` returns ``b.name``. After + ``f = attrgetter('name', 'date')``, the call ``f(b)`` returns ``(b.name, + b.date)``. + + .. versionadded:: 2.4 + + .. versionchanged:: 2.5 + Added support for multiple attributes. + + +.. function:: itemgetter(item[, args...]) + + Return a callable object that fetches *item* from its operand. If more than one + item is requested, returns a tuple of items. After ``f = itemgetter(2)``, the + call ``f(b)`` returns ``b[2]``. After ``f = itemgetter(2, 5, 3)``, the call + ``f(b)`` returns ``(b[2], b[5], b[3])``. + + .. versionadded:: 2.4 + + .. versionchanged:: 2.5 + Added support for multiple item extraction. + +Examples:: + + >>> from operator import itemgetter + >>> inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)] + >>> getcount = itemgetter(1) + >>> map(getcount, inventory) + [3, 2, 5, 1] + >>> sorted(inventory, key=getcount) + [('orange', 1), ('banana', 2), ('apple', 3), ('pear', 5)] + + +.. _operator-map: + +Mapping Operators to Functions +------------------------------ + +This table shows how abstract operations correspond to operator symbols in the +Python syntax and the functions in the :mod:`operator` module.
+ ++-----------------------+-------------------------+---------------------------------+ +| Operation | Syntax | Function | ++=======================+=========================+=================================+ +| Addition | ``a + b`` | ``add(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Concatenation | ``seq1 + seq2`` | ``concat(seq1, seq2)`` | ++-----------------------+-------------------------+---------------------------------+ +| Containment Test | ``o in seq`` | ``contains(seq, o)`` | ++-----------------------+-------------------------+---------------------------------+ +| Division | ``a / b`` | ``div(a, b)`` (without | +| | | ``__future__.division``) | ++-----------------------+-------------------------+---------------------------------+ +| Division | ``a / b`` | ``truediv(a, b)`` (with | +| | | ``__future__.division``) | ++-----------------------+-------------------------+---------------------------------+ +| Division | ``a // b`` | ``floordiv(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Bitwise And | ``a & b`` | ``and_(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Bitwise Exclusive Or | ``a ^ b`` | ``xor(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Bitwise Inversion | ``~ a`` | ``invert(a)`` | ++-----------------------+-------------------------+---------------------------------+ +| Bitwise Or | ``a | b`` | ``or_(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Exponentiation | ``a ** b`` | ``pow(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Identity | ``a is b`` | ``is_(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Identity | ``a is not b`` | ``is_not(a, b)`` | 
++-----------------------+-------------------------+---------------------------------+ +| Indexed Assignment | ``o[k] = v`` | ``setitem(o, k, v)`` | ++-----------------------+-------------------------+---------------------------------+ +| Indexed Deletion | ``del o[k]`` | ``delitem(o, k)`` | ++-----------------------+-------------------------+---------------------------------+ +| Indexing | ``o[k]`` | ``getitem(o, k)`` | ++-----------------------+-------------------------+---------------------------------+ +| Left Shift | ``a << b`` | ``lshift(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Modulo | ``a % b`` | ``mod(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Multiplication | ``a * b`` | ``mul(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Negation (Arithmetic) | ``- a`` | ``neg(a)`` | ++-----------------------+-------------------------+---------------------------------+ +| Negation (Logical) | ``not a`` | ``not_(a)`` | ++-----------------------+-------------------------+---------------------------------+ +| Right Shift | ``a >> b`` | ``rshift(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Sequence Repetition | ``seq * i`` | ``repeat(seq, i)`` | ++-----------------------+-------------------------+---------------------------------+ +| Slice Assignment | ``seq[i:j] = values`` | ``setslice(seq, i, j, values)`` | ++-----------------------+-------------------------+---------------------------------+ +| Slice Deletion | ``del seq[i:j]`` | ``delslice(seq, i, j)`` | ++-----------------------+-------------------------+---------------------------------+ +| Slicing | ``seq[i:j]`` | ``getslice(seq, i, j)`` | ++-----------------------+-------------------------+---------------------------------+ +| String Formatting | ``s % o`` | ``mod(s, o)`` |
++-----------------------+-------------------------+---------------------------------+ +| Subtraction | ``a - b`` | ``sub(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Truth Test | ``o`` | ``truth(o)`` | ++-----------------------+-------------------------+---------------------------------+ +| Ordering | ``a < b`` | ``lt(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Ordering | ``a <= b`` | ``le(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Equality | ``a == b`` | ``eq(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Difference | ``a != b`` | ``ne(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Ordering | ``a >= b`` | ``ge(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ +| Ordering | ``a > b`` | ``gt(a, b)`` | ++-----------------------+-------------------------+---------------------------------+ + diff --git a/Doc/library/optparse.rst b/Doc/library/optparse.rst new file mode 100644 index 0000000..cfcd8a6 --- /dev/null +++ b/Doc/library/optparse.rst @@ -0,0 +1,1827 @@ +.. % THIS FILE IS AUTO-GENERATED! DO NOT EDIT! +.. % (Your changes will be lost the next time it is generated.) + + +:mod:`optparse` --- More powerful command line option parser +============================================================ + +.. module:: optparse + :synopsis: More convenient, flexible, and powerful command-line parsing library. +.. moduleauthor:: Greg Ward <gward@python.net> + + +.. versionadded:: 2.3 + +.. sectionauthor:: Greg Ward <gward@python.net> + + +``optparse`` is a more convenient, flexible, and powerful library for parsing +command-line options than ``getopt``. 
``optparse`` uses a more declarative +style of command-line parsing: you create an instance of :class:`OptionParser`, +populate it with options, and parse the command line. ``optparse`` allows users +to specify options in the conventional GNU/POSIX syntax, and additionally +generates usage and help messages for you. + +.. % An intro blurb used only when generating LaTeX docs for the Python +.. % manual (based on README.txt). + +Here's an example of using ``optparse`` in a simple script:: + + from optparse import OptionParser + [...] + parser = OptionParser() + parser.add_option("-f", "--file", dest="filename", + help="write report to FILE", metavar="FILE") + parser.add_option("-q", "--quiet", + action="store_false", dest="verbose", default=True, + help="don't print status messages to stdout") + + (options, args) = parser.parse_args() + +With these few lines of code, users of your script can now do the "usual thing" +on the command-line, for example:: + + <yourscript> --file=outfile -q + +As it parses the command line, ``optparse`` sets attributes of the ``options`` +object returned by :meth:`parse_args` based on user-supplied command-line +values. When :meth:`parse_args` returns from parsing this command line, +``options.filename`` will be ``"outfile"`` and ``options.verbose`` will be +``False``. ``optparse`` supports both long and short options, allows short +options to be merged together, and allows options to be associated with their +arguments in a variety of ways. 
Thus, the following command lines are all +equivalent to the above example:: + + <yourscript> -f outfile --quiet + <yourscript> --quiet --file outfile + <yourscript> -q -foutfile + <yourscript> -qfoutfile + +Additionally, users can run one of :: + + <yourscript> -h + <yourscript> --help + +and ``optparse`` will print out a brief summary of your script's options:: + + usage: <yourscript> [options] + + options: + -h, --help show this help message and exit + -f FILE, --file=FILE write report to FILE + -q, --quiet don't print status messages to stdout + +where the value of *yourscript* is determined at runtime (normally from +``sys.argv[0]``). + +.. % $Id: intro.txt 413 2004-09-28 00:59:13Z greg $ + + +.. _optparse-background: + +Background +---------- + +:mod:`optparse` was explicitly designed to encourage the creation of programs +with straightforward, conventional command-line interfaces. To that end, it +supports only the most common command-line syntax and semantics conventionally +used under Unix. If you are unfamiliar with these conventions, read this +section to acquaint yourself with them. + + +.. _optparse-terminology: + +Terminology +^^^^^^^^^^^ + +argument + a string entered on the command-line, and passed by the shell to ``execl()`` or + ``execv()``. In Python, arguments are elements of ``sys.argv[1:]`` + (``sys.argv[0]`` is the name of the program being executed). Unix shells also + use the term "word". + + It is occasionally desirable to substitute an argument list other than + ``sys.argv[1:]``, so you should read "argument" as "an element of + ``sys.argv[1:]``, or of some other list provided as a substitute for + ``sys.argv[1:]``". + +option + an argument used to supply extra information to guide or customize the execution + of a program. There are many different syntaxes for options; the traditional + Unix syntax is a hyphen ("-") followed by a single letter, e.g. ``"-x"`` or + ``"-F"``. 
Also, traditional Unix syntax allows multiple options to be merged + into a single argument, e.g. ``"-x -F"`` is equivalent to ``"-xF"``. The GNU + project introduced ``"--"`` followed by a series of hyphen-separated words, e.g. + ``"--file"`` or ``"--dry-run"``. These are the only two option syntaxes + provided by :mod:`optparse`. + + Some other option syntaxes that the world has seen include: + + * a hyphen followed by a few letters, e.g. ``"-pf"`` (this is *not* the same + as multiple options merged into a single argument) + + * a hyphen followed by a whole word, e.g. ``"-file"`` (this is technically + equivalent to the previous syntax, but they aren't usually seen in the same + program) + + * a plus sign followed by a single letter, or a few letters, or a word, e.g. + ``"+f"``, ``"+rgb"`` + + * a slash followed by a letter, or a few letters, or a word, e.g. ``"/f"``, + ``"/file"`` + + These option syntaxes are not supported by :mod:`optparse`, and they never will + be. This is deliberate: the first three are non-standard on any environment, + and the last only makes sense if you're exclusively targeting VMS, MS-DOS, + and/or Windows. + +option argument + an argument that follows an option, is closely associated with that option, and + is consumed from the argument list when that option is. With :mod:`optparse`, + option arguments may either be in a separate argument from their option:: + + -f foo + --file foo + + or included in the same argument:: + + -ffoo + --file=foo + + Typically, a given option either takes an argument or it doesn't. Lots of people + want an "optional option arguments" feature, meaning that some options will take + an argument if they see it, and won't if they don't. This is somewhat + controversial, because it makes parsing ambiguous: if ``"-a"`` takes an optional + argument and ``"-b"`` is another option entirely, how do we interpret ``"-ab"``? + Because of this ambiguity, :mod:`optparse` does not support this feature. 
+ +positional argument + something leftover in the argument list after options have been parsed, i.e. + after options and their arguments have been parsed and removed from the argument + list. + +required option + an option that must be supplied on the command-line; note that the phrase + "required option" is self-contradictory in English. :mod:`optparse` doesn't + prevent you from implementing required options, but doesn't give you much help + at it either. See ``examples/required_1.py`` and ``examples/required_2.py`` in + the :mod:`optparse` source distribution for two ways to implement required + options with :mod:`optparse`. + +For example, consider this hypothetical command-line:: + + prog -v --report /tmp/report.txt foo bar + +``"-v"`` and ``"--report"`` are both options. Assuming that :option:`--report` +takes one argument, ``"/tmp/report.txt"`` is an option argument. ``"foo"`` and +``"bar"`` are positional arguments. + + +.. _optparse-what-options-for: + +What are options for? +^^^^^^^^^^^^^^^^^^^^^ + +Options are used to provide extra information to tune or customize the execution +of a program. In case it wasn't clear, options are usually *optional*. A +program should be able to run just fine with no options whatsoever. (Pick a +random program from the Unix or GNU toolsets. Can it run without any options at +all and still make sense? The main exceptions are ``find``, ``tar``, and +``dd``\ ---all of which are mutant oddballs that have been rightly criticized +for their non-standard syntax and confusing interfaces.) + +Lots of people want their programs to have "required options". Think about it. +If it's required, then it's *not optional*! If there is a piece of information +that your program absolutely requires in order to run successfully, that's what +positional arguments are for. + +As an example of good command-line interface design, consider the humble ``cp`` +utility, for copying files. 
It doesn't make much sense to try to copy files +without supplying a destination and at least one source. Hence, ``cp`` fails if +you run it with no arguments. However, it has a flexible, useful syntax that +does not require any options at all:: + + cp SOURCE DEST + cp SOURCE ... DEST-DIR + +You can get pretty far with just that. Most ``cp`` implementations provide a +bunch of options to tweak exactly how the files are copied: you can preserve +mode and modification time, avoid following symlinks, ask before clobbering +existing files, etc. But none of this distracts from the core mission of +``cp``, which is to copy either one file to another, or several files to another +directory. + + +.. _optparse-what-positional-arguments-for: + +What are positional arguments for? +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Positional arguments are for those pieces of information that your program +absolutely, positively requires to run. + +A good user interface should have as few absolute requirements as possible. If +your program requires 17 distinct pieces of information in order to run +successfully, it doesn't much matter *how* you get that information from the +user---most people will give up and walk away before they successfully run the +program. This applies whether the user interface is a command-line, a +configuration file, or a GUI: if you make that many demands on your users, most +of them will simply give up. + +In short, try to minimize the amount of information that users are absolutely +required to supply---use sensible defaults whenever possible. Of course, you +also want to make your programs reasonably flexible. That's what options are +for. Again, it doesn't matter if they are entries in a config file, widgets in +the "Preferences" dialog of a GUI, or command-line options---the more options +you implement, the more flexible your program is, and the more complicated its +implementation becomes. 
Too much flexibility has drawbacks as well, of course; +too many options can overwhelm users and make your code much harder to maintain. + +.. % $Id: tao.txt 413 2004-09-28 00:59:13Z greg $ + + +.. _optparse-tutorial: + +Tutorial +-------- + +While :mod:`optparse` is quite flexible and powerful, it's also straightforward +to use in most cases. This section covers the code patterns that are common to +any :mod:`optparse`\ -based program. + +First, you need to import the OptionParser class; then, early in the main +program, create an OptionParser instance:: + + from optparse import OptionParser + [...] + parser = OptionParser() + +Then you can start defining options. The basic syntax is:: + + parser.add_option(opt_str, ..., + attr=value, ...) + +Each option has one or more option strings, such as ``"-f"`` or ``"--file"``, +and several option attributes that tell :mod:`optparse` what to expect and what +to do when it encounters that option on the command line. + +Typically, each option will have one short option string and one long option +string, e.g.:: + + parser.add_option("-f", "--file", ...) + +You're free to define as many short option strings and as many long option +strings as you like (including zero), as long as there is at least one option +string overall. + +The option strings passed to :meth:`add_option` are effectively labels for the +option defined by that call. For brevity, we will frequently refer to +*encountering an option* on the command line; in reality, :mod:`optparse` +encounters *option strings* and looks up options from them. + +Once all of your options are defined, instruct :mod:`optparse` to parse your +program's command line:: + + (options, args) = parser.parse_args() + +(If you like, you can pass a custom argument list to :meth:`parse_args`, but +that's rarely necessary: by default it uses ``sys.argv[1:]``.) + +:meth:`parse_args` returns two values: + +* ``options``, an object containing values for all of your options---e.g. 
if + ``"--file"`` takes a single string argument, then ``options.file`` will be the + filename supplied by the user, or ``None`` if the user did not supply that + option + +* ``args``, the list of positional arguments leftover after parsing options + +This tutorial section only covers the four most important option attributes: +:attr:`action`, :attr:`type`, :attr:`dest` (destination), and :attr:`help`. Of +these, :attr:`action` is the most fundamental. + + +.. _optparse-understanding-option-actions: + +Understanding option actions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Actions tell :mod:`optparse` what to do when it encounters an option on the +command line. There is a fixed set of actions hard-coded into :mod:`optparse`; +adding new actions is an advanced topic covered in section +:ref:`optparse-extending-optparse`. Most actions tell +:mod:`optparse` to store a value in some variable---for example, take a string +from the command line and store it in an attribute of ``options``. + +If you don't specify an option action, :mod:`optparse` defaults to ``store``. + + +.. _optparse-store-action: + +The store action +^^^^^^^^^^^^^^^^ + +The most common option action is ``store``, which tells :mod:`optparse` to take +the next argument (or the remainder of the current argument), ensure that it is +of the correct type, and store it to your chosen destination. + +For example:: + + parser.add_option("-f", "--file", + action="store", type="string", dest="filename") + +Now let's make up a fake command line and ask :mod:`optparse` to parse it:: + + args = ["-f", "foo.txt"] + (options, args) = parser.parse_args(args) + +When :mod:`optparse` sees the option string ``"-f"``, it consumes the next +argument, ``"foo.txt"``, and stores it in ``options.filename``. So, after this +call to :meth:`parse_args`, ``options.filename`` is ``"foo.txt"``. + +Some other option types supported by :mod:`optparse` are ``int`` and ``float``. 
+Here's an option that expects an integer argument:: + + parser.add_option("-n", type="int", dest="num") + +Note that this option has no long option string, which is perfectly acceptable. +Also, there's no explicit action, since the default is ``store``. + +Let's parse another fake command-line. This time, we'll jam the option argument +right up against the option: since ``"-n42"`` (one argument) is equivalent to +``"-n 42"`` (two arguments), the code :: + + (options, args) = parser.parse_args(["-n42"]) + print options.num + +will print ``"42"``. + +If you don't specify a type, :mod:`optparse` assumes ``string``. Combined with +the fact that the default action is ``store``, that means our first example can +be a lot shorter:: + + parser.add_option("-f", "--file", dest="filename") + +If you don't supply a destination, :mod:`optparse` figures out a sensible +default from the option strings: if the first long option string is +``"--foo-bar"``, then the default destination is ``foo_bar``. If there are no +long option strings, :mod:`optparse` looks at the first short option string: the +default destination for ``"-f"`` is ``f``. + +:mod:`optparse` also includes built-in ``long`` and ``complex`` types. Adding +types is covered in section :ref:`optparse-extending-optparse`. + + +.. _optparse-handling-boolean-options: + +Handling boolean (flag) options +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Flag options---set a variable to true or false when a particular option is seen +---are quite common. :mod:`optparse` supports them with two separate actions, +``store_true`` and ``store_false``. For example, you might have a ``verbose`` +flag that is turned on with ``"-v"`` and off with ``"-q"``:: + + parser.add_option("-v", action="store_true", dest="verbose") + parser.add_option("-q", action="store_false", dest="verbose") + +Here we have two different options with the same destination, which is perfectly +OK. 
(It just means you have to be a bit careful when setting default values--- +see below.) + +When :mod:`optparse` encounters ``"-v"`` on the command line, it sets +``options.verbose`` to ``True``; when it encounters ``"-q"``, +``options.verbose`` is set to ``False``. + + +.. _optparse-other-actions: + +Other actions +^^^^^^^^^^^^^ + +Some other actions supported by :mod:`optparse` are: + +``store_const`` + store a constant value + +``append`` + append this option's argument to a list + +``count`` + increment a counter by one + +``callback`` + call a specified function + +These are covered in section :ref:`optparse-reference-guide`, Reference Guide +and section :ref:`optparse-option-callbacks`. + + +.. _optparse-default-values: + +Default values +^^^^^^^^^^^^^^ + +All of the above examples involve setting some variable (the "destination") when +certain command-line options are seen. What happens if those options are never +seen? Since we didn't supply any defaults, they are all set to ``None``. This +is usually fine, but sometimes you want more control. :mod:`optparse` lets you +supply a default value for each destination, which is assigned before the +command line is parsed. + +First, consider the verbose/quiet example. 
If we want :mod:`optparse` to set +``verbose`` to ``True`` unless ``"-q"`` is seen, then we can do this:: + + parser.add_option("-v", action="store_true", dest="verbose", default=True) + parser.add_option("-q", action="store_false", dest="verbose") + +Since default values apply to the *destination* rather than to any particular +option, and these two options happen to have the same destination, this is +exactly equivalent:: + + parser.add_option("-v", action="store_true", dest="verbose") + parser.add_option("-q", action="store_false", dest="verbose", default=True) + +Consider this:: + + parser.add_option("-v", action="store_true", dest="verbose", default=False) + parser.add_option("-q", action="store_false", dest="verbose", default=True) + +Again, the default value for ``verbose`` will be ``True``: the last default +value supplied for any particular destination is the one that counts. + +A clearer way to specify default values is the :meth:`set_defaults` method of +OptionParser, which you can call at any time before calling :meth:`parse_args`:: + + parser.set_defaults(verbose=True) + parser.add_option(...) + (options, args) = parser.parse_args() + +As before, the last value specified for a given option destination is the one +that counts. For clarity, try to use one method or the other of setting default +values, not both. + + +.. _optparse-generating-help: + +Generating help +^^^^^^^^^^^^^^^ + +:mod:`optparse`'s ability to generate help and usage text automatically is +useful for creating user-friendly command-line interfaces. All you have to do +is supply a :attr:`help` value for each option, and optionally a short usage +message for your whole program. 
Here's an OptionParser populated with +user-friendly (documented) options:: + + usage = "usage: %prog [options] arg1 arg2" + parser = OptionParser(usage=usage) + parser.add_option("-v", "--verbose", + action="store_true", dest="verbose", default=True, + help="make lots of noise [default]") + parser.add_option("-q", "--quiet", + action="store_false", dest="verbose", + help="be vewwy quiet (I'm hunting wabbits)") + parser.add_option("-f", "--filename", + metavar="FILE", help="write output to FILE") + parser.add_option("-m", "--mode", + default="intermediate", + help="interaction mode: novice, intermediate, " + "or expert [default: %default]") + +If :mod:`optparse` encounters either ``"-h"`` or ``"--help"`` on the +command-line, or if you just call :meth:`parser.print_help`, it prints the +following to standard output:: + + usage: <yourscript> [options] arg1 arg2 + + options: + -h, --help show this help message and exit + -v, --verbose make lots of noise [default] + -q, --quiet be vewwy quiet (I'm hunting wabbits) + -f FILE, --filename=FILE + write output to FILE + -m MODE, --mode=MODE interaction mode: novice, intermediate, or + expert [default: intermediate] + +(If the help output is triggered by a help option, :mod:`optparse` exits after +printing the help text.) + +There's a lot going on here to help :mod:`optparse` generate the best possible +help message: + +* the script defines its own usage message:: + + usage = "usage: %prog [options] arg1 arg2" + + :mod:`optparse` expands ``"%prog"`` in the usage string to the name of the + current program, i.e. ``os.path.basename(sys.argv[0])``. The expanded string is + then printed before the detailed option help. + + If you don't supply a usage string, :mod:`optparse` uses a bland but sensible + default: ``"usage: %prog [options]"``, which is fine if your script doesn't take + any positional arguments.
+ +* every option defines a help string, and doesn't worry about line-wrapping--- + :mod:`optparse` takes care of wrapping lines and making the help output look + good. + +* options that take a value indicate this fact in their automatically-generated + help message, e.g. for the "mode" option:: + + -m MODE, --mode=MODE + + Here, "MODE" is called the meta-variable: it stands for the argument that the + user is expected to supply to :option:`-m`/:option:`--mode`. By default, + :mod:`optparse` converts the destination variable name to uppercase and uses + that for the meta-variable. Sometimes, that's not what you want---for example, + the :option:`--filename` option explicitly sets ``metavar="FILE"``, resulting in + this automatically-generated option description:: + + -f FILE, --filename=FILE + + This is important for more than just saving space, though: the manually written + help text uses the meta-variable "FILE" to clue the user in that there's a + connection between the semi-formal syntax "-f FILE" and the informal semantic + description "write output to FILE". This is a simple but effective way to make + your help text a lot clearer and more useful for end users. + +* options that have a default value can include ``%default`` in the help + string---\ :mod:`optparse` will replace it with :func:`str` of the option's + default value. If an option has no default value (or the default value is + ``None``), ``%default`` expands to ``none``. + + +.. _optparse-printing-version-string: + +Printing a version string +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Similar to the brief usage string, :mod:`optparse` can also print a version +string for your program. You have to supply the string as the ``version`` +argument to OptionParser:: + + parser = OptionParser(usage="%prog [-f] [-q]", version="%prog 1.0") + +``"%prog"`` is expanded just like it is in ``usage``. Apart from that, +``version`` can contain anything you like. 
When you supply it, :mod:`optparse` +automatically adds a ``"--version"`` option to your parser. If it encounters +this option on the command line, it expands your ``version`` string (by +replacing ``"%prog"``), prints it to stdout, and exits. + +For example, if your script is called ``/usr/bin/foo``:: + + $ /usr/bin/foo --version + foo 1.0 + + +.. _optparse-how-optparse-handles-errors: + +How :mod:`optparse` handles errors +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There are two broad classes of errors that :mod:`optparse` has to worry about: +programmer errors and user errors. Programmer errors are usually erroneous +calls to ``parser.add_option()``, e.g. invalid option strings, unknown option +attributes, missing option attributes, etc. These are dealt with in the usual +way: raise an exception (either ``optparse.OptionError`` or ``TypeError``) and +let the program crash. + +Handling user errors is much more important, since they are guaranteed to happen +no matter how stable your code is. :mod:`optparse` can automatically detect +some user errors, such as bad option arguments (passing ``"-n 4x"`` where +:option:`-n` takes an integer argument), missing arguments (``"-n"`` at the end +of the command line, where :option:`-n` takes an argument of any type). Also, +you can call ``parser.error()`` to signal an application-defined error +condition:: + + (options, args) = parser.parse_args() + [...] + if options.a and options.b: + parser.error("options -a and -b are mutually exclusive") + +In either case, :mod:`optparse` handles the error the same way: it prints the +program's usage message and an error message to standard error and exits with +error status 2. 
+ +Consider the first example above, where the user passes ``"4x"`` to an option +that takes an integer:: + + $ /usr/bin/foo -n 4x + usage: foo [options] + + foo: error: option -n: invalid integer value: '4x' + +Or, where the user fails to pass a value at all:: + + $ /usr/bin/foo -n + usage: foo [options] + + foo: error: -n option requires an argument + +:mod:`optparse`\ -generated error messages take care always to mention the +option involved in the error; be sure to do the same when calling +``parser.error()`` from your application code. + +If :mod:`optparse`'s default error-handling behaviour does not suit your needs, +you'll need to subclass OptionParser and override ``exit()`` and/or +:meth:`error`. + + +.. _optparse-putting-it-all-together: + +Putting it all together +^^^^^^^^^^^^^^^^^^^^^^^ + +Here's what :mod:`optparse`\ -based scripts usually look like:: + + from optparse import OptionParser + [...] + def main(): + usage = "usage: %prog [options] arg" + parser = OptionParser(usage) + parser.add_option("-f", "--file", dest="filename", + help="read data from FILENAME") + parser.add_option("-v", "--verbose", + action="store_true", dest="verbose") + parser.add_option("-q", "--quiet", + action="store_false", dest="verbose") + [...] + (options, args) = parser.parse_args() + if len(args) != 1: + parser.error("incorrect number of arguments") + if options.verbose: + print "reading %s..." % options.filename + [...] + + if __name__ == "__main__": + main() + +.. % $Id: tutorial.txt 515 2006-06-10 15:37:45Z gward $ + + +.. _optparse-reference-guide: + +Reference Guide +--------------- + + +.. _optparse-creating-parser: + +Creating the parser +^^^^^^^^^^^^^^^^^^^ + +The first step in using :mod:`optparse` is to create an OptionParser instance:: + + parser = OptionParser(...) + +The OptionParser constructor has no required arguments, but a number of optional +keyword arguments. You should always pass them as keyword arguments, i.e.
do +not rely on the order in which the arguments are declared. + + ``usage`` (default: ``"%prog [options]"``) + The usage summary to print when your program is run incorrectly or with a help + option. When :mod:`optparse` prints the usage string, it expands ``%prog`` to + ``os.path.basename(sys.argv[0])`` (or to ``prog`` if you passed that keyword + argument). To suppress a usage message, pass the special value + ``optparse.SUPPRESS_USAGE``. + + ``option_list`` (default: ``[]``) + A list of Option objects to populate the parser with. The options in + ``option_list`` are added after any options in ``standard_option_list`` (a class + attribute that may be set by OptionParser subclasses), but before any version or + help options. Deprecated; use :meth:`add_option` after creating the parser + instead. + + ``option_class`` (default: optparse.Option) + Class to use when adding options to the parser in :meth:`add_option`. + + ``version`` (default: ``None``) + A version string to print when the user supplies a version option. If you supply + a true value for ``version``, :mod:`optparse` automatically adds a version + option with the single option string ``"--version"``. The substring ``"%prog"`` + is expanded the same as for ``usage``. + + ``conflict_handler`` (default: ``"error"``) + Specifies what to do when options with conflicting option strings are added to + the parser; see section :ref:`optparse-conflicts-between-options`. + + ``description`` (default: ``None``) + A paragraph of text giving a brief overview of your program. :mod:`optparse` + reformats this paragraph to fit the current terminal width and prints it when + the user requests help (after ``usage``, but before the list of options). + + ``formatter`` (default: a new IndentedHelpFormatter) + An instance of optparse.HelpFormatter that will be used for printing help text. + :mod:`optparse` provides two concrete classes for this purpose: + IndentedHelpFormatter and TitledHelpFormatter. 
+ + ``add_help_option`` (default: ``True``) + If true, :mod:`optparse` will add a help option (with option strings ``"-h"`` + and ``"--help"``) to the parser. + + ``prog`` + The string to use when expanding ``"%prog"`` in ``usage`` and ``version`` + instead of ``os.path.basename(sys.argv[0])``. + + + +.. _optparse-populating-parser: + +Populating the parser +^^^^^^^^^^^^^^^^^^^^^ + +There are several ways to populate the parser with options. The preferred way +is by using ``OptionParser.add_option()``, as shown in section +:ref:`optparse-tutorial`. :meth:`add_option` can be called in one of two ways: + +* pass it an Option instance (as returned by :func:`make_option`) + +* pass it any combination of positional and keyword arguments that are + acceptable to :func:`make_option` (i.e., to the Option constructor), and it will + create the Option instance for you + +The other alternative is to pass a list of pre-constructed Option instances to +the OptionParser constructor, as in:: + + option_list = [ + make_option("-f", "--filename", + action="store", type="string", dest="filename"), + make_option("-q", "--quiet", + action="store_false", dest="verbose"), + ] + parser = OptionParser(option_list=option_list) + +(:func:`make_option` is a factory function for creating Option instances; +currently it is an alias for the Option constructor. A future version of +:mod:`optparse` may split Option into several classes, and :func:`make_option` +will pick the right class to instantiate. Do not instantiate Option directly.) + + +.. _optparse-defining-options: + +Defining options +^^^^^^^^^^^^^^^^ + +Each Option instance represents a set of synonymous command-line option strings, +e.g. :option:`-f` and :option:`--file`. You can specify any number of short or +long option strings, but you must specify at least one overall option string. 
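As a small illustration of synonymous option strings and of the "at least one option string" rule, here is a hedged sketch, assuming a current Python 3 interpreter. The option names and the variable names (``same_option``, ``no_string_error``, ``bad_string_error``) are made up for the example; only ``OptionParser``, ``make_option``, ``get_option`` and ``OptionError`` come from :mod:`optparse`.

```python
from optparse import OptionParser, OptionError, make_option

parser = OptionParser()

# One Option object answers to three synonymous option strings.
parser.add_option("-f", "--file", "--filename", dest="filename")
same_option = parser.get_option("-f") is parser.get_option("--filename")

# Any of the synonyms stores to the same destination.
options, args = parser.parse_args(["--filename", "data.txt"])
stored = options.filename

# No option string at all: the Option constructor rejects it.
try:
    make_option(dest="filename")
    no_string_error = None
except TypeError as exc:
    no_string_error = str(exc)

# A malformed option string (no leading dash) is a programmer error too.
try:
    make_option("file", dest="filename")
    bad_string_error = None
except OptionError as exc:
    bad_string_error = str(exc)

if __name__ == "__main__":
    print(same_option, stored)
    print(no_string_error)
    print(bad_string_error)
```

Both failures are programmer errors in the sense used earlier in this document: they surface as exceptions when the option is defined, not as messages to the end user.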
+ +The canonical way to create an Option instance is with the :meth:`add_option` +method of :class:`OptionParser`:: + + parser.add_option(opt_str[, ...], attr=value, ...) + +To define an option with only a short option string:: + + parser.add_option("-f", attr=value, ...) + +And to define an option with only a long option string:: + + parser.add_option("--foo", attr=value, ...) + +The keyword arguments define attributes of the new Option object. The most +important option attribute is :attr:`action`, and it largely determines which +other attributes are relevant or required. If you pass irrelevant option +attributes, or fail to pass required ones, :mod:`optparse` raises an OptionError +exception explaining your mistake. + +An option's *action* determines what :mod:`optparse` does when it encounters +this option on the command line. The standard option actions hard-coded into +:mod:`optparse` are: + +``store`` + store this option's argument (default) + +``store_const`` + store a constant value + +``store_true`` + store a true value + +``store_false`` + store a false value + +``append`` + append this option's argument to a list + +``append_const`` + append a constant value to a list + +``count`` + increment a counter by one + +``callback`` + call a specified function + +:attr:`help` + print a usage message including all options and the documentation for them + +(If you don't supply an action, the default is ``store``. For this action, you +may also supply :attr:`type` and :attr:`dest` option attributes; see below.) + +As you can see, most actions involve storing or updating a value somewhere. +:mod:`optparse` always creates a special object for this, conventionally called +``options`` (it happens to be an instance of ``optparse.Values``). Option +arguments (and various other values) are stored as attributes of this object, +according to the :attr:`dest` (destination) option attribute. 
+ +For example, when you call :: + + parser.parse_args() + +one of the first things :mod:`optparse` does is create the ``options`` object:: + + options = Values() + +If one of the options in this parser is defined with :: + + parser.add_option("-f", "--file", action="store", type="string", dest="filename") + +and the command-line being parsed includes any of the following:: + + -ffoo + -f foo + --file=foo + --file foo + +then :mod:`optparse`, on seeing this option, will do the equivalent of :: + + options.filename = "foo" + +The :attr:`type` and :attr:`dest` option attributes are almost as important as +:attr:`action`, but :attr:`action` is the only one that makes sense for *all* +options. + + +.. _optparse-standard-option-actions: + +Standard option actions +^^^^^^^^^^^^^^^^^^^^^^^ + +The various option actions all have slightly different requirements and effects. +Most actions have several relevant option attributes which you may specify to +guide :mod:`optparse`'s behaviour; a few have required attributes, which you +must specify for any option using that action. + +* ``store`` [relevant: :attr:`type`, :attr:`dest`, ``nargs``, ``choices``] + + The option must be followed by an argument, which is converted to a value + according to :attr:`type` and stored in :attr:`dest`. If ``nargs`` > 1, + multiple arguments will be consumed from the command line; all will be converted + according to :attr:`type` and stored to :attr:`dest` as a tuple. See the + "Option types" section below. + + If ``choices`` is supplied (a list or tuple of strings), the type defaults to + ``choice``. + + If :attr:`type` is not supplied, it defaults to ``string``. + + If :attr:`dest` is not supplied, :mod:`optparse` derives a destination from the + first long option string (e.g., ``"--foo-bar"`` implies ``foo_bar``). If there + are no long option strings, :mod:`optparse` derives a destination from the first + short option string (e.g., ``"-f"`` implies ``f``). 
+ + Example:: + + parser.add_option("-f") + parser.add_option("-p", type="float", nargs=3, dest="point") + + As it parses the command line :: + + -f foo.txt -p 1 -3.5 4 -fbar.txt + + :mod:`optparse` will set :: + + options.f = "foo.txt" + options.point = (1.0, -3.5, 4.0) + options.f = "bar.txt" + +* ``store_const`` [required: ``const``; relevant: :attr:`dest`] + + The value ``const`` is stored in :attr:`dest`. + + Example:: + + parser.add_option("-q", "--quiet", + action="store_const", const=0, dest="verbose") + parser.add_option("-v", "--verbose", + action="store_const", const=1, dest="verbose") + parser.add_option("--noisy", + action="store_const", const=2, dest="verbose") + + If ``"--noisy"`` is seen, :mod:`optparse` will set :: + + options.verbose = 2 + +* ``store_true`` [relevant: :attr:`dest`] + + A special case of ``store_const`` that stores a true value to :attr:`dest`. + +* ``store_false`` [relevant: :attr:`dest`] + + Like ``store_true``, but stores a false value. + + Example:: + + parser.add_option("--clobber", action="store_true", dest="clobber") + parser.add_option("--no-clobber", action="store_false", dest="clobber") + +* ``append`` [relevant: :attr:`type`, :attr:`dest`, ``nargs``, ``choices``] + + The option must be followed by an argument, which is appended to the list in + :attr:`dest`. If no default value for :attr:`dest` is supplied, an empty list + is automatically created when :mod:`optparse` first encounters this option on + the command-line. If ``nargs`` > 1, multiple arguments are consumed, and a + tuple of length ``nargs`` is appended to :attr:`dest`. + + The defaults for :attr:`type` and :attr:`dest` are the same as for the ``store`` + action. 
+ + Example:: + + parser.add_option("-t", "--tracks", action="append", type="int") + + If ``"-t3"`` is seen on the command-line, :mod:`optparse` does the equivalent + of:: + + options.tracks = [] + options.tracks.append(int("3")) + + If, a little later on, ``"--tracks=4"`` is seen, it does:: + + options.tracks.append(int("4")) + +* ``append_const`` [required: ``const``; relevant: :attr:`dest`] + + Like ``store_const``, but the value ``const`` is appended to :attr:`dest`; as + with ``append``, :attr:`dest` defaults to ``None``, and an empty list is + automatically created the first time the option is encountered. + +* ``count`` [relevant: :attr:`dest`] + + Increment the integer stored at :attr:`dest`. If no default value is supplied, + :attr:`dest` is set to zero before being incremented the first time. + + Example:: + + parser.add_option("-v", action="count", dest="verbosity") + + The first time ``"-v"`` is seen on the command line, :mod:`optparse` does the + equivalent of:: + + options.verbosity = 0 + options.verbosity += 1 + + Every subsequent occurrence of ``"-v"`` results in :: + + options.verbosity += 1 + +* ``callback`` [required: ``callback``; relevant: :attr:`type`, ``nargs``, + ``callback_args``, ``callback_kwargs``] + + Call the function specified by ``callback``, which is called as :: + + func(option, opt_str, value, parser, *args, **kwargs) + + See section :ref:`optparse-option-callbacks` for more detail. + +* :attr:`help` + + Prints a complete help message for all the options in the current option parser. + The help message is constructed from the ``usage`` string passed to + OptionParser's constructor and the :attr:`help` string passed to every option. + + If no :attr:`help` string is supplied for an option, it will still be listed in + the help message. To omit an option entirely, use the special value + ``optparse.SUPPRESS_HELP``. 
+ + :mod:`optparse` automatically adds a :attr:`help` option to all OptionParsers, + so you do not normally need to create one. + + Example:: + + from optparse import OptionParser, SUPPRESS_HELP + + # a help option is normally added automatically; suppress it here so + # that adding one by hand does not conflict with it + parser = OptionParser(add_help_option=False) + parser.add_option("-h", "--help", action="help") + parser.add_option("-v", action="store_true", dest="verbose", + help="Be moderately verbose") + parser.add_option("--file", dest="filename", + help="Input file to read data from") + parser.add_option("--secret", help=SUPPRESS_HELP) + + If :mod:`optparse` sees either ``"-h"`` or ``"--help"`` on the command line, it + will print something like the following help message to stdout (assuming + ``sys.argv[0]`` is ``"foo.py"``):: + + usage: foo.py [options] + + options: + -h, --help Show this help message and exit + -v Be moderately verbose + --file=FILENAME Input file to read data from + + After printing the help message, :mod:`optparse` terminates your process with + ``sys.exit(0)``. + +* ``version`` + + Prints the version number supplied to the OptionParser to stdout and exits. The + version number is actually formatted and printed by the ``print_version()`` + method of OptionParser. Generally only relevant if the ``version`` argument is + supplied to the OptionParser constructor. As with :attr:`help` options, you + will rarely create ``version`` options, since :mod:`optparse` automatically adds + them when needed. + + +.. _optparse-option-attributes: + +Option attributes +^^^^^^^^^^^^^^^^^ + +The following option attributes may be passed as keyword arguments to +``parser.add_option()``. If you pass an option attribute that is not relevant +to a particular option, or fail to pass a required option attribute, +:mod:`optparse` raises OptionError. + +* :attr:`action` (default: ``"store"``) + + Determines :mod:`optparse`'s behaviour when this option is seen on the command + line; the available actions are documented above. 
+ +* :attr:`type` (default: ``"string"``) + + The argument type expected by this option (e.g., ``"string"`` or ``"int"``); the + available option types are documented below. + +* :attr:`dest` (default: derived from option strings) + + If the option's action implies writing or modifying a value somewhere, this + tells :mod:`optparse` where to write it: :attr:`dest` names an attribute of the + ``options`` object that :mod:`optparse` builds as it parses the command line. + +* ``default`` (deprecated) + + The value to use for this option's destination if the option is not seen on the + command line. Deprecated; use ``parser.set_defaults()`` instead. + +* ``nargs`` (default: 1) + + How many arguments of type :attr:`type` should be consumed when this option is + seen. If > 1, :mod:`optparse` will store a tuple of values to :attr:`dest`. + +* ``const`` + + For actions that store a constant value, the constant value to store. + +* ``choices`` + + For options of type ``"choice"``, the list of strings the user may choose from. + +* ``callback`` + + For options with action ``"callback"``, the callable to call when this option + is seen. See section :ref:`optparse-option-callbacks` for detail on the + arguments passed to the callable. + +* ``callback_args``, ``callback_kwargs`` + + Additional positional and keyword arguments to pass to ``callback`` after the + four standard callback arguments. + +* :attr:`help` + + Help text to print for this option when listing all available options after the + user supplies a :attr:`help` option (such as ``"--help"``). If no help text is + supplied, the option will be listed without help text. To hide this option, use + the special value ``SUPPRESS_HELP``. + +* ``metavar`` (default: derived from option strings) + + Stand-in for the option argument(s) to use when printing help text. See section + :ref:`optparse-tutorial` for an example. + + +.. 
_optparse-standard-option-types: + +Standard option types +^^^^^^^^^^^^^^^^^^^^^ + +:mod:`optparse` has six built-in option types: ``string``, ``int``, ``long``, +``choice``, ``float`` and ``complex``. If you need to add new option types, see +section :ref:`optparse-extending-optparse`. + +Arguments to string options are not checked or converted in any way: the text on +the command line is stored in the destination (or passed to the callback) as-is. + +Integer arguments (type ``int`` or ``long``) are parsed as follows: + +* if the number starts with ``0x``, it is parsed as a hexadecimal number + +* if the number starts with ``0``, it is parsed as an octal number + +* if the number starts with ``0b``, it is parsed as a binary number + +* otherwise, the number is parsed as a decimal number + + +The conversion is done by calling either ``int()`` or ``long()`` with the +appropriate base (2, 8, 10, or 16). If this fails, so will :mod:`optparse`, +although with a more useful error message. + +``float`` and ``complex`` option arguments are converted directly with +``float()`` and ``complex()``, with similar error-handling. + +``choice`` options are a subtype of ``string`` options. The ``choices`` option +attribute (a sequence of strings) defines the set of allowed option arguments. +``optparse.check_choice()`` compares user-supplied option arguments against this +master list and raises OptionValueError if an invalid string is given. + + +.. 
_optparse-parsing-arguments: + +Parsing arguments +^^^^^^^^^^^^^^^^^ + +The whole point of creating and populating an OptionParser is to call its +:meth:`parse_args` method:: + + (options, args) = parser.parse_args(args=None, values=None) + +where the input parameters are + +``args`` + the list of arguments to process (default: ``sys.argv[1:]``) + +``values`` + object to store option arguments in (default: a new instance of optparse.Values) + +and the return values are + +``options`` + the same object that was passed in as ``values``, or the optparse.Values + instance created by :mod:`optparse` + +``args`` + the leftover positional arguments after all options have been processed + +The most common usage is to supply neither keyword argument. If you supply +``values``, it will be modified with repeated ``setattr()`` calls (roughly one +for every option argument stored to an option destination) and returned by +:meth:`parse_args`. + +If :meth:`parse_args` encounters any errors in the argument list, it calls the +OptionParser's :meth:`error` method with an appropriate end-user error message. +This ultimately terminates your process with an exit status of 2 (the +traditional Unix exit status for command-line errors). + + +.. _optparse-querying-manipulating-option-parser: + +Querying and manipulating your option parser +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes, it's useful to poke around your option parser and see what's there. +OptionParser provides a couple of methods to help you out: + +``has_option(opt_str)`` + Return true if the OptionParser has an option with option string ``opt_str`` + (e.g., ``"-q"`` or ``"--verbose"``). + +``get_option(opt_str)`` + Returns the Option instance with the option string ``opt_str``, or ``None`` if + no options have that option string. + +``remove_option(opt_str)`` + If the OptionParser has an option corresponding to ``opt_str``, that option is + removed. 
If that option provided any other option strings, all of those option + strings become invalid. If ``opt_str`` does not occur in any option belonging to + this OptionParser, raises ValueError. + + +.. _optparse-conflicts-between-options: + +Conflicts between options +^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you're not careful, it's easy to define options with conflicting option +strings:: + + parser.add_option("-n", "--dry-run", ...) + [...] + parser.add_option("-n", "--noisy", ...) + +(This is particularly true if you've defined your own OptionParser subclass with +some standard options.) + +Every time you add an option, :mod:`optparse` checks for conflicts with existing +options. If it finds any, it invokes the current conflict-handling mechanism. +You can set the conflict-handling mechanism either in the constructor:: + + parser = OptionParser(..., conflict_handler=handler) + +or with a separate call:: + + parser.set_conflict_handler(handler) + +The available conflict handlers are: + + ``error`` (default) + assume option conflicts are a programming error and raise OptionConflictError + + ``resolve`` + resolve option conflicts intelligently (see below) + + +As an example, let's define an OptionParser that resolves conflicts +intelligently and add conflicting options to it:: + + parser = OptionParser(conflict_handler="resolve") + parser.add_option("-n", "--dry-run", ..., help="do no harm") + parser.add_option("-n", "--noisy", ..., help="be noisy") + +At this point, :mod:`optparse` detects that a previously-added option is already +using the ``"-n"`` option string. Since ``conflict_handler`` is ``"resolve"``, +it resolves the situation by removing ``"-n"`` from the earlier option's list of +option strings. Now ``"--dry-run"`` is the only way for the user to activate +that option. If the user asks for help, the help message will reflect that:: + + options: + --dry-run do no harm + [...] 
+ -n, --noisy be noisy + +It's possible to whittle away the option strings for a previously-added option +until there are none left, and the user has no way of invoking that option from +the command-line. In that case, :mod:`optparse` removes that option completely, +so it doesn't show up in help text or anywhere else. Carrying on with our +existing OptionParser:: + + parser.add_option("--dry-run", ..., help="new dry-run option") + +At this point, the original :option:`-n/--dry-run` option is no longer +accessible, so :mod:`optparse` removes it, leaving this help text:: + + options: + [...] + -n, --noisy be noisy + --dry-run new dry-run option + + +.. _optparse-cleanup: + +Cleanup +^^^^^^^ + +OptionParser instances have several cyclic references. This should not be a +problem for Python's garbage collector, but you may wish to break the cyclic +references explicitly by calling ``destroy()`` on your OptionParser once you are +done with it. This is particularly useful in long-running applications where +large object graphs are reachable from your OptionParser. + + +.. _optparse-other-methods: + +Other methods +^^^^^^^^^^^^^ + +OptionParser supports several other public methods: + +* ``set_usage(usage)`` + + Set the usage string according to the rules described above for the ``usage`` + constructor keyword argument. Passing ``None`` sets the default usage string; + use ``SUPPRESS_USAGE`` to suppress a usage message. + +* ``enable_interspersed_args()``, ``disable_interspersed_args()`` + + Enable/disable positional arguments interspersed with options, similar to GNU + getopt (enabled by default). For example, if ``"-a"`` and ``"-b"`` are both + simple options that take no arguments, :mod:`optparse` normally accepts this + syntax:: + + prog -a arg1 -b arg2 + + and treats it as equivalent to :: + + prog -a -b arg1 arg2 + + To disable this feature, call ``disable_interspersed_args()``. 
This restores + traditional Unix syntax, where option parsing stops with the first non-option + argument. + +* ``set_defaults(dest=value, ...)`` + + Set default values for several option destinations at once. Using + :meth:`set_defaults` is the preferred way to set default values for options, + since multiple options can share the same destination. For example, if several + "mode" options all set the same destination, any one of them can set the + default, and the last one wins:: + + parser.add_option("--advanced", action="store_const", + dest="mode", const="advanced", + default="novice") # overridden below + parser.add_option("--novice", action="store_const", + dest="mode", const="novice", + default="advanced") # overrides above setting + + To avoid this confusion, use :meth:`set_defaults`:: + + parser.set_defaults(mode="advanced") + parser.add_option("--advanced", action="store_const", + dest="mode", const="advanced") + parser.add_option("--novice", action="store_const", + dest="mode", const="novice") + +.. % $Id: reference.txt 519 2006-06-11 14:39:11Z gward $ + + +.. _optparse-option-callbacks: + +Option Callbacks +---------------- + +When :mod:`optparse`'s built-in actions and types aren't quite enough for your +needs, you have two choices: extend :mod:`optparse` or define a callback option. +Extending :mod:`optparse` is more general, but overkill for a lot of simple +cases. Quite often a simple callback is all you need. + +There are two steps to defining a callback option: + +* define the option itself using the ``callback`` action + +* write the callback; this is a function (or method) that takes at least four + arguments, as described below + + +.. _optparse-defining-callback-option: + +Defining a callback option +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +As always, the easiest way to define a callback option is by using the +``parser.add_option()`` method. 
Apart from :attr:`action`, the only option +attribute you must specify is ``callback``, the function to call:: + + parser.add_option("-c", action="callback", callback=my_callback) + +``callback`` is a function (or other callable object), so you must have already +defined ``my_callback()`` when you create this callback option. In this simple +case, :mod:`optparse` doesn't even know if :option:`-c` takes any arguments, +which usually means that the option takes no arguments---the mere presence of +:option:`-c` on the command-line is all it needs to know. In some +circumstances, though, you might want your callback to consume an arbitrary +number of command-line arguments. This is where writing callbacks gets tricky; +it's covered later in this section. + +:mod:`optparse` always passes four particular arguments to your callback, and it +will only pass additional arguments if you specify them via ``callback_args`` +and ``callback_kwargs``. Thus, the minimal callback function signature is:: + + def my_callback(option, opt, value, parser): + +The four arguments to a callback are described below. + +There are several other option attributes that you can supply when you define a +callback option: + +:attr:`type` + has its usual meaning: as with the ``store`` or ``append`` actions, it instructs + :mod:`optparse` to consume one argument and convert it to :attr:`type`. Rather + than storing the converted value(s) anywhere, though, :mod:`optparse` passes it + to your callback function. + +``nargs`` + also has its usual meaning: if it is supplied and > 1, :mod:`optparse` will + consume ``nargs`` arguments, each of which must be convertible to :attr:`type`. + It then passes a tuple of converted values to your callback. + +``callback_args`` + a tuple of extra positional arguments to pass to the callback + +``callback_kwargs`` + a dictionary of extra keyword arguments to pass to the callback + + +.. 
_optparse-how-callbacks-called: + +How callbacks are called +^^^^^^^^^^^^^^^^^^^^^^^^ + +All callbacks are called as follows:: + + func(option, opt_str, value, parser, *args, **kwargs) + +where + +``option`` + is the Option instance that's calling the callback + +``opt_str`` + is the option string seen on the command-line that's triggering the callback. + (If an abbreviated long option was used, ``opt_str`` will be the full, canonical + option string---e.g. if the user puts ``"--foo"`` on the command-line as an + abbreviation for ``"--foobar"``, then ``opt_str`` will be ``"--foobar"``.) + +``value`` + is the argument to this option seen on the command-line. :mod:`optparse` will + only expect an argument if :attr:`type` is set; the type of ``value`` will be + the type implied by the option's type. If :attr:`type` for this option is + ``None`` (no argument expected), then ``value`` will be ``None``. If ``nargs`` + > 1, ``value`` will be a tuple of values of the appropriate type. + +``parser`` + is the OptionParser instance driving the whole thing, mainly useful because you + can access some other interesting data through its instance attributes: + + ``parser.largs`` + the current list of leftover arguments, i.e. arguments that have been consumed + but are neither options nor option arguments. Feel free to modify + ``parser.largs``, e.g. by adding more arguments to it. (This list will become + ``args``, the second return value of :meth:`parse_args`.) + + ``parser.rargs`` + the current list of remaining arguments, i.e. with ``opt_str`` and ``value`` (if + applicable) removed, and only the arguments following them still there. Feel + free to modify ``parser.rargs``, e.g. by consuming more arguments. + + ``parser.values`` + the object where option values are by default stored (an instance of + optparse.Values). 
This lets callbacks use the same mechanism as the rest + of :mod:`optparse` for storing option values; you don't need to mess around with + globals or closures. You can also access or modify the value(s) of any options + already encountered on the command-line. + +``args`` + is a tuple of arbitrary positional arguments supplied via the ``callback_args`` + option attribute. + +``kwargs`` + is a dictionary of arbitrary keyword arguments supplied via ``callback_kwargs``. + + +.. _optparse-raising-errors-in-callback: + +Raising errors in a callback +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The callback function should raise OptionValueError if there are any problems +with the option or its argument(s). :mod:`optparse` catches this and terminates +the program, printing the error message you supply to stderr. Your message +should be clear, concise, accurate, and mention the option at fault. Otherwise, +the user will have a hard time figuring out what he did wrong. + + +.. _optparse-callback-example-1: + +Callback example 1: trivial callback +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Here's an example of a callback option that takes no arguments, and simply +records that the option was seen:: + + def record_foo_seen(option, opt_str, value, parser): + parser.saw_foo = True + + parser.add_option("--foo", action="callback", callback=record_foo_seen) + +Of course, you could do that with the ``store_true`` action. + + +.. _optparse-callback-example-2: + +Callback example 2: check option order +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Here's a slightly more interesting example: record the fact that ``"-a"`` is +seen, but blow up if it comes after ``"-b"`` in the command-line. :: + + def check_order(option, opt_str, value, parser): + if parser.values.b: + raise OptionValueError("can't use -a after -b") + parser.values.a = 1 + [...] + parser.add_option("-a", action="callback", callback=check_order) + parser.add_option("-b", action="store_true", dest="b") + + +.. 
_optparse-callback-example-3: + +Callback example 3: check option order (generalized) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you want to re-use this callback for several similar options (set a flag, but +blow up if ``"-b"`` has already been seen), it needs a bit of work: the error +message and the flag that it sets must be generalized. :: + + def check_order(option, opt_str, value, parser): + if parser.values.b: + raise OptionValueError("can't use %s after -b" % opt_str) + setattr(parser.values, option.dest, 1) + [...] + parser.add_option("-a", action="callback", callback=check_order, dest='a') + parser.add_option("-b", action="store_true", dest="b") + parser.add_option("-c", action="callback", callback=check_order, dest='c') + + +.. _optparse-callback-example-4: + +Callback example 4: check arbitrary condition +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Of course, you could put any condition in there---you're not limited to checking +the values of already-defined options. For example, if you have options that +should not be called when the moon is full, all you have to do is this:: + + def check_moon(option, opt_str, value, parser): + if is_moon_full(): + raise OptionValueError("%s option invalid when moon is full" + % opt_str) + setattr(parser.values, option.dest, 1) + [...] + parser.add_option("--foo", + action="callback", callback=check_moon, dest="foo") + +(The definition of ``is_moon_full()`` is left as an exercise for the reader.) + + +.. _optparse-callback-example-5: + +Callback example 5: fixed arguments +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Things get slightly more interesting when you define callback options that take +a fixed number of arguments. 
Specifying that a callback option takes arguments +is similar to defining a ``store`` or ``append`` option: if you define +:attr:`type`, then the option takes one argument that must be convertible to +that type; if you further define ``nargs``, then the option takes ``nargs`` +arguments. + +Here's an example that just emulates the standard ``store`` action:: + + def store_value(option, opt_str, value, parser): + setattr(parser.values, option.dest, value) + [...] + parser.add_option("--foo", + action="callback", callback=store_value, + type="int", nargs=3, dest="foo") + +Note that :mod:`optparse` takes care of consuming 3 arguments and converting +them to integers for you; all you have to do is store them. (Or whatever; +obviously you don't need a callback for this example.) + + +.. _optparse-callback-example-6: + +Callback example 6: variable arguments +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Things get hairy when you want an option to take a variable number of arguments. +For this case, you must write a callback, as :mod:`optparse` doesn't provide any +built-in capabilities for it. And you have to deal with certain intricacies of +conventional Unix command-line parsing that :mod:`optparse` normally handles for +you. In particular, callbacks should implement the conventional rules for bare +``"--"`` and ``"-"`` arguments: + +* either ``"--"`` or ``"-"`` can be option arguments + +* bare ``"--"`` (if not the argument to some option): halt command-line + processing and discard the ``"--"`` + +* bare ``"-"`` (if not the argument to some option): halt command-line + processing but keep the ``"-"`` (append it to ``parser.largs``) + +If you want an option that takes a variable number of arguments, there are +several subtle, tricky issues to worry about. The exact implementation you +choose will be based on which trade-offs you're willing to make for your +application (which is why :mod:`optparse` doesn't support this sort of thing +directly). 
+
+Nevertheless, here's a stab at a callback for an option with variable
+arguments::
+
+   def vararg_callback(option, opt_str, value, parser):
+       assert value is None
+       value = []
+       rargs = parser.rargs
+       while rargs:
+           arg = rargs[0]
+
+           # Stop if we hit an arg like "--foo", "-a", "-fx", "--file=f",
+           # etc.  Note that this also stops on "-3" or "-3.0", so if
+           # your option takes numeric values, you will need to handle
+           # this.
+           if ((arg[:2] == "--" and len(arg) > 2) or
+               (arg[:1] == "-" and len(arg) > 1 and arg[1] != "-")):
+               break
+           else:
+               value.append(arg)
+               del rargs[0]
+
+       setattr(parser.values, option.dest, value)
+
+   [...]
+   # An explicit dest is needed: a callback option without a type does not
+   # get a destination derived from its option strings.
+   parser.add_option("-c", "--callback", dest="vararg_attr",
+                     action="callback", callback=vararg_callback)
+
+The main weakness with this particular implementation is that negative numbers
+in the arguments following ``"-c"`` will be interpreted as further options
+(probably causing an error), rather than as arguments to ``"-c"``.  Fixing this
+is left as an exercise for the reader.
+
+.. % $Id: callbacks.txt 415 2004-09-30 02:26:17Z greg $
+
+
+.. _optparse-extending-optparse:
+
+Extending :mod:`optparse`
+-------------------------
+
+Since the two major controlling factors in how :mod:`optparse` interprets
+command-line options are the action and type of each option, the most likely
+direction of extension is to add new actions and new types.
+
+
+.. _optparse-adding-new-types:
+
+Adding new types
+^^^^^^^^^^^^^^^^
+
+To add new types, you need to define your own subclass of :mod:`optparse`'s
+Option class.  This class has a couple of attributes that define
+:mod:`optparse`'s types: :attr:`TYPES` and :attr:`TYPE_CHECKER`.
+
+:attr:`TYPES` is a tuple of type names; in your subclass, simply define a new
+tuple :attr:`TYPES` that builds on the standard one.
+
+:attr:`TYPE_CHECKER` is a dictionary mapping type names to type-checking
+functions.
A type-checking function has the following signature:: + + def check_mytype(option, opt, value) + +where ``option`` is an :class:`Option` instance, ``opt`` is an option string +(e.g., ``"-f"``), and ``value`` is the string from the command line that must be +checked and converted to your desired type. ``check_mytype()`` should return an +object of the hypothetical type ``mytype``. The value returned by a +type-checking function will wind up in the OptionValues instance returned by +:meth:`OptionParser.parse_args`, or be passed to a callback as the ``value`` +parameter. + +Your type-checking function should raise OptionValueError if it encounters any +problems. OptionValueError takes a single string argument, which is passed +as-is to OptionParser's :meth:`error` method, which in turn prepends the program +name and the string ``"error:"`` and prints everything to stderr before +terminating the process. + +Here's a silly example that demonstrates adding a ``complex`` option type to +parse Python-style complex numbers on the command line. (This is even sillier +than it used to be, because :mod:`optparse` 1.3 added built-in support for +complex numbers, but never mind.) + +First, the necessary imports:: + + from copy import copy + from optparse import Option, OptionValueError + +You need to define your type-checker first, since it's referred to later (in the +:attr:`TYPE_CHECKER` class attribute of your Option subclass):: + + def check_complex(option, opt, value): + try: + return complex(value) + except ValueError: + raise OptionValueError( + "option %s: invalid complex value: %r" % (opt, value)) + +Finally, the Option subclass:: + + class MyOption (Option): + TYPES = Option.TYPES + ("complex",) + TYPE_CHECKER = copy(Option.TYPE_CHECKER) + TYPE_CHECKER["complex"] = check_complex + +(If we didn't make a :func:`copy` of :attr:`Option.TYPE_CHECKER`, we would end +up modifying the :attr:`TYPE_CHECKER` attribute of :mod:`optparse`'s Option +class. 
This being Python, nothing stops you from doing that except good manners +and common sense.) + +That's it! Now you can write a script that uses the new option type just like +any other :mod:`optparse`\ -based script, except you have to instruct your +OptionParser to use MyOption instead of Option:: + + parser = OptionParser(option_class=MyOption) + parser.add_option("-c", type="complex") + +Alternately, you can build your own option list and pass it to OptionParser; if +you don't use :meth:`add_option` in the above way, you don't need to tell +OptionParser which option class to use:: + + option_list = [MyOption("-c", action="store", type="complex", dest="c")] + parser = OptionParser(option_list=option_list) + + +.. _optparse-adding-new-actions: + +Adding new actions +^^^^^^^^^^^^^^^^^^ + +Adding new actions is a bit trickier, because you have to understand that +:mod:`optparse` has a couple of classifications for actions: + +"store" actions + actions that result in :mod:`optparse` storing a value to an attribute of the + current OptionValues instance; these options require a :attr:`dest` attribute to + be supplied to the Option constructor + +"typed" actions + actions that take a value from the command line and expect it to be of a certain + type; or rather, a string that can be converted to a certain type. These + options require a :attr:`type` attribute to the Option constructor. + +These are overlapping sets: some default "store" actions are ``store``, +``store_const``, ``append``, and ``count``, while the default "typed" actions +are ``store``, ``append``, and ``callback``. 
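The overlap between the two classifications can be checked directly against the
class attributes of Option.  The following is only a quick sketch: it assumes an
interpreter whose :mod:`optparse` exposes ``STORE_ACTIONS`` and
``TYPED_ACTIONS`` as sequences of strings, and the variable names are
illustrative rather than part of the module.

```python
from optparse import Option

# Actions that store a value on the OptionValues instance.
store_actions = set(Option.STORE_ACTIONS)
# Actions that take a typed value from the command line.
typed_actions = set(Option.TYPED_ACTIONS)

# Actions that belong to both classifications.
both = sorted(store_actions & typed_actions)
# Actions that are typed but are not "store" actions.
typed_only = sorted(typed_actions - store_actions)

print(both)
print(typed_only)
```

A new action that both takes a value and stores it would be expected to appear
in ``both`` once it has been added to the two attributes in a subclass.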
+
+When you add an action, you need to categorize it by listing it in at least one
+of the following class attributes of Option (all are tuples of strings):
+
+:attr:`ACTIONS`
+   all actions must be listed in ACTIONS
+
+:attr:`STORE_ACTIONS`
+   "store" actions are additionally listed here
+
+:attr:`TYPED_ACTIONS`
+   "typed" actions are additionally listed here
+
+``ALWAYS_TYPED_ACTIONS``
+   actions that always take a type (i.e. whose options always take a value) are
+   additionally listed here.  The only effect of this is that :mod:`optparse`
+   assigns the default type, ``string``, to options with no explicit type whose
+   action is listed in ``ALWAYS_TYPED_ACTIONS``.
+
+In order to actually implement your new action, you must override Option's
+:meth:`take_action` method and add a case that recognizes your action.
+
+For example, let's add an ``extend`` action.  This is similar to the standard
+``append`` action, but instead of taking a single value from the command-line
+and appending it to an existing list, ``extend`` will take multiple values in a
+single comma-delimited string, and extend an existing list with them.  That is,
+if ``"--names"`` is an ``extend`` option of type ``string``, the command line
+::
+
+   --names=foo,bar --names blah --names ding,dong
+
+would result in a list ::
+
+   ["foo", "bar", "blah", "ding", "dong"]
+
+Again we define a subclass of Option::
+
+   class MyOption (Option):
+
+       ACTIONS = Option.ACTIONS + ("extend",)
+       STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
+       TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)
+       ALWAYS_TYPED_ACTIONS = Option.ALWAYS_TYPED_ACTIONS + ("extend",)
+
+       def take_action(self, action, dest, opt, value, values, parser):
+           if action == "extend":
+               lvalue = value.split(",")
+               values.ensure_value(dest, []).extend(lvalue)
+           else:
+               Option.take_action(
+                   self, action, dest, opt, value, values, parser)
+
+Features of note:
+
+* ``extend`` both expects a value on the command-line and stores that value
+  somewhere, so it goes in both :attr:`STORE_ACTIONS` and :attr:`TYPED_ACTIONS`
+
+* to ensure that :mod:`optparse` assigns the default type of ``string`` to
+  ``extend`` actions, we put the ``extend`` action in ``ALWAYS_TYPED_ACTIONS`` as
+  well
+
+* :meth:`MyOption.take_action` implements just this one new action, and passes
+  control back to :meth:`Option.take_action` for the standard :mod:`optparse`
+  actions
+
+* ``values`` is an instance of the ``optparse.Values`` class, which
+  provides the very useful :meth:`ensure_value` method.  :meth:`ensure_value` is
+  essentially :func:`getattr` with a safety valve; it is called as ::
+
+     values.ensure_value(attr, value)
+
+  If the ``attr`` attribute of ``values`` doesn't exist or is None, then
+  :meth:`ensure_value` first sets it to ``value``, and then returns ``value``.
+  This is very
+  handy for actions like ``extend``, ``append``, and ``count``, all of which
+  accumulate data in a variable and expect that variable to be of a certain type
+  (a list for the first two, an integer for the last).
Using + :meth:`ensure_value` means that scripts using your action don't have to worry + about setting a default value for the option destinations in question; they can + just leave the default as None and :meth:`ensure_value` will take care of + getting it right when it's needed. + +.. % $Id: extending.txt 517 2006-06-10 16:18:11Z gward $ + diff --git a/Doc/library/os.path.rst b/Doc/library/os.path.rst new file mode 100644 index 0000000..291d155 --- /dev/null +++ b/Doc/library/os.path.rst @@ -0,0 +1,317 @@ + +:mod:`os.path` --- Common pathname manipulations +================================================ + +.. module:: os.path + :synopsis: Operations on pathnames. + + +.. index:: single: path; operations + +This module implements some useful functions on pathnames. To read or +write files see :func:`open`, and for accessing the filesystem see the +:mod:`os` module. + +.. warning:: + + On Windows, many of these functions do not properly support UNC pathnames. + :func:`splitunc` and :func:`ismount` do handle them correctly. + + +.. function:: abspath(path) + + Return a normalized absolutized version of the pathname *path*. On most + platforms, this is equivalent to ``normpath(join(os.getcwd(), path))``. + + .. versionadded:: 1.5.2 + + +.. function:: basename(path) + + Return the base name of pathname *path*. This is the second half of the pair + returned by ``split(path)``. Note that the result of this function is different + from the Unix :program:`basename` program; where :program:`basename` for + ``'/foo/bar/'`` returns ``'bar'``, the :func:`basename` function returns an + empty string (``''``). + + +.. function:: commonprefix(list) + + Return the longest path prefix (taken character-by-character) that is a prefix + of all paths in *list*. If *list* is empty, return the empty string (``''``). + Note that this may return invalid paths because it works a character at a time. + + +.. function:: dirname(path) + + Return the directory name of pathname *path*. 
This is the first half of the + pair returned by ``split(path)``. + + +.. function:: exists(path) + + Return ``True`` if *path* refers to an existing path. Returns ``False`` for + broken symbolic links. On some platforms, this function may return ``False`` if + permission is not granted to execute :func:`os.stat` on the requested file, even + if the *path* physically exists. + + +.. function:: lexists(path) + + Return ``True`` if *path* refers to an existing path. Returns ``True`` for + broken symbolic links. Equivalent to :func:`exists` on platforms lacking + :func:`os.lstat`. + + .. versionadded:: 2.4 + + +.. function:: expanduser(path) + + On Unix and Windows, return the argument with an initial component of ``~`` or + ``~user`` replaced by that *user*'s home directory. + + .. index:: module: pwd + + On Unix, an initial ``~`` is replaced by the environment variable :envvar:`HOME` + if it is set; otherwise the current user's home directory is looked up in the + password directory through the built-in module :mod:`pwd`. An initial ``~user`` + is looked up directly in the password directory. + + On Windows, :envvar:`HOME` and :envvar:`USERPROFILE` will be used if set, + otherwise a combination of :envvar:`HOMEPATH` and :envvar:`HOMEDRIVE` will be + used. An initial ``~user`` is handled by stripping the last directory component + from the created user path derived above. + + If the expansion fails or if the path does not begin with a tilde, the path is + returned unchanged. + + +.. function:: expandvars(path) + + Return the argument with environment variables expanded. Substrings of the form + ``$name`` or ``${name}`` are replaced by the value of environment variable + *name*. Malformed variable names and references to non-existing variables are + left unchanged. + + On Windows, ``%name%`` expansions are supported in addition to ``$name`` and + ``${name}``. + + +.. function:: getatime(path) + + Return the time of last access of *path*. 
The return value is a number giving + the number of seconds since the epoch (see the :mod:`time` module). Raise + :exc:`os.error` if the file does not exist or is inaccessible. + + .. versionadded:: 1.5.2 + + .. versionchanged:: 2.3 + If :func:`os.stat_float_times` returns True, the result is a floating point + number. + + +.. function:: getmtime(path) + + Return the time of last modification of *path*. The return value is a number + giving the number of seconds since the epoch (see the :mod:`time` module). + Raise :exc:`os.error` if the file does not exist or is inaccessible. + + .. versionadded:: 1.5.2 + + .. versionchanged:: 2.3 + If :func:`os.stat_float_times` returns True, the result is a floating point + number. + + +.. function:: getctime(path) + + Return the system's ctime which, on some systems (like Unix) is the time of the + last change, and, on others (like Windows), is the creation time for *path*. + The return value is a number giving the number of seconds since the epoch (see + the :mod:`time` module). Raise :exc:`os.error` if the file does not exist or + is inaccessible. + + .. versionadded:: 2.3 + + +.. function:: getsize(path) + + Return the size, in bytes, of *path*. Raise :exc:`os.error` if the file does + not exist or is inaccessible. + + .. versionadded:: 1.5.2 + + +.. function:: isabs(path) + + Return ``True`` if *path* is an absolute pathname (begins with a slash). + + +.. function:: isfile(path) + + Return ``True`` if *path* is an existing regular file. This follows symbolic + links, so both :func:`islink` and :func:`isfile` can be true for the same path. + + +.. function:: isdir(path) + + Return ``True`` if *path* is an existing directory. This follows symbolic + links, so both :func:`islink` and :func:`isdir` can be true for the same path. + + +.. function:: islink(path) + + Return ``True`` if *path* refers to a directory entry that is a symbolic link. + Always ``False`` if symbolic links are not supported. + + +.. 
function:: ismount(path) + + Return ``True`` if pathname *path* is a :dfn:`mount point`: a point in a file + system where a different file system has been mounted. The function checks + whether *path*'s parent, :file:`path/..`, is on a different device than *path*, + or whether :file:`path/..` and *path* point to the same i-node on the same + device --- this should detect mount points for all Unix and POSIX variants. + + +.. function:: join(path1[, path2[, ...]]) + + Join one or more path components intelligently. If any component is an absolute + path, all previous components (on Windows, including the previous drive letter, + if there was one) are thrown away, and joining continues. The return value is + the concatenation of *path1*, and optionally *path2*, etc., with exactly one + directory separator (``os.sep``) inserted between components, unless *path2* is + empty. Note that on Windows, since there is a current directory for each drive, + ``os.path.join("c:", "foo")`` represents a path relative to the current + directory on drive :file:`C:` (:file:`c:foo`), not :file:`c:\\foo`. + + +.. function:: normcase(path) + + Normalize the case of a pathname. On Unix, this returns the path unchanged; on + case-insensitive filesystems, it converts the path to lowercase. On Windows, it + also converts forward slashes to backward slashes. + + +.. function:: normpath(path) + + Normalize a pathname. This collapses redundant separators and up-level + references so that ``A//B``, ``A/./B`` and ``A/foo/../B`` all become ``A/B``. + It does not normalize the case (use :func:`normcase` for that). On Windows, it + converts forward slashes to backward slashes. It should be understood that this + may change the meaning of the path if it contains symbolic links! + + +.. function:: realpath(path) + + Return the canonical path of the specified filename, eliminating any symbolic + links encountered in the path (if they are supported by the operating system). + + .. 
versionadded:: 2.2 + + +.. function:: relpath(path[, start]) + + Return a relative filepath to *path* either from the current directory or from + an optional *start* point. + + *start* defaults to :attr:`os.curdir`. Availability: Windows, Unix. + + .. versionadded:: 2.6 + + +.. function:: samefile(path1, path2) + + Return ``True`` if both pathname arguments refer to the same file or directory + (as indicated by device number and i-node number). Raise an exception if a + :func:`os.stat` call on either pathname fails. Availability: Macintosh, Unix. + + +.. function:: sameopenfile(fp1, fp2) + + Return ``True`` if the file descriptors *fp1* and *fp2* refer to the same file. + Availability: Macintosh, Unix. + + +.. function:: samestat(stat1, stat2) + + Return ``True`` if the stat tuples *stat1* and *stat2* refer to the same file. + These structures may have been returned by :func:`fstat`, :func:`lstat`, or + :func:`stat`. This function implements the underlying comparison used by + :func:`samefile` and :func:`sameopenfile`. Availability: Macintosh, Unix. + + +.. function:: split(path) + + Split the pathname *path* into a pair, ``(head, tail)`` where *tail* is the last + pathname component and *head* is everything leading up to that. The *tail* part + will never contain a slash; if *path* ends in a slash, *tail* will be empty. If + there is no slash in *path*, *head* will be empty. If *path* is empty, both + *head* and *tail* are empty. Trailing slashes are stripped from *head* unless + it is the root (one or more slashes only). In nearly all cases, ``join(head, + tail)`` equals *path* (the only exception being when there were multiple slashes + separating *head* from *tail*). + + +.. function:: splitdrive(path) + + Split the pathname *path* into a pair ``(drive, tail)`` where *drive* is either + a drive specification or the empty string. On systems which do not use drive + specifications, *drive* will always be the empty string. 
In all cases, ``drive + tail``
+   will be the same as *path*.
+
+   .. versionadded:: 1.3
+
+
+.. function:: splitext(path)
+
+   Split the pathname *path* into a pair ``(root, ext)`` such that ``root + ext ==
+   path``, and *ext* is empty or begins with a period and contains at most one
+   period.  Leading periods on the basename are ignored; ``splitext('.cshrc')``
+   returns ``('.cshrc', '')``.
+
+   .. versionchanged:: 2.6
+      Earlier versions could produce an empty root when the only period was the
+      first character.
+
+
+.. function:: splitunc(path)
+
+   Split the pathname *path* into a pair ``(unc, rest)`` so that *unc* is the UNC
+   mount point (such as ``r'\\host\mount'``), if present, and *rest* the rest of
+   the path (such as ``r'\path\file.ext'``).  For paths containing drive letters,
+   *unc* will always be the empty string.  Availability: Windows.
+
+
+.. function:: walk(path, visit, arg)
+
+   Calls the function *visit* with arguments ``(arg, dirname, names)`` for each
+   directory in the directory tree rooted at *path* (including *path* itself, if it
+   is a directory).  The argument *dirname* specifies the visited directory, the
+   argument *names* lists the files in the directory (gotten from
+   ``os.listdir(dirname)``).  The *visit* function may modify *names* to influence
+   the set of directories visited below *dirname*, e.g. to avoid visiting certain
+   parts of the tree.  (The object referred to by *names* must be modified in
+   place, using :keyword:`del` or slice assignment.)
+
+   .. note::
+
+      Symbolic links to directories are not treated as subdirectories, so
+      :func:`walk` will not visit them.  To visit linked directories you must
+      identify them with ``os.path.islink(file)`` and ``os.path.isdir(file)``, and
+      invoke :func:`walk` as necessary.
+
+   .. note::
+
+      The newer :func:`os.walk` generator supplies similar functionality and can be
+      easier to use.
+
+
+.. data:: supports_unicode_filenames
+
+   True if arbitrary Unicode strings can be used as file names (within limitations
+   imposed by the file system), and if :func:`os.listdir` returns Unicode strings
+   for a Unicode argument.
+
+   .. versionadded:: 2.3
+
diff --git a/Doc/library/os.rst b/Doc/library/os.rst
new file mode 100644
index 0000000..5d057f1
--- /dev/null
+++ b/Doc/library/os.rst
@@ -0,0 +1,2036 @@
+
+:mod:`os` --- Miscellaneous operating system interfaces
+=======================================================
+
+.. module:: os
+   :synopsis: Miscellaneous operating system interfaces.
+
+
+This module provides a more portable way of using operating system dependent
+functionality than importing an operating system dependent built-in module like
+:mod:`posix` or :mod:`nt`. (If you just want to read or write a file see
+:func:`open`, and if you want to manipulate paths, see the :mod:`os.path`
+module.)
+
+This module searches for an operating system dependent built-in module like
+:mod:`mac` or :mod:`posix` and exports the same functions and data as found
+there.  The design of all Python's built-in operating system dependent modules
+is such that as long as the same functionality is available, it uses the same
+interface; for example, the function ``os.stat(path)`` returns stat information
+about *path* in the same format (which happens to have originated with the POSIX
+interface).
+
+Extensions peculiar to a particular operating system are also available through
+the :mod:`os` module, but using them is of course a threat to portability!
+
+Note that after the first time :mod:`os` is imported, there is *no* performance
+penalty in using functions from :mod:`os` instead of directly from the operating
+system dependent built-in module, so there should be *no* reason not to use
+:mod:`os`!
+
+The :mod:`os` module contains many functions and data values. The items below
+and in the following sub-sections are all available directly from the :mod:`os`
+module.
+
+.. 
% Frank Stajano <fstajano@uk.research.att.com> complained that it +.. % wasn't clear that the entries described in the subsections were all +.. % available at the module level (most uses of subsections are +.. % different); I think this is only a problem for the HTML version, +.. % where the relationship may not be as clear. +.. % + + +.. exception:: error + + .. index:: module: errno + + This exception is raised when a function returns a system-related error (not for + illegal argument types or other incidental errors). This is also known as the + built-in exception :exc:`OSError`. The accompanying value is a pair containing + the numeric error code from :cdata:`errno` and the corresponding string, as + would be printed by the C function :cfunc:`perror`. See the module + :mod:`errno`, which contains names for the error codes defined by the underlying + operating system. + + When exceptions are classes, this exception carries two attributes, + :attr:`errno` and :attr:`strerror`. The first holds the value of the C + :cdata:`errno` variable, and the latter holds the corresponding error message + from :cfunc:`strerror`. For exceptions that involve a file system path (such as + :func:`chdir` or :func:`unlink`), the exception instance will contain a third + attribute, :attr:`filename`, which is the file name passed to the function. + + +.. data:: name + + The name of the operating system dependent module imported. The following names + have currently been registered: ``'posix'``, ``'nt'``, ``'mac'``, ``'os2'``, + ``'ce'``, ``'java'``, ``'riscos'``. + + +.. data:: path + + The corresponding operating system dependent standard module for pathname + operations, such as :mod:`posixpath` or :mod:`macpath`. Thus, given the proper + imports, ``os.path.split(file)`` is equivalent to but more portable than + ``posixpath.split(file)``. Note that this is also an importable module: it may + be imported directly as :mod:`os.path`. + + +.. 
_os-procinfo: + +Process Parameters +------------------ + +These functions and data items provide information and operate on the current +process and user. + + +.. data:: environ + + A mapping object representing the string environment. For example, + ``environ['HOME']`` is the pathname of your home directory (on some platforms), + and is equivalent to ``getenv("HOME")`` in C. + + This mapping is captured the first time the :mod:`os` module is imported, + typically during Python startup as part of processing :file:`site.py`. Changes + to the environment made after this time are not reflected in ``os.environ``, + except for changes made by modifying ``os.environ`` directly. + + If the platform supports the :func:`putenv` function, this mapping may be used + to modify the environment as well as query the environment. :func:`putenv` will + be called automatically when the mapping is modified. + + .. note:: + + Calling :func:`putenv` directly does not change ``os.environ``, so it's better + to modify ``os.environ``. + + .. note:: + + On some platforms, including FreeBSD and Mac OS X, setting ``environ`` may cause + memory leaks. Refer to the system documentation for :cfunc:`putenv`. + + If :func:`putenv` is not provided, a modified copy of this mapping may be + passed to the appropriate process-creation functions to cause child processes + to use a modified environment. + + If the platform supports the :func:`unsetenv` function, you can delete items in + this mapping to unset environment variables. :func:`unsetenv` will be called + automatically when an item is deleted from ``os.environ``. + + +.. function:: chdir(path) + fchdir(fd) + getcwd() + :noindex: + + These functions are described in :ref:`os-file-dir`. + + +.. function:: ctermid() + + Return the filename corresponding to the controlling terminal of the process. + Availability: Unix. + + +.. function:: getegid() + + Return the effective group id of the current process. 
This corresponds to the + 'set id' bit on the file being executed in the current process. Availability: + Unix. + + +.. function:: geteuid() + + .. index:: single: user; effective id + + Return the current process' effective user id. Availability: Unix. + + +.. function:: getgid() + + .. index:: single: process; group + + Return the real group id of the current process. Availability: Unix. + + +.. function:: getgroups() + + Return list of supplemental group ids associated with the current process. + Availability: Unix. + + +.. function:: getlogin() + + Return the name of the user logged in on the controlling terminal of the + process. For most purposes, it is more useful to use the environment variable + :envvar:`LOGNAME` to find out who the user is, or + ``pwd.getpwuid(os.getuid())[0]`` to get the login name of the currently + effective user ID. Availability: Unix. + + +.. function:: getpgid(pid) + + Return the process group id of the process with process id *pid*. If *pid* is 0, + the process group id of the current process is returned. Availability: Unix. + + .. versionadded:: 2.3 + + +.. function:: getpgrp() + + .. index:: single: process; group + + Return the id of the current process group. Availability: Unix. + + +.. function:: getpid() + + .. index:: single: process; id + + Return the current process id. Availability: Unix, Windows. + + +.. function:: getppid() + + .. index:: single: process; id of parent + + Return the parent's process id. Availability: Unix. + + +.. function:: getuid() + + .. index:: single: user; id + + Return the current process' user id. Availability: Unix. + + +.. function:: getenv(varname[, value]) + + Return the value of the environment variable *varname* if it exists, or *value* + if it doesn't. *value* defaults to ``None``. Availability: most flavors of + Unix, Windows. + + +.. function:: putenv(varname, value) + + .. 
index:: single: environment variables; setting
+
+   Set the environment variable named *varname* to the string *value*.  Such
+   changes to the environment affect subprocesses started with :func:`os.system`,
+   :func:`popen` or :func:`fork` and :func:`execv`. Availability: most flavors of
+   Unix, Windows.
+
+   .. note::
+
+      On some platforms, including FreeBSD and Mac OS X, setting ``environ`` may cause
+      memory leaks. Refer to the system documentation for :cfunc:`putenv`.
+
+   When :func:`putenv` is supported, assignments to items in ``os.environ`` are
+   automatically translated into corresponding calls to :func:`putenv`; however,
+   calls to :func:`putenv` don't update ``os.environ``, so it is actually
+   preferable to assign to items of ``os.environ``.
+
+
+.. function:: setegid(egid)
+
+   Set the current process's effective group id. Availability: Unix.
+
+
+.. function:: seteuid(euid)
+
+   Set the current process's effective user id. Availability: Unix.
+
+
+.. function:: setgid(gid)
+
+   Set the current process's group id. Availability: Unix.
+
+
+.. function:: setgroups(groups)
+
+   Set the list of supplemental group ids associated with the current process to
+   *groups*. *groups* must be a sequence, and each element must be an integer
+   identifying a group. This operation is typically available only to the
+   superuser. Availability: Unix.
+
+   .. versionadded:: 2.2
+
+
+.. function:: setpgrp()
+
+   Calls the system call :cfunc:`setpgrp` or :cfunc:`setpgrp(0, 0)` depending on
+   which version is implemented (if any).  See the Unix manual for the semantics.
+   Availability: Unix.
+
+
+.. function:: setpgid(pid, pgrp)
+
+   Calls the system call :cfunc:`setpgid` to set the process group id of the
+   process with id *pid* to the process group with id *pgrp*.  See the Unix manual
+   for the semantics. Availability: Unix.
+
+
+.. function:: setreuid(ruid, euid)
+
+   Set the current process's real and effective user ids. Availability: Unix.
+
+
+.. function:: setregid(rgid, egid)
+
+   Set the current process's real and effective group ids. Availability: Unix.
+
+
+.. function:: getsid(pid)
+
+   Calls the system call :cfunc:`getsid`.  See the Unix manual for the semantics.
+   Availability: Unix.
+
+   .. versionadded:: 2.4
+
+
+.. function:: setsid()
+
+   Calls the system call :cfunc:`setsid`.  See the Unix manual for the semantics.
+   Availability: Unix.
+
+
+.. function:: setuid(uid)
+
+   .. index:: single: user; id, setting
+
+   Set the current process's user id. Availability: Unix.
+
+.. % placed in this section since it relates to errno.... a little weak
+
+
+.. function:: strerror(code)
+
+   Return the error message corresponding to the error code in *code*.
+   Availability: Unix, Windows.
+
+
+.. function:: umask(mask)
+
+   Set the current numeric umask and return the previous umask. Availability:
+   Unix, Windows.
+
+
+.. function:: uname()
+
+   .. index::
+      single: gethostname() (in module socket)
+      single: gethostbyaddr() (in module socket)
+
+   Return a 5-tuple containing information identifying the current operating
+   system.  The tuple contains 5 strings: ``(sysname, nodename, release, version,
+   machine)``.  Some systems truncate the nodename to 8 characters or to the
+   leading component; a better way to get the hostname is
+   :func:`socket.gethostname`  or even
+   ``socket.gethostbyaddr(socket.gethostname())``. Availability: recent flavors of
+   Unix.
+
+
+.. function:: unsetenv(varname)
+
+   .. index:: single: environment variables; deleting
+
+   Unset (delete) the environment variable named *varname*. Such changes to the
+   environment affect subprocesses started with :func:`os.system`, :func:`popen` or
+   :func:`fork` and :func:`execv`. Availability: most flavors of Unix, Windows.
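As a rough illustration of how such changes reach child processes, the sketch
below sets and then deletes a variable through ``os.environ`` and inspects it
from a child interpreter.  It assumes that the :mod:`subprocess` module and
``sys.executable`` are usable on the platform; the variable name ``DEMO_FLAG``
and the helper ``child_sees`` are made up for the example.

```python
import os
import subprocess
import sys

def child_sees(name):
    """Return the value of *name* as seen by a child interpreter ('' if unset)."""
    code = "import os; print(os.environ.get(%r, ''))" % name
    out = subprocess.check_output([sys.executable, "-c", code])
    return out.decode().strip()

# Assigning through os.environ also updates the process environment
# (via putenv() where the platform supports it).
os.environ["DEMO_FLAG"] = "1"
seen_after_set = child_sees("DEMO_FLAG")

# Deleting the item removes the variable again
# (via unsetenv() where the platform supports it).
del os.environ["DEMO_FLAG"]
seen_after_delete = child_sees("DEMO_FLAG")
```

Going through ``os.environ`` rather than calling the lower-level functions
keeps the mapping and the real environment in step.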
+ + When :func:`unsetenv` is supported, deletion of items in ``os.environ`` is + automatically translated into a corresponding call to :func:`unsetenv`; however, + calls to :func:`unsetenv` don't update ``os.environ``, so it is actually + preferable to delete items of ``os.environ``. + + +.. _os-newstreams: + +File Object Creation +-------------------- + +These functions create new file objects. (See also :func:`open`.) + + +.. function:: fdopen(fd[, mode[, bufsize]]) + + .. index:: single: I/O control; buffering + + Return an open file object connected to the file descriptor *fd*. The *mode* + and *bufsize* arguments have the same meaning as the corresponding arguments to + the built-in :func:`open` function. Availability: Macintosh, Unix, Windows. + + .. versionchanged:: 2.3 + When specified, the *mode* argument must now start with one of the letters + ``'r'``, ``'w'``, or ``'a'``, otherwise a :exc:`ValueError` is raised. + + .. versionchanged:: 2.5 + On Unix, when the *mode* argument starts with ``'a'``, the *O_APPEND* flag is + set on the file descriptor (which the :cfunc:`fdopen` implementation already + does on most platforms). + + +.. function:: popen(command[, mode[, bufsize]]) + + Open a pipe to or from *command*. The return value is an open file object + connected to the pipe, which can be read or written depending on whether *mode* + is ``'r'`` (default) or ``'w'``. The *bufsize* argument has the same meaning as + the corresponding argument to the built-in :func:`open` function. The exit + status of the command (encoded in the format specified for :func:`wait`) is + available as the return value of the :meth:`close` method of the file object, + except that when the exit status is zero (termination without errors), ``None`` + is returned. Availability: Macintosh, Unix, Windows. + + .. deprecated:: 2.6 + This function is obsolete. Use the :mod:`subprocess` module. + + .. 
versionchanged:: 2.0 + This function worked unreliably under Windows in earlier versions of Python. + This was due to the use of the :cfunc:`_popen` function from the libraries + provided with Windows. Newer versions of Python do not use the broken + implementation from the Windows libraries. + + +.. function:: tmpfile() + + Return a new file object opened in update mode (``w+b``). The file has no + directory entries associated with it and will be automatically deleted once + there are no file descriptors for the file. Availability: Macintosh, Unix, + Windows. + + +.. _os-fd-ops: + +File Descriptor Operations +-------------------------- + +These functions operate on I/O streams referenced using file descriptors. + +File descriptors are small integers corresponding to a file that has been opened +by the current process. For example, standard input is usually file descriptor +0, standard output is 1, and standard error is 2. Further files opened by a +process will then be assigned 3, 4, 5, and so forth. The name "file descriptor" +is slightly deceptive; on Unix platforms, sockets and pipes are also referenced +by file descriptors. + + +.. function:: close(fd) + + Close file descriptor *fd*. Availability: Macintosh, Unix, Windows. + + .. note:: + + This function is intended for low-level I/O and must be applied to a file + descriptor as returned by :func:`open` or :func:`pipe`. To close a "file + object" returned by the built-in function :func:`open` or by :func:`popen` or + :func:`fdopen`, use its :meth:`close` method. + + +.. function:: dup(fd) + + Return a duplicate of file descriptor *fd*. Availability: Macintosh, Unix, + Windows. + + +.. function:: dup2(fd, fd2) + + Duplicate file descriptor *fd* to *fd2*, closing the latter first if necessary. + Availability: Macintosh, Unix, Windows. + + +.. function:: fdatasync(fd) + + Force write of file with file descriptor *fd* to disk. Does not force update of + metadata. Availability: Unix. + + +.. 
function:: fpathconf(fd, name) + + Return system configuration information relevant to an open file. *name* + specifies the configuration value to retrieve; it may be a string which is the + name of a defined system value; these names are specified in a number of + standards (POSIX.1, Unix 95, Unix 98, and others). Some platforms define + additional names as well. The names known to the host operating system are + given in the ``pathconf_names`` dictionary. For configuration variables not + included in that mapping, passing an integer for *name* is also accepted. + Availability: Macintosh, Unix. + + If *name* is a string and is not known, :exc:`ValueError` is raised. If a + specific value for *name* is not supported by the host system, even if it is + included in ``pathconf_names``, an :exc:`OSError` is raised with + :const:`errno.EINVAL` for the error number. + + +.. function:: fstat(fd) + + Return status for file descriptor *fd*, like :func:`stat`. Availability: + Macintosh, Unix, Windows. + + +.. function:: fstatvfs(fd) + + Return information about the filesystem containing the file associated with file + descriptor *fd*, like :func:`statvfs`. Availability: Unix. + + +.. function:: fsync(fd) + + Force write of file with file descriptor *fd* to disk. On Unix, this calls the + native :cfunc:`fsync` function; on Windows, the MS :cfunc:`_commit` function. + + If you're starting with a Python file object *f*, first do ``f.flush()``, and + then do ``os.fsync(f.fileno())``, to ensure that all internal buffers associated + with *f* are written to disk. Availability: Macintosh, Unix, and Windows + starting in 2.2.3. + + +.. function:: ftruncate(fd, length) + + Truncate the file corresponding to file descriptor *fd*, so that it is at most + *length* bytes in size. Availability: Macintosh, Unix. + + +.. function:: isatty(fd) + + Return ``True`` if the file descriptor *fd* is open and connected to a + tty(-like) device, else ``False``. Availability: Macintosh, Unix. + + +.. 
function:: lseek(fd, pos, how) + + Set the current position of file descriptor *fd* to position *pos*, modified by + *how*: ``0`` to set the position relative to the beginning of the file; ``1`` to + set it relative to the current position; ``2`` to set it relative to the end of + the file. Availability: Macintosh, Unix, Windows. + + +.. function:: open(file, flags[, mode]) + + Open the file *file* and set various flags according to *flags* and possibly its + mode according to *mode*. The default *mode* is ``0777`` (octal), and the + current umask value is first masked out. Return the file descriptor for the + newly opened file. Availability: Macintosh, Unix, Windows. + + For a description of the flag and mode values, see the C run-time documentation; + flag constants (like :const:`O_RDONLY` and :const:`O_WRONLY`) are defined in + this module too (see below). + + .. note:: + + This function is intended for low-level I/O. For normal usage, use the built-in + function :func:`open`, which returns a "file object" with :meth:`read` and + :meth:`write` methods (and many more). To wrap a file descriptor in a "file + object", use :func:`fdopen`. + + +.. function:: openpty() + + .. index:: module: pty + + Open a new pseudo-terminal pair. Return a pair of file descriptors ``(master, + slave)`` for the pty and the tty, respectively. For a (slightly) more portable + approach, use the :mod:`pty` module. Availability: Macintosh, Some flavors of + Unix. + + +.. function:: pipe() + + Create a pipe. Return a pair of file descriptors ``(r, w)`` usable for reading + and writing, respectively. Availability: Macintosh, Unix, Windows. + + +.. function:: read(fd, n) + + Read at most *n* bytes from file descriptor *fd*. Return a string containing the + bytes read. If the end of the file referred to by *fd* has been reached, an + empty string is returned. Availability: Macintosh, Unix, Windows. + + .. 
note:: + + This function is intended for low-level I/O and must be applied to a file + descriptor as returned by :func:`open` or :func:`pipe`. To read a "file object" + returned by the built-in function :func:`open` or by :func:`popen` or + :func:`fdopen`, or ``sys.stdin``, use its :meth:`read` or :meth:`readline` + methods. + + +.. function:: tcgetpgrp(fd) + + Return the process group associated with the terminal given by *fd* (an open + file descriptor as returned by :func:`open`). Availability: Macintosh, Unix. + + +.. function:: tcsetpgrp(fd, pg) + + Set the process group associated with the terminal given by *fd* (an open file + descriptor as returned by :func:`open`) to *pg*. Availability: Macintosh, Unix. + + +.. function:: ttyname(fd) + + Return a string which specifies the terminal device associated with + file-descriptor *fd*. If *fd* is not associated with a terminal device, an + exception is raised. Availability: Macintosh, Unix. + + +.. function:: write(fd, str) + + Write the string *str* to file descriptor *fd*. Return the number of bytes + actually written. Availability: Macintosh, Unix, Windows. + + .. note:: + + This function is intended for low-level I/O and must be applied to a file + descriptor as returned by :func:`open` or :func:`pipe`. To write a "file + object" returned by the built-in function :func:`open` or by :func:`popen` or + :func:`fdopen`, or ``sys.stdout`` or ``sys.stderr``, use its :meth:`write` + method. + +The following data items are available for use in constructing the *flags* +parameter to the :func:`open` function. Some items will not be available on all +platforms. For descriptions of their availability and use, consult +:manpage:`open(2)`. + + +.. data:: O_RDONLY + O_WRONLY + O_RDWR + O_APPEND + O_CREAT + O_EXCL + O_TRUNC + + Options for the *flag* argument to the :func:`open` function. These can be + bit-wise OR'd together. Availability: Macintosh, Unix, Windows. + + +.. 
data:: O_DSYNC + O_RSYNC + O_SYNC + O_NDELAY + O_NONBLOCK + O_NOCTTY + O_SHLOCK + O_EXLOCK + + More options for the *flag* argument to the :func:`open` function. Availability: + Macintosh, Unix. + + +.. data:: O_BINARY + + Option for the *flag* argument to the :func:`open` function. This can be + bit-wise OR'd together with those listed above. Availability: Windows. + + .. % XXX need to check on the availability of this one. + + +.. data:: O_NOINHERIT + O_SHORT_LIVED + O_TEMPORARY + O_RANDOM + O_SEQUENTIAL + O_TEXT + + Options for the *flag* argument to the :func:`open` function. These can be + bit-wise OR'd together. Availability: Windows. + + +.. data:: SEEK_SET + SEEK_CUR + SEEK_END + + Parameters to the :func:`lseek` function. Their values are 0, 1, and 2, + respectively. Availability: Windows, Macintosh, Unix. + + .. versionadded:: 2.5 + + +.. _os-file-dir: + +Files and Directories +--------------------- + + +.. function:: access(path, mode) + + Use the real uid/gid to test for access to *path*. Note that most operations + will use the effective uid/gid, therefore this routine can be used in a + suid/sgid environment to test if the invoking user has the specified access to + *path*. *mode* should be :const:`F_OK` to test the existence of *path*, or it + can be the inclusive OR of one or more of :const:`R_OK`, :const:`W_OK`, and + :const:`X_OK` to test permissions. Return :const:`True` if access is allowed, + :const:`False` if not. See the Unix man page :manpage:`access(2)` for more + information. Availability: Macintosh, Unix, Windows. + + .. note:: + + Using :func:`access` to check if a user is authorized to e.g. open a file before + actually doing so using :func:`open` creates a security hole, because the user + might exploit the short time interval between checking and opening the file to + manipulate it. + + .. 
note:: + + I/O operations may fail even when :func:`access` indicates that they would + succeed, particularly for operations on network filesystems which may have + permissions semantics beyond the usual POSIX permission-bit model. + + +.. data:: F_OK + + Value to pass as the *mode* parameter of :func:`access` to test the existence of + *path*. + + +.. data:: R_OK + + Value to include in the *mode* parameter of :func:`access` to test the + readability of *path*. + + +.. data:: W_OK + + Value to include in the *mode* parameter of :func:`access` to test the + writability of *path*. + + +.. data:: X_OK + + Value to include in the *mode* parameter of :func:`access` to determine if + *path* can be executed. + + +.. function:: chdir(path) + + .. index:: single: directory; changing + + Change the current working directory to *path*. Availability: Macintosh, Unix, + Windows. + + +.. function:: fchdir(fd) + + Change the current working directory to the directory represented by the file + descriptor *fd*. The descriptor must refer to an opened directory, not an open + file. Availability: Unix. + + .. versionadded:: 2.3 + + +.. function:: getcwd() + + Return a string representing the current working directory. Availability: + Macintosh, Unix, Windows. + + +.. function:: getcwdu() + + Return a Unicode object representing the current working directory. + Availability: Macintosh, Unix, Windows. + + .. versionadded:: 2.3 + + +.. function:: chflags(path, flags) + + Set the flags of *path* to the numeric *flags*. *flags* may take a combination + (bitwise OR) of the following values (as defined in the :mod:`stat` module): + + * ``UF_NODUMP`` + * ``UF_IMMUTABLE`` + * ``UF_APPEND`` + * ``UF_OPAQUE`` + * ``UF_NOUNLINK`` + * ``SF_ARCHIVED`` + * ``SF_IMMUTABLE`` + * ``SF_APPEND`` + * ``SF_NOUNLINK`` + * ``SF_SNAPSHOT`` + + Availability: Macintosh, Unix. + + .. versionadded:: 2.6 + + +.. function:: chroot(path) + + Change the root directory of the current process to *path*. 
Availability: + Macintosh, Unix. + + .. versionadded:: 2.2 + + +.. function:: chmod(path, mode) + + Change the mode of *path* to the numeric *mode*. *mode* may take one of the + following values (as defined in the :mod:`stat` module) or bitwise or-ed + combinations of them: + + * ``stat.S_ISUID`` + * ``stat.S_ISGID`` + * ``stat.S_ENFMT`` + * ``stat.S_ISVTX`` + * ``stat.S_IREAD`` + * ``stat.S_IWRITE`` + * ``stat.S_IEXEC`` + * ``stat.S_IRWXU`` + * ``stat.S_IRUSR`` + * ``stat.S_IWUSR`` + * ``stat.S_IXUSR`` + * ``stat.S_IRWXG`` + * ``stat.S_IRGRP`` + * ``stat.S_IWGRP`` + * ``stat.S_IXGRP`` + * ``stat.S_IRWXO`` + * ``stat.S_IROTH`` + * ``stat.S_IWOTH`` + * ``stat.S_IXOTH`` + + Availability: Macintosh, Unix, Windows. + + .. note:: + + Although Windows supports :func:`chmod`, you can only set the file's read-only + flag with it (via the ``stat.S_IWRITE`` and ``stat.S_IREAD`` + constants or a corresponding integer value). All other bits are + ignored. + + +.. function:: chown(path, uid, gid) + + Change the owner and group id of *path* to the numeric *uid* and *gid*. To leave + one of the ids unchanged, set it to -1. Availability: Macintosh, Unix. + + +.. function:: lchflags(path, flags) + + Set the flags of *path* to the numeric *flags*, like :func:`chflags`, but do not + follow symbolic links. Availability: Unix. + + .. versionadded:: 2.6 + + +.. function:: lchown(path, uid, gid) + + Change the owner and group id of *path* to the numeric *uid* and *gid*. This + function will not follow symbolic links. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. function:: link(src, dst) + + Create a hard link pointing to *src* named *dst*. Availability: Macintosh, Unix. + + +.. function:: listdir(path) + + Return a list containing the names of the entries in the directory. The list is + in arbitrary order. It does not include the special entries ``'.'`` and + ``'..'`` even if they are present in the directory. Availability: Macintosh, + Unix, Windows. + + .. 
versionchanged:: 2.3 + On Windows NT/2k/XP and Unix, if *path* is a Unicode object, the result will be + a list of Unicode objects. + + +.. function:: lstat(path) + + Like :func:`stat`, but do not follow symbolic links. Availability: Macintosh, + Unix. + + +.. function:: mkfifo(path[, mode]) + + Create a FIFO (a named pipe) named *path* with numeric mode *mode*. The default + *mode* is ``0666`` (octal). The current umask value is first masked out from + the mode. Availability: Macintosh, Unix. + + FIFOs are pipes that can be accessed like regular files. FIFOs exist until they + are deleted (for example with :func:`os.unlink`). Generally, FIFOs are used as + rendezvous between "client" and "server" type processes: the server opens the + FIFO for reading, and the client opens it for writing. Note that :func:`mkfifo` + doesn't open the FIFO --- it just creates the rendezvous point. + + +.. function:: mknod(filename[, mode=0600, device]) + + Create a filesystem node (file, device special file or named pipe) named + *filename*. *mode* specifies both the permissions to use and the type of node to + be created, being combined (bitwise OR) with one of ``stat.S_IFREG``, + ``stat.S_IFCHR``, ``stat.S_IFBLK``, + and ``stat.S_IFIFO`` (those constants are available in :mod:`stat`). + For ``stat.S_IFCHR`` and + ``stat.S_IFBLK``, *device* defines the newly created device special file (probably using + :func:`os.makedev`), otherwise it is ignored. + + .. versionadded:: 2.3 + + +.. function:: major(device) + + Extracts the device major number from a raw device number (usually the + :attr:`st_dev` or :attr:`st_rdev` field from :ctype:`stat`). + + .. versionadded:: 2.3 + + +.. function:: minor(device) + + Extracts the device minor number from a raw device number (usually the + :attr:`st_dev` or :attr:`st_rdev` field from :ctype:`stat`). + + .. versionadded:: 2.3 + + +.. function:: makedev(major, minor) + + Composes a raw device number from the major and minor device numbers. + + .. 
versionadded:: 2.3 + + +.. function:: mkdir(path[, mode]) + + Create a directory named *path* with numeric mode *mode*. The default *mode* is + ``0777`` (octal). On some systems, *mode* is ignored. Where it is used, the + current umask value is first masked out. Availability: Macintosh, Unix, Windows. + + +.. function:: makedirs(path[, mode]) + + .. index:: + single: directory; creating + single: UNC paths; and os.makedirs() + + Recursive directory creation function. Like :func:`mkdir`, but makes all + intermediate-level directories needed to contain the leaf directory. Throws an + :exc:`error` exception if the leaf directory already exists or cannot be + created. The default *mode* is ``0777`` (octal). On some systems, *mode* is + ignored. Where it is used, the current umask value is first masked out. + + .. note:: + + :func:`makedirs` will become confused if the path elements to create include + *os.pardir*. + + .. versionadded:: 1.5.2 + + .. versionchanged:: 2.3 + This function now handles UNC paths correctly. + + +.. function:: pathconf(path, name) + + Return system configuration information relevant to a named file. *name* + specifies the configuration value to retrieve; it may be a string which is the + name of a defined system value; these names are specified in a number of + standards (POSIX.1, Unix 95, Unix 98, and others). Some platforms define + additional names as well. The names known to the host operating system are + given in the ``pathconf_names`` dictionary. For configuration variables not + included in that mapping, passing an integer for *name* is also accepted. + Availability: Macintosh, Unix. + + If *name* is a string and is not known, :exc:`ValueError` is raised. If a + specific value for *name* is not supported by the host system, even if it is + included in ``pathconf_names``, an :exc:`OSError` is raised with + :const:`errno.EINVAL` for the error number. + + +.. 
data:: pathconf_names + + Dictionary mapping names accepted by :func:`pathconf` and :func:`fpathconf` to + the integer values defined for those names by the host operating system. This + can be used to determine the set of names known to the system. Availability: + Macintosh, Unix. + + +.. function:: readlink(path) + + Return a string representing the path to which the symbolic link points. The + result may be either an absolute or relative pathname; if it is relative, it may + be converted to an absolute pathname using ``os.path.join(os.path.dirname(path), + result)``. + + .. versionchanged:: 2.6 + If the *path* is a Unicode object the result will also be a Unicode object. + + Availability: Macintosh, Unix. + + +.. function:: remove(path) + + Remove the file *path*. If *path* is a directory, :exc:`OSError` is raised; see + :func:`rmdir` below to remove a directory. This is identical to the + :func:`unlink` function documented below. On Windows, attempting to remove a + file that is in use causes an exception to be raised; on Unix, the directory + entry is removed but the storage allocated to the file is not made available + until the original file is no longer in use. Availability: Macintosh, Unix, + Windows. + + +.. function:: removedirs(path) + + .. index:: single: directory; deleting + + Removes directories recursively. Works like :func:`rmdir` except that, if the + leaf directory is successfully removed, :func:`removedirs` tries to + successively remove every parent directory mentioned in *path* until an error + is raised (which is ignored, because it generally means that a parent directory + is not empty). For example, ``os.removedirs('foo/bar/baz')`` will first remove + the directory ``'foo/bar/baz'``, and then remove ``'foo/bar'`` and ``'foo'`` if + they are empty. Raises :exc:`OSError` if the leaf directory could not be + successfully removed. + + .. versionadded:: 1.5.2 + + +.. function:: rename(src, dst) + + Rename the file or directory *src* to *dst*. 
If *dst* is a directory, + :exc:`OSError` will be raised. On Unix, if *dst* exists and is a file, it will + be removed silently if the user has permission. The operation may fail on some + Unix flavors if *src* and *dst* are on different filesystems. If successful, + the renaming will be an atomic operation (this is a POSIX requirement). On + Windows, if *dst* already exists, :exc:`OSError` will be raised even if it is a + file; there may be no way to implement an atomic rename when *dst* names an + existing file. Availability: Macintosh, Unix, Windows. + + +.. function:: renames(old, new) + + Recursive directory or file renaming function. Works like :func:`rename`, except + creation of any intermediate directories needed to make the new pathname good is + attempted first. After the rename, directories corresponding to rightmost path + segments of the old name will be pruned away using :func:`removedirs`. + + .. versionadded:: 1.5.2 + + .. note:: + + This function can fail with the new directory structure made if you lack + permissions needed to remove the leaf directory or file. + + +.. function:: rmdir(path) + + Remove the directory *path*. Availability: Macintosh, Unix, Windows. + + +.. function:: stat(path) + + Perform a :cfunc:`stat` system call on the given path. 
The return value is an + object whose attributes correspond to the members of the :ctype:`stat` + structure, namely: :attr:`st_mode` (protection bits), :attr:`st_ino` (inode + number), :attr:`st_dev` (device), :attr:`st_nlink` (number of hard links), + :attr:`st_uid` (user ID of owner), :attr:`st_gid` (group ID of owner), + :attr:`st_size` (size of file, in bytes), :attr:`st_atime` (time of most recent + access), :attr:`st_mtime` (time of most recent content modification), + :attr:`st_ctime` (platform dependent; time of most recent metadata change on + Unix, or the time of creation on Windows):: + + >>> import os + >>> statinfo = os.stat('somefile.txt') + >>> statinfo + (33188, 422511L, 769L, 1, 1032, 100, 926L, 1105022698, 1105022732, 1105022732) + >>> statinfo.st_size + 926L + >>> + + .. versionchanged:: 2.3 + If :func:`stat_float_times` returns true, the time values are floats, measuring + seconds. Fractions of a second may be reported if the system supports that. On + Mac OS, the times are always floats. See :func:`stat_float_times` for further + discussion. + + On some Unix systems (such as Linux), the following attributes may also be + available: :attr:`st_blocks` (number of blocks allocated for file), + :attr:`st_blksize` (filesystem blocksize), :attr:`st_rdev` (type of device if an + inode device), :attr:`st_flags` (user defined flags for file). + + On other Unix systems (such as FreeBSD), the following attributes may be + available (but may be only filled out if root tries to use them): :attr:`st_gen` + (file generation number), :attr:`st_birthtime` (time of file creation). + + On Mac OS systems, the following attributes may also be available: + :attr:`st_rsize`, :attr:`st_creator`, :attr:`st_type`. + + On RISCOS systems, the following attributes are also available: :attr:`st_ftype` + (file type), :attr:`st_attrs` (attributes), :attr:`st_obtype` (object type). + + .. 
index:: module: stat + + For backward compatibility, the return value of :func:`stat` is also accessible + as a tuple of at least 10 integers giving the most important (and portable) + members of the :ctype:`stat` structure, in the order :attr:`st_mode`, + :attr:`st_ino`, :attr:`st_dev`, :attr:`st_nlink`, :attr:`st_uid`, + :attr:`st_gid`, :attr:`st_size`, :attr:`st_atime`, :attr:`st_mtime`, + :attr:`st_ctime`. More items may be added at the end by some implementations. + The standard module :mod:`stat` defines functions and constants that are useful + for extracting information from a :ctype:`stat` structure. (On Windows, some + items are filled with dummy values.) + + .. note:: + + The exact meaning and resolution of the :attr:`st_atime`, :attr:`st_mtime`, and + :attr:`st_ctime` members depends on the operating system and the file system. + For example, on Windows systems using the FAT or FAT32 file systems, + :attr:`st_mtime` has 2-second resolution, and :attr:`st_atime` has only 1-day + resolution. See your operating system documentation for details. + + Availability: Macintosh, Unix, Windows. + + .. versionchanged:: 2.2 + Added access to values as attributes of the returned object. + + .. versionchanged:: 2.5 + Added st_gen, st_birthtime. + + +.. function:: stat_float_times([newvalue]) + + Determine whether :class:`stat_result` represents time stamps as float objects. + If *newvalue* is ``True``, future calls to :func:`stat` return floats, if it is + ``False``, future calls return ints. If *newvalue* is omitted, return the + current setting. + + For compatibility with older Python versions, accessing :class:`stat_result` as + a tuple always returns integers. + + .. versionchanged:: 2.5 + Python now returns float values by default. Applications which do not work + correctly with floating point time stamps can use this function to restore the + old behaviour. + + The resolution of the timestamps (that is the smallest possible fraction) + depends on the system. 
Some systems only support second resolution; on these + systems, the fraction will always be zero. + + It is recommended that this setting is only changed at program startup time in + the *__main__* module; libraries should never change this setting. If an + application uses a library that works incorrectly if floating point time stamps + are processed, this application should turn the feature off until the library + has been corrected. + + +.. function:: statvfs(path) + + Perform a :cfunc:`statvfs` system call on the given path. The return value is + an object whose attributes describe the filesystem on the given path, and + correspond to the members of the :ctype:`statvfs` structure, namely: + :attr:`f_bsize`, :attr:`f_frsize`, :attr:`f_blocks`, :attr:`f_bfree`, + :attr:`f_bavail`, :attr:`f_files`, :attr:`f_ffree`, :attr:`f_favail`, + :attr:`f_flag`, :attr:`f_namemax`. Availability: Unix. + + .. index:: module: statvfs + + For backward compatibility, the return value is also accessible as a tuple whose + values correspond to the attributes, in the order given above. The standard + module :mod:`statvfs` defines constants that are useful for extracting + information from a :ctype:`statvfs` structure when accessing it as a sequence; + this remains useful when writing code that needs to work with versions of Python + that don't support accessing the fields as attributes. + + .. versionchanged:: 2.2 + Added access to values as attributes of the returned object. + + +.. function:: symlink(src, dst) + + Create a symbolic link pointing to *src* named *dst*. Availability: Unix. + + +.. function:: tempnam([dir[, prefix]]) + + Return a unique path name that is reasonable for creating a temporary file. + This will be an absolute path that names a potential directory entry in the + directory *dir* or a common location for temporary files if *dir* is omitted or + ``None``. If given and not ``None``, *prefix* is used to provide a short prefix + to the filename. 
Applications are responsible for properly creating and + managing files created using paths returned by :func:`tempnam`; no automatic + cleanup is provided. On Unix, the environment variable :envvar:`TMPDIR` + overrides *dir*, while on Windows the :envvar:`TMP` is used. The specific + behavior of this function depends on the C library implementation; some aspects + are underspecified in system documentation. + + .. warning:: + + Use of :func:`tempnam` is vulnerable to symlink attacks; consider using + :func:`tmpfile` (section :ref:`os-newstreams`) instead. + + Availability: Macintosh, Unix, Windows. + + +.. function:: tmpnam() + + Return a unique path name that is reasonable for creating a temporary file. + This will be an absolute path that names a potential directory entry in a common + location for temporary files. Applications are responsible for properly + creating and managing files created using paths returned by :func:`tmpnam`; no + automatic cleanup is provided. + + .. warning:: + + Use of :func:`tmpnam` is vulnerable to symlink attacks; consider using + :func:`tmpfile` (section :ref:`os-newstreams`) instead. + + Availability: Unix, Windows. This function probably shouldn't be used on + Windows, though: Microsoft's implementation of :func:`tmpnam` always creates a + name in the root directory of the current drive, and that's generally a poor + location for a temp file (depending on privileges, you may not even be able to + open a file using this name). + + +.. data:: TMP_MAX + + The maximum number of unique names that :func:`tmpnam` will generate before + reusing names. + + +.. function:: unlink(path) + + Remove the file *path*. This is the same function as :func:`remove`; the + :func:`unlink` name is its traditional Unix name. Availability: Macintosh, Unix, + Windows. + + +.. function:: utime(path, times) + + Set the access and modified times of the file specified by *path*. 
If *times* is + ``None``, then the file's access and modified times are set to the current time. + Otherwise, *times* must be a 2-tuple of numbers, of the form ``(atime, mtime)`` + which is used to set the access and modified times, respectively. Whether a + directory can be given for *path* depends on whether the operating system + implements directories as files (for example, Windows does not). Note that the + exact times you set here may not be returned by a subsequent :func:`stat` call, + depending on the resolution with which your operating system records access and + modification times; see :func:`stat`. + + .. versionchanged:: 2.0 + Added support for ``None`` for *times*. + + Availability: Macintosh, Unix, Windows. + + +.. function:: walk(top[, topdown=True [, onerror=None[, followlinks=False]]]) + + .. index:: + single: directory; walking + single: directory; traversal + + :func:`walk` generates the file names in a directory tree, by walking the tree + either top down or bottom up. For each directory in the tree rooted at directory + *top* (including *top* itself), it yields a 3-tuple ``(dirpath, dirnames, + filenames)``. + + *dirpath* is a string, the path to the directory. *dirnames* is a list of the + names of the subdirectories in *dirpath* (excluding ``'.'`` and ``'..'``). + *filenames* is a list of the names of the non-directory files in *dirpath*. + Note that the names in the lists contain no path components. To get a full path + (which begins with *top*) to a file or directory in *dirpath*, do + ``os.path.join(dirpath, name)``. + + If optional argument *topdown* is true or not specified, the triple for a + directory is generated before the triples for any of its subdirectories + (directories are generated top down). If *topdown* is false, the triple for a + directory is generated after the triples for all of its subdirectories + (directories are generated bottom up). 
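The difference between the two visiting orders can be seen on a tiny tree. The following is a minimal sketch for a current Python 3 interpreter; the directory names ``a`` and ``b`` are made up, and the tree is created in a temporary directory so that nothing outside it is touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Build top/a/b, a chain of two nested (empty) directories.
    os.makedirs(os.path.join(top, "a", "b"))

    # Default (topdown=True): a directory is yielded before its subdirectories.
    top_down = [os.path.relpath(dirpath, top)
                for dirpath, dirnames, filenames in os.walk(top)]

    # topdown=False: a directory is yielded after all of its subdirectories.
    bottom_up = [os.path.relpath(dirpath, top)
                 for dirpath, dirnames, filenames in os.walk(top, topdown=False)]
```

With a single chain of directories, the bottom-up order is exactly the reverse of the top-down order; with several sibling directories the relative order of the siblings is arbitrary in both modes.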
+ + When *topdown* is true, the caller can modify the *dirnames* list in-place + (perhaps using :keyword:`del` or slice assignment), and :func:`walk` will only + recurse into the subdirectories whose names remain in *dirnames*; this can be + used to prune the search, impose a specific order of visiting, or even to inform + :func:`walk` about directories the caller creates or renames before it resumes + :func:`walk` again. Modifying *dirnames* when *topdown* is false is + ineffective, because in bottom-up mode the directories in *dirnames* are + generated before *dirpath* itself is generated. + + By default errors from the ``os.listdir()`` call are ignored. If optional + argument *onerror* is specified, it should be a function; it will be called with + one argument, an :exc:`OSError` instance. It can report the error to continue + with the walk, or raise the exception to abort the walk. Note that the filename + is available as the ``filename`` attribute of the exception object. + + By default, :func:`walk` will not walk down into symbolic links that resolve to + directories. Set *followlinks* to True to visit directories pointed to by + symlinks, on systems that support them. + + .. versionadded:: 2.6 + The *followlinks* parameter. + + .. note:: + + Be aware that setting *followlinks* to true can lead to infinite recursion if a + link points to a parent directory of itself. :func:`walk` does not keep track of + the directories it visited already. + + .. note:: + + If you pass a relative pathname, don't change the current working directory + between resumptions of :func:`walk`. :func:`walk` never changes the current + directory, and assumes that its caller doesn't either. 
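Pruning through *dirnames* and error reporting through *onerror* can be combined as in the following sketch, written for a current Python 3 interpreter. The directory names (``keep``, ``skip``, ``inner``, ``missing``) and the callback name ``remember`` are hypothetical and exist only for this illustration.

```python
import os
import tempfile

errors = []


def remember(exc):
    # Called with the OSError raised while listing a directory.
    # Recording it (instead of re-raising) lets the walk continue.
    errors.append(exc.filename)


with tempfile.TemporaryDirectory() as top:
    for name in ("keep", "skip"):
        os.makedirs(os.path.join(top, name, "inner"))

    visited = []
    for dirpath, dirnames, filenames in os.walk(top, onerror=remember):
        visited.append(os.path.relpath(dirpath, top))
        dirnames.sort()                  # impose a deterministic visiting order
        if "skip" in dirnames:
            dirnames.remove("skip")      # in-place change: walk() will not descend

    # Listing a directory that does not exist fails; the error is passed to
    # the callback instead of being silently ignored.
    missing = os.path.join(top, "missing")
    list(os.walk(missing, onerror=remember))
```

Note that the list must be modified in place; rebinding the name, as in ``dirnames = []``, has no effect on the walk.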
+ + This example displays the number of bytes taken by non-directory files in each + directory under the starting directory, except that it doesn't look under any + CVS subdirectory:: + + import os + from os.path import join, getsize + for root, dirs, files in os.walk('python/Lib/email'): + print root, "consumes", + print sum(getsize(join(root, name)) for name in files), + print "bytes in", len(files), "non-directory files" + if 'CVS' in dirs: + dirs.remove('CVS') # don't visit CVS directories + + In the next example, walking the tree bottom up is essential: :func:`rmdir` + doesn't allow deleting a directory before the directory is empty:: + + # Delete everything reachable from the directory named in 'top', + # assuming there are no symbolic links. + # CAUTION: This is dangerous! For example, if top == '/', it + # could delete all your disk files. + import os + for root, dirs, files in os.walk(top, topdown=False): + for name in files: + os.remove(os.path.join(root, name)) + for name in dirs: + os.rmdir(os.path.join(root, name)) + + .. versionadded:: 2.3 + + +.. _os-process: + +Process Management +------------------ + +These functions may be used to create and manage processes. + +The various :func:`exec\*` functions take a list of arguments for the new +program loaded into the process. In each case, the first of these arguments is +passed to the new program as its own name rather than as an argument a user may +have typed on a command line. For the C programmer, this is the ``argv[0]`` +passed to a program's :cfunc:`main`. For example, ``os.execv('/bin/echo', +['foo', 'bar'])`` will only print ``bar`` on standard output; ``foo`` will seem +to be ignored. + + +.. function:: abort() + + Generate a :const:`SIGABRT` signal to the current process. On Unix, the default + behavior is to produce a core dump; on Windows, the process immediately returns + an exit code of ``3``. 
Be aware that programs which use :func:`signal.signal` + to register a handler for :const:`SIGABRT` will behave differently. + Availability: Macintosh, Unix, Windows. + + +.. function:: execl(path, arg0, arg1, ...) + execle(path, arg0, arg1, ..., env) + execlp(file, arg0, arg1, ...) + execlpe(file, arg0, arg1, ..., env) + execv(path, args) + execve(path, args, env) + execvp(file, args) + execvpe(file, args, env) + + These functions all execute a new program, replacing the current process; they + do not return. On Unix, the new executable is loaded into the current process, + and will have the same process ID as the caller. Errors will be reported as + :exc:`OSError` exceptions. + + The ``'l'`` and ``'v'`` variants of the :func:`exec\*` functions differ in how + command-line arguments are passed. The ``'l'`` variants are perhaps the easiest + to work with if the number of parameters is fixed when the code is written; the + individual parameters simply become additional parameters to the :func:`execl\*` + functions. The ``'v'`` variants are good when the number of parameters is + variable, with the arguments being passed in a list or tuple as the *args* + parameter. In either case, the arguments to the child process should start with + the name of the command being run, but this is not enforced. + + The variants which include a ``'p'`` near the end (:func:`execlp`, + :func:`execlpe`, :func:`execvp`, and :func:`execvpe`) will use the + :envvar:`PATH` environment variable to locate the program *file*. When the + environment is being replaced (using one of the :func:`exec\*e` variants, + discussed in the next paragraph), the new environment is used as the source of + the :envvar:`PATH` variable. The other variants, :func:`execl`, :func:`execle`, + :func:`execv`, and :func:`execve`, will not use the :envvar:`PATH` variable to + locate the executable; *path* must contain an appropriate absolute or relative + path. 
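Because the exec functions never return on success, they are usually called in a child created by `fork()`. The following is a hedged sketch for Unix and Python 3 only; running the interpreter itself via `sys.executable` and the exit code `7` are arbitrary choices for the example.

```python
import os
import sys

pid = os.fork()
if pid == 0:
    # Child: replace this process with a fresh Python interpreter.
    # args[0] is the program's own name; the real arguments follow it.
    # The 'l' form would be:
    #   os.execl(sys.executable, sys.executable, '-c', 'raise SystemExit(7)')
    try:
        os.execv(sys.executable, [sys.executable, '-c', 'raise SystemExit(7)'])
    finally:
        # Only reached if execv raised, since a successful exec does not return.
        os._exit(127)
else:
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    print(exit_code)
```

An absolute path is passed here, so the `PATH` lookup of the `'p'` variants is not needed; with a bare command name, `os.execvp` would be the matching choice.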
+ + For :func:`execle`, :func:`execlpe`, :func:`execve`, and :func:`execvpe` (note + that these all end in ``'e'``), the *env* parameter must be a mapping which is + used to define the environment variables for the new process; the :func:`execl`, + :func:`execlp`, :func:`execv`, and :func:`execvp` functions all cause the new process to + inherit the environment of the current process. Availability: Macintosh, Unix, + Windows. + + +.. function:: _exit(n) + + Exit to the system with status *n*, without calling cleanup handlers, flushing + stdio buffers, etc. Availability: Macintosh, Unix, Windows. + + .. note:: + + The standard way to exit is ``sys.exit(n)``. :func:`_exit` should normally only + be used in the child process after a :func:`fork`. + +The following exit codes are defined, and can be used with :func:`_exit`, +although they are not required. These are typically used for system programs +written in Python, such as a mail server's external command delivery program. + +.. note:: + + Some of these may not be available on all Unix platforms, since there is some + variation. Each constant is available only where the underlying platform + defines it. + + +.. data:: EX_OK + + Exit code that means no error occurred. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_USAGE + + Exit code that means the command was used incorrectly, such as when the wrong + number of arguments are given. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_DATAERR + + Exit code that means the input data was incorrect. Availability: Macintosh, + Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_NOINPUT + + Exit code that means an input file did not exist or was not readable. + Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_NOUSER + + Exit code that means a specified user did not exist. Availability: Macintosh, + Unix. + + .. versionadded:: 2.3 + + +..
data:: EX_NOHOST + + Exit code that means a specified host did not exist. Availability: Macintosh, + Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_UNAVAILABLE + + Exit code that means that a required service is unavailable. Availability: + Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_SOFTWARE + + Exit code that means an internal software error was detected. Availability: + Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_OSERR + + Exit code that means an operating system error was detected, such as the + inability to fork or create a pipe. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_OSFILE + + Exit code that means some system file did not exist, could not be opened, or had + some other kind of error. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_CANTCREAT + + Exit code that means a user specified output file could not be created. + Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_IOERR + + Exit code that means that an error occurred while doing I/O on some file. + Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_TEMPFAIL + + Exit code that means a temporary failure occurred. This indicates something + that may not really be an error, such as a network connection that couldn't be + made during a retryable operation. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_PROTOCOL + + Exit code that means that a protocol exchange was illegal, invalid, or not + understood. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_NOPERM + + Exit code that means that there were insufficient permissions to perform the + operation (but not intended for file system problems). Availability: Macintosh, + Unix. + + .. versionadded:: 2.3 + + +.. data:: EX_CONFIG + + Exit code that means that some kind of configuration error occurred. + Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. 
data:: EX_NOTFOUND + + Exit code that means something like "an entry was not found". Availability: + Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. function:: fork() + + Fork a child process. Return ``0`` in the child, the child's process id in the + parent. Availability: Macintosh, Unix. + + +.. function:: forkpty() + + Fork a child process, using a new pseudo-terminal as the child's controlling + terminal. Return a pair of ``(pid, fd)``, where *pid* is ``0`` in the child, the + new child's process id in the parent, and *fd* is the file descriptor of the + master end of the pseudo-terminal. For a more portable approach, use the + :mod:`pty` module. Availability: Macintosh, Some flavors of Unix. + + +.. function:: kill(pid, sig) + + .. index:: + single: process; killing + single: process; signalling + + Send signal *sig* to the process *pid*. Constants for the specific signals + available on the host platform are defined in the :mod:`signal` module. + Availability: Macintosh, Unix. + + +.. function:: killpg(pgid, sig) + + .. index:: + single: process; killing + single: process; signalling + + Send the signal *sig* to the process group *pgid*. Availability: Macintosh, + Unix. + + .. versionadded:: 2.3 + + +.. function:: nice(increment) + + Add *increment* to the process's "niceness". Return the new niceness. + Availability: Macintosh, Unix. + + +.. function:: plock(op) + + Lock program segments into memory. The value of *op* (defined in + ``<sys/lock.h>``) determines which segments are locked. Availability: Macintosh, + Unix. + + +.. function:: popen(...) + :noindex: + + Run child processes, returning opened pipes for communications. These functions + are described in section :ref:`os-newstreams`. + + +.. function:: spawnl(mode, path, ...) + spawnle(mode, path, ..., env) + spawnlp(mode, file, ...) 
+ spawnlpe(mode, file, ..., env) + spawnv(mode, path, args) + spawnve(mode, path, args, env) + spawnvp(mode, file, args) + spawnvpe(mode, file, args, env) + + Execute the program *path* in a new process. + + (Note that the :mod:`subprocess` module provides more powerful facilities for + spawning new processes and retrieving their results; using that module is + preferable to using these functions.) + + If *mode* is :const:`P_NOWAIT`, this function returns the process ID of the new + process; if *mode* is :const:`P_WAIT`, returns the process's exit code if it + exits normally, or ``-signal``, where *signal* is the signal that killed the + process. On Windows, the process ID will actually be the process handle, so can + be used with the :func:`waitpid` function. + + The ``'l'`` and ``'v'`` variants of the :func:`spawn\*` functions differ in how + command-line arguments are passed. The ``'l'`` variants are perhaps the easiest + to work with if the number of parameters is fixed when the code is written; the + individual parameters simply become additional parameters to the + :func:`spawnl\*` functions. The ``'v'`` variants are good when the number of + parameters is variable, with the arguments being passed in a list or tuple as + the *args* parameter. In either case, the arguments to the child process must + start with the name of the command being run. + + The variants which include a second ``'p'`` near the end (:func:`spawnlp`, + :func:`spawnlpe`, :func:`spawnvp`, and :func:`spawnvpe`) will use the + :envvar:`PATH` environment variable to locate the program *file*. When the + environment is being replaced (using one of the :func:`spawn\*e` variants, + discussed in the next paragraph), the new environment is used as the source of + the :envvar:`PATH` variable. 
The other variants, :func:`spawnl`, + :func:`spawnle`, :func:`spawnv`, and :func:`spawnve`, will not use the + :envvar:`PATH` variable to locate the executable; *path* must contain an + appropriate absolute or relative path. + + For :func:`spawnle`, :func:`spawnlpe`, :func:`spawnve`, and :func:`spawnvpe` + (note that these all end in ``'e'``), the *env* parameter must be a mapping + which is used to define the environment variables for the new process; the + :func:`spawnl`, :func:`spawnlp`, :func:`spawnv`, and :func:`spawnvp` all cause + the new process to inherit the environment of the current process. + + As an example, the following calls to :func:`spawnlp` and :func:`spawnvpe` are + equivalent:: + + import os + os.spawnlp(os.P_WAIT, 'cp', 'cp', 'index.html', '/dev/null') + + L = ['cp', 'index.html', '/dev/null'] + os.spawnvpe(os.P_WAIT, 'cp', L, os.environ) + + Availability: Unix, Windows. :func:`spawnlp`, :func:`spawnlpe`, :func:`spawnvp` + and :func:`spawnvpe` are not available on Windows. + + .. versionadded:: 1.6 + + +.. data:: P_NOWAIT + P_NOWAITO + + Possible values for the *mode* parameter to the :func:`spawn\*` family of + functions. If either of these values is given, the :func:`spawn\*` functions + will return as soon as the new process has been created, with the process ID as + the return value. Availability: Macintosh, Unix, Windows. + + .. versionadded:: 1.6 + + +.. data:: P_WAIT + + Possible value for the *mode* parameter to the :func:`spawn\*` family of + functions. If this is given as *mode*, the :func:`spawn\*` functions will not + return until the new process has run to completion and will return the exit code + of the process if the run is successful, or ``-signal`` if a signal kills the + process. Availability: Macintosh, Unix, Windows. + + .. versionadded:: 1.6 + + +.. data:: P_DETACH + P_OVERLAY + + Possible values for the *mode* parameter to the :func:`spawn\*` family of + functions. These are less portable than those listed above.
:const:`P_DETACH` + is similar to :const:`P_NOWAIT`, but the new process is detached from the + console of the calling process. If :const:`P_OVERLAY` is used, the current + process will be replaced; the :func:`spawn\*` function will not return. + Availability: Windows. + + .. versionadded:: 1.6 + + +.. function:: startfile(path[, operation]) + + Start a file with its associated application. + + When *operation* is not specified or ``'open'``, this acts like double-clicking + the file in Windows Explorer, or giving the file name as an argument to the + :program:`start` command from the interactive command shell: the file is opened + with whatever application (if any) its extension is associated with. + + When another *operation* is given, it must be a "command verb" that specifies + what should be done with the file. Common verbs documented by Microsoft are + ``'print'`` and ``'edit'`` (to be used on files) as well as ``'explore'`` and + ``'find'`` (to be used on directories). + + :func:`startfile` returns as soon as the associated application is launched. + There is no option to wait for the application to close, and no way to retrieve + the application's exit status. The *path* parameter is relative to the current + directory. If you want to use an absolute path, make sure the first character + is not a slash (``'/'``); the underlying Win32 :cfunc:`ShellExecute` function + doesn't work if it is. Use the :func:`os.path.normpath` function to ensure that + the path is properly encoded for Win32. Availability: Windows. + + .. versionadded:: 2.0 + + .. versionadded:: 2.5 + The *operation* parameter. + + +.. function:: system(command) + + Execute the command (a string) in a subshell. This is implemented by calling + the Standard C function :cfunc:`system`, and has the same limitations. Changes + to ``posix.environ``, ``sys.stdin``, etc. are not reflected in the environment + of the executed command.
+ + On Unix, the return value is the exit status of the process encoded in the + format specified for :func:`wait`. Note that POSIX does not specify the meaning + of the return value of the C :cfunc:`system` function, so the return value of + the Python function is system-dependent. + + On Windows, the return value is that returned by the system shell after running + *command*, given by the Windows environment variable :envvar:`COMSPEC`: on + :program:`command.com` systems (Windows 95, 98 and ME) this is always ``0``; on + :program:`cmd.exe` systems (Windows NT, 2000 and XP) this is the exit status of + the command run; on systems using a non-native shell, consult your shell + documentation. + + Availability: Macintosh, Unix, Windows. + + The :mod:`subprocess` module provides more powerful facilities for spawning new + processes and retrieving their results; using that module is preferable to using + this function. + + +.. function:: times() + + Return a 5-tuple of floating point numbers indicating accumulated (processor or + other) times, in seconds. The items are: user time, system time, children's + user time, children's system time, and elapsed real time since a fixed point in + the past, in that order. See the Unix manual page :manpage:`times(2)` or the + corresponding Windows Platform API documentation. Availability: Macintosh, Unix, + Windows. + + +.. function:: wait() + + Wait for completion of a child process, and return a tuple containing its pid + and exit status indication: a 16-bit number, whose low byte is the signal number + that killed the process, and whose high byte is the exit status (if the signal + number is zero); the high bit of the low byte is set if a core file was + produced. Availability: Macintosh, Unix. + + +.. function:: waitpid(pid, options) + + The details of this function differ on Unix and Windows. 
+ + On Unix: Wait for completion of a child process given by process id *pid*, and + return a tuple containing its process id and exit status indication (encoded as + for :func:`wait`). The semantics of the call are affected by the value of the + integer *options*, which should be ``0`` for normal operation. + + If *pid* is greater than ``0``, :func:`waitpid` requests status information for + that specific process. If *pid* is ``0``, the request is for the status of any + child in the process group of the current process. If *pid* is ``-1``, the + request pertains to any child of the current process. If *pid* is less than + ``-1``, status is requested for any process in the process group ``-pid`` (the + absolute value of *pid*). + + On Windows: Wait for completion of a process given by process handle *pid*, and + return a tuple containing *pid*, and its exit status shifted left by 8 bits + (shifting makes cross-platform use of the function easier). A *pid* less than or + equal to ``0`` has no special meaning on Windows, and raises an exception. The + value of integer *options* has no effect. *pid* can refer to any process whose + id is known, not necessarily a child process. The :func:`spawn` functions called + with :const:`P_NOWAIT` return suitable process handles. + + +.. function:: wait3([options]) + + Similar to :func:`waitpid`, except no process id argument is given and a + 3-element tuple containing the child's process id, exit status indication, and + resource usage information is returned. Refer to :mod:`resource`.\ + :func:`getrusage` for details on resource usage information. The option + argument is the same as that provided to :func:`waitpid` and :func:`wait4`. + Availability: Unix. + + .. versionadded:: 2.5 + + +.. function:: wait4(pid, options) + + Similar to :func:`waitpid`, except a 3-element tuple, containing the child's + process id, exit status indication, and resource usage information is returned. 
+ Refer to :mod:`resource`.\ :func:`getrusage` for details on resource usage + information. The arguments to :func:`wait4` are the same as those provided to + :func:`waitpid`. Availability: Unix. + + .. versionadded:: 2.5 + + +.. data:: WNOHANG + + The option for :func:`waitpid` to return immediately if no child process status + is available immediately. The function returns ``(0, 0)`` in this case. + Availability: Macintosh, Unix. + + +.. data:: WCONTINUED + + This option causes child processes to be reported if they have been continued + from a job control stop since their status was last reported. Availability: Some + Unix systems. + + .. versionadded:: 2.3 + + +.. data:: WUNTRACED + + This option causes child processes to be reported if they have been stopped but + their current state has not been reported since they were stopped. Availability: + Macintosh, Unix. + + .. versionadded:: 2.3 + +The following functions take a process status code as returned by +:func:`system`, :func:`wait`, or :func:`waitpid` as a parameter. They may be +used to determine the disposition of a process. + + +.. function:: WCOREDUMP(status) + + Returns ``True`` if a core dump was generated for the process, otherwise it + returns ``False``. Availability: Macintosh, Unix. + + .. versionadded:: 2.3 + + +.. function:: WIFCONTINUED(status) + + Returns ``True`` if the process has been continued from a job control stop, + otherwise it returns ``False``. Availability: Unix. + + .. versionadded:: 2.3 + + +.. function:: WIFSTOPPED(status) + + Returns ``True`` if the process has been stopped, otherwise it returns + ``False``. Availability: Unix. + + +.. function:: WIFSIGNALED(status) + + Returns ``True`` if the process exited due to a signal, otherwise it returns + ``False``. Availability: Macintosh, Unix. + + +.. function:: WIFEXITED(status) + + Returns ``True`` if the process exited using the :manpage:`exit(2)` system call, + otherwise it returns ``False``. Availability: Macintosh, Unix. 
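The status-decoding functions in this group are typically combined as in the sketch below, which is Unix-only and written for Python 3. The helper names `describe` and `run_child` are hypothetical and exist only for this illustration; the exit code `3` and the use of `SIGKILL` are arbitrary.

```python
import os
import signal

def describe(status):
    """Turn a wait()-style status into a (kind, value) pair."""
    if os.WIFEXITED(status):
        return ('exited', os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        return ('signaled', os.WTERMSIG(status))
    return ('other', status)

def run_child(action):
    """Fork, run action() in the child, and return the child's raw status."""
    pid = os.fork()
    if pid == 0:
        action()
        os._exit(0)          # fallback if action() returns
    return os.waitpid(pid, 0)[1]

# One child exits normally with code 3, the other kills itself with SIGKILL.
exited = describe(run_child(lambda: os._exit(3)))
killed = describe(run_child(lambda: os.kill(os.getpid(), signal.SIGKILL)))

print(exited)
print(killed)
```

Checking `WIFEXITED` before calling `WEXITSTATUS` matters because the exit-status value is only meaningful for a process that exited normally.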
+ + +.. function:: WEXITSTATUS(status) + + If ``WIFEXITED(status)`` is true, return the integer parameter to the + :manpage:`exit(2)` system call. Otherwise, the return value is meaningless. + Availability: Macintosh, Unix. + + +.. function:: WSTOPSIG(status) + + Return the signal which caused the process to stop. Availability: Macintosh, + Unix. + + +.. function:: WTERMSIG(status) + + Return the signal which caused the process to exit. Availability: Macintosh, + Unix. + + +.. _os-path: + +Miscellaneous System Information +-------------------------------- + + +.. function:: confstr(name) + + Return string-valued system configuration values. *name* specifies the + configuration value to retrieve; it may be a string which is the name of a + defined system value; these names are specified in a number of standards (POSIX, + Unix 95, Unix 98, and others). Some platforms define additional names as well. + The names known to the host operating system are given as the keys of the + ``confstr_names`` dictionary. For configuration variables not included in that + mapping, passing an integer for *name* is also accepted. Availability: + Macintosh, Unix. + + If the configuration value specified by *name* isn't defined, ``None`` is + returned. + + If *name* is a string and is not known, :exc:`ValueError` is raised. If a + specific value for *name* is not supported by the host system, even if it is + included in ``confstr_names``, an :exc:`OSError` is raised with + :const:`errno.EINVAL` for the error number. + + +.. data:: confstr_names + + Dictionary mapping names accepted by :func:`confstr` to the integer values + defined for those names by the host operating system. This can be used to + determine the set of names known to the system. Availability: Macintosh, Unix. + + +.. function:: getloadavg() + + Return the number of processes in the system run queue averaged over the last 1, + 5, and 15 minutes or raises :exc:`OSError` if the load average was + unobtainable. + + .. 
versionadded:: 2.3 + + +.. function:: sysconf(name) + + Return integer-valued system configuration values. If the configuration value + specified by *name* isn't defined, ``-1`` is returned. The comments regarding + the *name* parameter for :func:`confstr` apply here as well; the dictionary that + provides information on the known names is given by ``sysconf_names``. + Availability: Macintosh, Unix. + + +.. data:: sysconf_names + + Dictionary mapping names accepted by :func:`sysconf` to the integer values + defined for those names by the host operating system. This can be used to + determine the set of names known to the system. Availability: Macintosh, Unix. + +The following data values are used to support path manipulation operations. These +are defined for all platforms. + +Higher-level operations on pathnames are defined in the :mod:`os.path` module. + + +.. data:: curdir + + The constant string used by the operating system to refer to the current + directory. For example: ``'.'`` for POSIX or ``':'`` for Mac OS 9. Also + available via :mod:`os.path`. + + +.. data:: pardir + + The constant string used by the operating system to refer to the parent + directory. For example: ``'..'`` for POSIX or ``'::'`` for Mac OS 9. Also + available via :mod:`os.path`. + + +.. data:: sep + + The character used by the operating system to separate pathname components, for + example, ``'/'`` for POSIX or ``':'`` for Mac OS 9. Note that knowing this is + not sufficient to be able to parse or concatenate pathnames --- use + :func:`os.path.split` and :func:`os.path.join` --- but it is occasionally + useful. Also available via :mod:`os.path`. + + +.. data:: altsep + + An alternative character used by the operating system to separate pathname + components, or ``None`` if only one separator character exists. This is set to + ``'/'`` on Windows systems where ``sep`` is a backslash. Also available via + :mod:`os.path`. + + +..
data:: extsep + + The character which separates the base filename from the extension; for example, + the ``'.'`` in :file:`os.py`. Also available via :mod:`os.path`. + + .. versionadded:: 2.2 + + +.. data:: pathsep + + The character conventionally used by the operating system to separate search + path components (as in :envvar:`PATH`), such as ``':'`` for POSIX or ``';'`` for + Windows. Also available via :mod:`os.path`. + + +.. data:: defpath + + The default search path used by :func:`exec\*p\*` and :func:`spawn\*p\*` if the + environment doesn't have a ``'PATH'`` key. Also available via :mod:`os.path`. + + +.. data:: linesep + + The string used to separate (or, rather, terminate) lines on the current + platform. This may be a single character, such as ``'\n'`` for POSIX or + ``'\r'`` for Mac OS, or multiple characters, for example, ``'\r\n'`` for + Windows. Do not use *os.linesep* as a line terminator when writing files opened + in text mode (the default); use a single ``'\n'`` instead, on all platforms. + + +.. data:: devnull + + The file path of the null device. For example: ``'/dev/null'`` for POSIX or + ``'Dev:Nul'`` for Mac OS 9. Also available via :mod:`os.path`. + + .. versionadded:: 2.4 + + +.. _os-miscfunc: + +Miscellaneous Functions +----------------------- + + +.. function:: urandom(n) + + Return a string of *n* random bytes suitable for cryptographic use. + + This function returns random bytes from an OS-specific randomness source. The + returned data should be unpredictable enough for cryptographic applications, + though its exact quality depends on the OS implementation. On a UNIX-like + system this will query /dev/urandom, and on Windows it will use CryptGenRandom. + If a randomness source is not found, :exc:`NotImplementedError` will be raised. + + .. 
versionadded:: 2.4 + diff --git a/Doc/library/ossaudiodev.rst b/Doc/library/ossaudiodev.rst new file mode 100644 index 0000000..066b26b --- /dev/null +++ b/Doc/library/ossaudiodev.rst @@ -0,0 +1,429 @@ + +:mod:`ossaudiodev` --- Access to OSS-compatible audio devices +============================================================= + +.. module:: ossaudiodev + :platform: Linux, FreeBSD + :synopsis: Access to OSS-compatible audio devices. + + +.. versionadded:: 2.3 + +This module allows you to access the OSS (Open Sound System) audio interface. +OSS is available for a wide range of open-source and commercial Unices, and is +the standard audio interface for Linux and recent versions of FreeBSD. + +.. % Things will get more complicated for future Linux versions, since +.. % ALSA is in the standard kernel as of 2.5.x. Presumably if you +.. % use ALSA, you'll have to make sure its OSS compatibility layer +.. % is active to use ossaudiodev, but you're gonna need it for the vast +.. % majority of Linux audio apps anyways. +.. % +.. % Sounds like things are also complicated for other BSDs. In response +.. % to my python-dev query, Thomas Wouters said: +.. % +.. % > Likewise, googling shows OpenBSD also uses OSS/Free -- the commercial +.. % > OSS installation manual tells you to remove references to OSS/Free from the +.. % > kernel :) +.. % +.. % but Aleksander Piotrowsk actually has an OpenBSD box, and he quotes +.. % from its <soundcard.h>: +.. % > * WARNING! WARNING! +.. % > * This is an OSS (Linux) audio emulator. +.. % > * Use the Native NetBSD API for developing new code, and this +.. % > * only for compiling Linux programs. +.. % +.. % There's also an ossaudio manpage on OpenBSD that explains things +.. % further. Presumably NetBSD and OpenBSD have a different standard +.. % audio interface. That's the great thing about standards, there are so +.. % many to choose from ... ;-) +.. % +.. % This probably all warrants a footnote or two, but I don't understand +.. 
% things well enough right now to write it! --GPW + + +.. seealso:: + + `Open Sound System Programmer's Guide <http://www.opensound.com/pguide/oss.pdf>`_ + the official documentation for the OSS C API + + The module defines a large number of constants supplied by the OSS device + driver; see ``<sys/soundcard.h>`` on either Linux or FreeBSD for a listing. + +:mod:`ossaudiodev` defines the following variables and functions: + + +.. exception:: OSSAudioError + + This exception is raised on certain errors. The argument is a string describing + what went wrong. + + (If :mod:`ossaudiodev` receives an error from a system call such as + :cfunc:`open`, :cfunc:`write`, or :cfunc:`ioctl`, it raises :exc:`IOError`. + Errors detected directly by :mod:`ossaudiodev` result in :exc:`OSSAudioError`.) + + (For backwards compatibility, the exception class is also available as + ``ossaudiodev.error``.) + + +.. function:: open([device, ]mode) + + Open an audio device and return an OSS audio device object. This object + supports many file-like methods, such as :meth:`read`, :meth:`write`, and + :meth:`fileno` (although there are subtle differences between conventional Unix + read/write semantics and those of OSS audio devices). It also supports a number + of audio-specific methods; see below for the complete list of methods. + + *device* is the audio device filename to use. If it is not specified, this + module first looks in the environment variable :envvar:`AUDIODEV` for a device + to use. If not found, it falls back to :file:`/dev/dsp`. + + *mode* is one of ``'r'`` for read-only (record) access, ``'w'`` for + write-only (playback) access and ``'rw'`` for both. Since many sound cards + only allow one process to have the recorder or player open at a time, it is a + good idea to open the device only for the activity needed. Further, some + sound cards are half-duplex: they can be opened for reading or writing, but + not both at once.
+ + Note the unusual calling syntax: the *first* argument is optional, and the + second is required. This is a historical artifact for compatibility with the + older :mod:`linuxaudiodev` module which :mod:`ossaudiodev` supersedes. + + .. % XXX it might also be motivated + .. % by my unfounded-but-still-possibly-true belief that the default + .. % audio device varies unpredictably across operating systems. -GW + + +.. function:: openmixer([device]) + + Open a mixer device and return an OSS mixer device object. *device* is the + mixer device filename to use. If it is not specified, this module first looks + in the environment variable :envvar:`MIXERDEV` for a device to use. If not + found, it falls back to :file:`/dev/mixer`. + + +.. _ossaudio-device-objects: + +Audio Device Objects +-------------------- + +Before you can write to or read from an audio device, you must call three +methods in the correct order: + +#. :meth:`setfmt` to set the output format + +#. :meth:`channels` to set the number of channels + +#. :meth:`speed` to set the sample rate + +Alternately, you can use the :meth:`setparameters` method to set all three audio +parameters at once. This is more convenient, but may not be as flexible in all +cases. + +The audio device objects returned by :func:`open` define the following methods +and (read-only) attributes: + + +.. method:: oss_audio_device.close() + + Explicitly close the audio device. When you are done writing to or reading from + an audio device, you should explicitly close it. A closed device cannot be used + again. + + +.. method:: oss_audio_device.fileno() + + Return the file descriptor associated with the device. + + +.. method:: oss_audio_device.read(size) + + Read *size* bytes from the audio input and return them as a Python string. + Unlike most Unix device drivers, OSS audio devices in blocking mode (the + default) will block :func:`read` until the entire requested amount of data is + available. + + +.. 
method:: oss_audio_device.write(data) + + Write the Python string *data* to the audio device and return the number of + bytes written. If the audio device is in blocking mode (the default), the + entire string is always written (again, this is different from usual Unix device + semantics). If the device is in non-blocking mode, some data may not be written + ---see :meth:`writeall`. + + +.. method:: oss_audio_device.writeall(data) + + Write the entire Python string *data* to the audio device: waits until the audio + device is able to accept data, writes as much data as it will accept, and + repeats until *data* has been completely written. If the device is in blocking + mode (the default), this has the same effect as :meth:`write`; :meth:`writeall` + is only useful in non-blocking mode. Has no return value, since the amount of + data written is always equal to the amount of data supplied. + +The following methods each map to exactly one :func:`ioctl` system call. The +correspondence is obvious: for example, :meth:`setfmt` corresponds to the +``SNDCTL_DSP_SETFMT`` ioctl, and :meth:`sync` to ``SNDCTL_DSP_SYNC`` (this can +be useful when consulting the OSS documentation). If the underlying +:func:`ioctl` fails, they all raise :exc:`IOError`. + + +.. method:: oss_audio_device.nonblock() + + Put the device into non-blocking mode. Once in non-blocking mode, there is no + way to return it to blocking mode. + + +.. method:: oss_audio_device.getfmts() + + Return a bitmask of the audio output formats supported by the soundcard. 
Some + of the formats supported by OSS are: + + +-------------------------+---------------------------------------------+ + | Format | Description | + +=========================+=============================================+ + | :const:`AFMT_MU_LAW` | a logarithmic encoding (used by Sun ``.au`` | + | | files and :file:`/dev/audio`) | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_A_LAW` | a logarithmic encoding | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_IMA_ADPCM` | a 4:1 compressed format defined by the | + | | Interactive Multimedia Association | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_U8` | Unsigned, 8-bit audio | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_S16_LE` | Signed, 16-bit audio, little-endian byte | + | | order (as used by Intel processors) | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_S16_BE` | Signed, 16-bit audio, big-endian byte order | + | | (as used by 68k, PowerPC, Sparc) | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_S8` | Signed, 8 bit audio | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_U16_LE` | Unsigned, 16-bit little-endian audio | + +-------------------------+---------------------------------------------+ + | :const:`AFMT_U16_BE` | Unsigned, 16-bit big-endian audio | + +-------------------------+---------------------------------------------+ + + Consult the OSS documentation for a full list of audio formats, and note that + most devices support only a subset of these formats. Some older devices only + support :const:`AFMT_U8`; the most common format used today is + :const:`AFMT_S16_LE`. + + +.. 
method:: oss_audio_device.setfmt(format) + + Try to set the current audio format to *format*---see :meth:`getfmts` for a + list. Returns the audio format that the device was set to, which may not be the + requested format. May also be used to return the current audio format---do this + by passing an "audio format" of :const:`AFMT_QUERY`. + + +.. method:: oss_audio_device.channels(nchannels) + + Set the number of output channels to *nchannels*. A value of 1 indicates + monophonic sound, 2 stereophonic. Some devices may have more than 2 channels, + and some high-end devices may not support mono. Returns the number of channels + the device was set to. + + +.. method:: oss_audio_device.speed(samplerate) + + Try to set the audio sampling rate to *samplerate* samples per second. Returns + the rate actually set. Most sound devices don't support arbitrary sampling + rates. Common rates are: + + +-------+-------------------------------------------+ + | Rate | Description | + +=======+===========================================+ + | 8000 | default rate for :file:`/dev/audio` | + +-------+-------------------------------------------+ + | 11025 | speech recording | + +-------+-------------------------------------------+ + | 22050 | | + +-------+-------------------------------------------+ + | 44100 | CD quality audio (at 16 bits/sample and 2 | + | | channels) | + +-------+-------------------------------------------+ + | 96000 | DVD quality audio (at 24 bits/sample) | + +-------+-------------------------------------------+ + + +.. method:: oss_audio_device.sync() + + Wait until the sound device has played every byte in its buffer. (This happens + implicitly when the device is closed.) The OSS documentation recommends closing + and re-opening the device rather than using :meth:`sync`. + + +.. method:: oss_audio_device.reset() + + Immediately stop playing or recording and return the device to a state where it + can accept commands. 
The OSS documentation recommends closing and re-opening + the device after calling :meth:`reset`. + + +.. method:: oss_audio_device.post() + + Tell the driver that there is likely to be a pause in the output, making it + possible for the device to handle the pause more intelligently. You might use + this after playing a spot sound effect, before waiting for user input, or before + doing disk I/O. + +The following convenience methods combine several ioctls, or one ioctl and some +simple calculations. + + +.. method:: oss_audio_device.setparameters(format, nchannels, samplerate [, strict=False]) + + Set the key audio sampling parameters---sample format, number of channels, and + sampling rate---in one method call. *format*, *nchannels*, and *samplerate* + should be as specified in the :meth:`setfmt`, :meth:`channels`, and + :meth:`speed` methods. If *strict* is true, :meth:`setparameters` checks to + see if each parameter was actually set to the requested value, and raises + :exc:`OSSAudioError` if not. Returns a tuple (*format*, *nchannels*, + *samplerate*) indicating the parameter values that were actually set by the + device driver (i.e., the same as the return values of :meth:`setfmt`, + :meth:`channels`, and :meth:`speed`). + + For example, :: + + (fmt, channels, rate) = dsp.setparameters(fmt, channels, rate) + + is equivalent to :: + + fmt = dsp.setfmt(fmt) + channels = dsp.channels(channels) + rate = dsp.speed(rate) + + +.. method:: oss_audio_device.bufsize() + + Returns the size of the hardware buffer, in samples. + + +.. method:: oss_audio_device.obufcount() + + Returns the number of samples that are in the hardware buffer yet to be played. + + +.. method:: oss_audio_device.obuffree() + + Returns the number of samples that could be queued into the hardware buffer to + be played without blocking. + +Audio device objects also support several read-only attributes: + + +..
attribute:: oss_audio_device.closed + + Boolean indicating whether the device has been closed. + + +.. attribute:: oss_audio_device.name + + String containing the name of the device file. + + +.. attribute:: oss_audio_device.mode + + The I/O mode for the file, either ``"r"``, ``"rw"``, or ``"w"``. + + +.. _mixer-device-objects: + +Mixer Device Objects +-------------------- + +The mixer object provides two file-like methods: + + +.. method:: oss_mixer_device.close() + + This method closes the open mixer device file. Any further attempts to use the + mixer after this file is closed will raise an :exc:`IOError`. + + +.. method:: oss_mixer_device.fileno() + + Returns the file handle number of the open mixer device file. + +The remaining methods are specific to audio mixing: + + +.. method:: oss_mixer_device.controls() + + This method returns a bitmask specifying the available mixer controls ("Control" + being a specific mixable "channel", such as :const:`SOUND_MIXER_PCM` or + :const:`SOUND_MIXER_SYNTH`). This bitmask indicates a subset of all available + mixer controls---the :const:`SOUND_MIXER_\*` constants defined at module level. + To determine if, for example, the current mixer object supports a PCM mixer, use + the following Python code:: + + mixer=ossaudiodev.openmixer() + if mixer.controls() & (1 << ossaudiodev.SOUND_MIXER_PCM): + # PCM is supported + ... code ... + + For most purposes, the :const:`SOUND_MIXER_VOLUME` (master volume) and + :const:`SOUND_MIXER_PCM` controls should suffice---but code that uses the mixer + should be flexible when it comes to choosing mixer controls. On the Gravis + Ultrasound, for example, :const:`SOUND_MIXER_VOLUME` does not exist. + + +.. method:: oss_mixer_device.stereocontrols() + + Returns a bitmask indicating stereo mixer controls. 
If a bit is set, the + corresponding control is stereo; if it is unset, the control is either + monophonic or not supported by the mixer (use in combination with + :meth:`controls` to determine which). + + See the code example for the :meth:`controls` method for an example of getting + data from a bitmask. + + +.. method:: oss_mixer_device.reccontrols() + + Returns a bitmask specifying the mixer controls that may be used to record. See + the code example for :meth:`controls` for an example of reading from a bitmask. + + +.. method:: oss_mixer_device.get(control) + + Returns the volume of a given mixer control. The returned volume is a 2-tuple + ``(left_volume,right_volume)``. Volumes are specified as numbers from 0 + (silent) to 100 (full volume). If the control is monophonic, a 2-tuple is still + returned, but both volumes are the same. + + Raises :exc:`OSSAudioError` if an invalid control is specified, or + :exc:`IOError` if an unsupported control is specified. + + +.. method:: oss_mixer_device.set(control, (left, right)) + + Sets the volume for a given mixer control to ``(left,right)``. ``left`` and + ``right`` must be ints between 0 (silent) and 100 (full volume). On + success, the new volume is returned as a 2-tuple. Note that this may not be + exactly the same as the volume specified, because of the limited resolution of + some sound cards' mixers. + + Raises :exc:`OSSAudioError` if an invalid mixer control was specified, or if the + specified volumes were out of range. + + +.. method:: oss_mixer_device.get_recsrc() + + This method returns a bitmask indicating which control(s) are currently being + used as a recording source. + + +.. method:: oss_mixer_device.set_recsrc(bitmask) + + Call this method to specify a recording source. Returns a bitmask indicating + the new recording source (or sources) if successful; raises :exc:`IOError` if an + invalid source was specified.
To set the current recording source to the + microphone input:: + + mixer.set_recsrc(1 << ossaudiodev.SOUND_MIXER_MIC) + diff --git a/Doc/library/othergui.rst b/Doc/library/othergui.rst new file mode 100644 index 0000000..aadb74d --- /dev/null +++ b/Doc/library/othergui.rst @@ -0,0 +1,84 @@ +.. _other-gui-packages: + +Other Graphical User Interface Packages +======================================= + +There are a number of extension widget sets to :mod:`Tkinter`. + + +.. seealso:: + + `Python megawidgets <http://pmw.sourceforge.net/>`_ + is a toolkit for building high-level compound widgets in Python using the + :mod:`Tkinter` module. It consists of a set of base classes and a library of + flexible and extensible megawidgets built on this foundation. These megawidgets + include notebooks, comboboxes, selection widgets, paned widgets, scrolled + widgets, dialog windows, etc. Also, with the Pmw.Blt interface to BLT, the + busy, graph, stripchart, tabset and vector commands are available. + + The initial ideas for Pmw were taken from the Tk ``itcl`` extensions ``[incr + Tk]`` by Michael McLennan and ``[incr Widgets]`` by Mark Ulferts. Several of the + megawidgets are direct translations from itcl to Python. It offers most of + the range of widgets that ``[incr Widgets]`` does, and is almost as complete as + Tix, lacking however Tix's fast :class:`HList` widget for drawing trees. + + `Tkinter3000 Widget Construction Kit (WCK) <http://tkinter.effbot.org/>`_ + is a library that allows you to write new Tkinter widgets in pure Python. The + WCK framework gives you full control over widget creation, configuration, screen + appearance, and event handling. WCK widgets can be very fast and light-weight, + since they can operate directly on Python data structures, without having to + transfer data through the Tk/Tcl layer. + + .. % + +The following major cross-platform (Windows, Mac OS X, Unix-like) GUI toolkits are +also available for Python: + + +..
seealso:: + + `PyGTK <http://www.pygtk.org/>`_ + is a set of bindings for the `GTK <http://www.gtk.org/>`_ widget set. It + provides an object-oriented interface that is slightly higher level than the C + one. It comes with many more widgets than Tkinter provides, and + has good Python-specific reference documentation. There are also `bindings + <http://www.daa.com.au/~james/gnome/>`_ to `GNOME <http://www.gnome.org>`_. + One well-known PyGTK application is + `PythonCAD <http://www.pythoncad.org/>`_. An + online `tutorial <http://www.pygtk.org/pygtk2tutorial/index.html>`_ is + available. + + `PyQt <http://www.riverbankcomputing.co.uk/pyqt/index.php>`_ + PyQt is a :program:`sip`\ -wrapped binding to the Qt toolkit. Qt is an + extensive C++ GUI application development framework that is + available for Unix, Windows and Mac OS X. :program:`sip` is a tool + for generating bindings for C++ libraries as Python classes, and + is specifically designed for Python. The *PyQt3* bindings have a + book, `GUI Programming with Python: QT Edition + <http://www.commandprompt.com/community/pyqt/>`_ by Boudewijn + Rempt. The *PyQt4* bindings also have a book, `Rapid GUI Programming + with Python and Qt <http://www.qtrac.eu/pyqtbook.html>`_, by Mark + Summerfield. + + `wxPython <http://www.wxpython.org>`_ + wxPython is a cross-platform GUI toolkit for Python that is built around + the popular `wxWidgets <http://www.wxwidgets.org/>`_ (formerly wxWindows) + C++ toolkit. It provides a native look and feel for applications on + Windows, Mac OS X, and Unix systems by using each platform's native + widgets wherever possible (GTK+ on Unix-like systems). In addition to + an extensive set of widgets, wxPython provides classes for online + documentation and context-sensitive help, printing, HTML viewing, + low-level device context drawing, drag and drop, system clipboard access, + an XML-based resource format and more, including an ever-growing library + of user-contributed modules.
wxPython has a book, `wxPython in Action + <http://www.amazon.com/exec/obidos/ASIN/1932394621>`_, by Noel Rappin and + Robin Dunn. + +PyGTK, PyQt, and wxPython all have a modern look and feel, far more +widgets, and better documentation than Tkinter. In addition, +there are many other GUI toolkits for Python, both cross-platform and +platform-specific. See the `GUI Programming +<http://wiki.python.org/moin/GuiProgramming>`_ page in the Python Wiki for a +much more complete list, and also for links to documents where the +different GUI toolkits are compared. + diff --git a/Doc/library/parser.rst b/Doc/library/parser.rst new file mode 100644 index 0000000..b767561 --- /dev/null +++ b/Doc/library/parser.rst @@ -0,0 +1,683 @@ + +:mod:`parser` --- Access Python parse trees +=========================================== + +.. module:: parser + :synopsis: Access parse trees for Python source code. +.. moduleauthor:: Fred L. Drake, Jr. <fdrake@acm.org> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. % Copyright 1995 Virginia Polytechnic Institute and State University +.. % and Fred L. Drake, Jr. This copyright notice must be distributed on +.. % all copies, but this document otherwise may be distributed as part +.. % of the Python distribution. No fee may be charged for this document +.. % in any representation, either on paper or electronically. This +.. % restriction does not affect other elements in a distributed package +.. % in any way. + +.. index:: single: parsing; Python source code + +The :mod:`parser` module provides an interface to Python's internal parser and +byte-code compiler. The primary purpose for this interface is to allow Python +code to edit the parse tree of a Python expression and create executable code +from this. This is better than trying to parse and modify an arbitrary Python +code fragment as a string because parsing is performed in a manner identical to +the code forming the application. It is also faster.
+ +There are a few things to note about this module which are important to making +use of the data structures created. This is not a tutorial on editing the parse +trees for Python code, but some examples of using the :mod:`parser` module are +presented. + +Most importantly, a good understanding of the Python grammar processed by the +internal parser is required. For full information on the language syntax, refer +to :ref:`reference-index`. The parser +itself is created from a grammar specification defined in the file +:file:`Grammar/Grammar` in the standard Python distribution. The parse trees +stored in the AST objects created by this module are the actual output from the +internal parser when created by the :func:`expr` or :func:`suite` functions, +described below. The AST objects created by :func:`sequence2ast` faithfully +simulate those structures. Be aware that the values of the sequences which are +considered "correct" will vary from one version of Python to another as the +formal grammar for the language is revised. However, transporting code from one +Python version to another as source text will always allow correct parse trees +to be created in the target version, with the only restriction being that +migrating to an older version of the interpreter will not support more recent +language constructs. The parse trees are not typically compatible from one +version to another, whereas source code has always been forward-compatible. + +Each element of the sequences returned by :func:`ast2list` or :func:`ast2tuple` +has a simple form. Sequences representing non-terminal elements in the grammar +always have a length greater than one. The first element is an integer which +identifies a production in the grammar. These integers are given symbolic names +in the C header file :file:`Include/graminit.h` and the Python module +:mod:`symbol`. 
Each additional element of the sequence represents a component +of the production as recognized in the input string: these are always sequences +which have the same form as the parent. An important aspect of this structure +which should be noted is that keywords used to identify the parent node type, +such as the keyword :keyword:`if` in an :const:`if_stmt`, are included in the +node tree without any special treatment. For example, the :keyword:`if` keyword +is represented by the tuple ``(1, 'if')``, where ``1`` is the numeric value +associated with all :const:`NAME` tokens, including variable and function names +defined by the user. In an alternate form returned when line number information +is requested, the same token might be represented as ``(1, 'if', 12)``, where +the ``12`` represents the line number at which the terminal symbol was found. + +Terminal elements are represented in much the same way, but without any child +elements and the addition of the source text which was identified. The example +of the :keyword:`if` keyword above is representative. The various types of +terminal symbols are defined in the C header file :file:`Include/token.h` and +the Python module :mod:`token`. + +The AST objects are not required to support the functionality of this module, +but are provided for three purposes: to allow an application to amortize the +cost of processing complex parse trees, to provide a parse tree representation +which conserves memory space when compared to the Python list or tuple +representation, and to ease the creation of additional modules in C which +manipulate parse trees. A simple "wrapper" class may be created in Python to +hide the use of AST objects. + +The :mod:`parser` module defines functions for a few distinct purposes. 
The +most important purposes are to create AST objects and to convert AST objects to +other representations such as parse trees and compiled code objects, but there +are also functions which serve to query the type of parse tree represented by an +AST object. + + +.. seealso:: + + Module :mod:`symbol` + Useful constants representing internal nodes of the parse tree. + + Module :mod:`token` + Useful constants representing leaf nodes of the parse tree and functions for + testing node values. + + +.. _creating-asts: + +Creating AST Objects +-------------------- + +AST objects may be created from source code or from a parse tree. When creating +an AST object from source, different functions are used to create the ``'eval'`` +and ``'exec'`` forms. + + +.. function:: expr(source) + + The :func:`expr` function parses the parameter *source* as if it were an input + to ``compile(source, 'file.py', 'eval')``. If the parse succeeds, an AST object + is created to hold the internal parse tree representation, otherwise an + appropriate exception is raised. + + +.. function:: suite(source) + + The :func:`suite` function parses the parameter *source* as if it were an input + to ``compile(source, 'file.py', 'exec')``. If the parse succeeds, an AST object + is created to hold the internal parse tree representation, otherwise an + appropriate exception is raised. + + +.. function:: sequence2ast(sequence) + + This function accepts a parse tree represented as a sequence and builds an + internal representation if possible. If it can validate that the tree conforms + to the Python grammar and all nodes are valid node types in the host version of + Python, an AST object is created from the internal representation and returned + to the caller. If there is a problem creating the internal representation, or + if the tree cannot be validated, a :exc:`ParserError` exception is raised.
An + AST object created this way should not be assumed to compile correctly; normal + exceptions thrown by compilation may still be initiated when the AST object is + passed to :func:`compileast`. This may indicate problems not related to syntax + (such as a :exc:`MemoryError` exception), but may also be due to constructs such + as the result of parsing ``del f(0)``, which escapes the Python parser but is + checked by the bytecode compiler. + + Sequences representing terminal tokens may be represented as either two-element + lists of the form ``(1, 'name')`` or as three-element lists of the form ``(1, + 'name', 56)``. If the third element is present, it is assumed to be a valid + line number. The line number may be specified for any subset of the terminal + symbols in the input tree. + + +.. function:: tuple2ast(sequence) + + This is the same function as :func:`sequence2ast`. This entry point is + maintained for backward compatibility. + + +.. _converting-asts: + +Converting AST Objects +---------------------- + +AST objects, regardless of the input used to create them, may be converted to +parse trees represented as list- or tuple- trees, or may be compiled into +executable code objects. Parse trees may be extracted with or without line +numbering information. + + +.. function:: ast2list(ast[, line_info]) + + This function accepts an AST object from the caller in *ast* and returns a + Python list representing the equivalent parse tree. The resulting list + representation can be used for inspection or the creation of a new parse tree in + list form. This function does not fail so long as memory is available to build + the list representation. If the parse tree will only be used for inspection, + :func:`ast2tuple` should be used instead to reduce memory consumption and + fragmentation. When the list representation is required, this function is + significantly faster than retrieving a tuple representation and converting that + to nested lists. 
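For comparison, the manual tuple-to-list conversion that the paragraph above advises against can be sketched in plain Python. This is an illustrative sketch only: ``tuple_tree_to_lists`` is a hypothetical helper, not part of :mod:`parser`, and the node numbers in ``sample`` are invented for the example rather than taken from a real grammar.

```python
def tuple_tree_to_lists(tree):
    """Recursively copy a nested-tuple parse tree into nested lists."""
    if isinstance(tree, tuple):
        return [tuple_tree_to_lists(child) for child in tree]
    # Integers (node types, line numbers) and strings (token text) are leaves.
    return tree


# Made-up tree in the documented shape: (node_type, child, ...) for
# non-terminals and (token_type, 'text') for terminals.
sample = (258, (300, (1, 'a')), (4, ''), (0, ''))
as_lists = tuple_tree_to_lists(sample)
```

The helper walks every node in Python, which is the extra work avoided by requesting the list form directly.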
+ + If *line_info* is true, line number information will be included for all + terminal tokens as a third element of the list representing the token. Note + that the line number provided specifies the line on which the token *ends*. + This information is omitted if the flag is false or omitted. + + +.. function:: ast2tuple(ast[, line_info]) + + This function accepts an AST object from the caller in *ast* and returns a + Python tuple representing the equivalent parse tree. Other than returning a + tuple instead of a list, this function is identical to :func:`ast2list`. + + If *line_info* is true, line number information will be included for all + terminal tokens as a third element of the list representing the token. This + information is omitted if the flag is false or omitted. + + +.. function:: compileast(ast[, filename='<ast>']) + + .. index:: + builtin: exec + builtin: eval + + The Python byte compiler can be invoked on an AST object to produce code objects + which can be used as part of a call to the built-in :func:`exec` or :func:`eval` + functions. This function provides the interface to the compiler, passing the + internal parse tree from *ast* to the parser, using the source file name + specified by the *filename* parameter. The default value supplied for *filename* + indicates that the source was an AST object. + + Compiling an AST object may result in exceptions related to compilation; an + example would be a :exc:`SyntaxError` caused by the parse tree for ``del f(0)``: + this statement is considered legal within the formal grammar for Python but is + not a legal language construct. The :exc:`SyntaxError` raised for this + condition is actually generated by the Python byte-compiler normally, which is + why it can be raised at this point by the :mod:`parser` module. Most causes of + compilation failure can be diagnosed programmatically by inspection of the parse + tree. + + +.. 
_querying-asts: + +Queries on AST Objects +---------------------- + +Two functions are provided which allow an application to determine if an AST was +created as an expression or a suite. Neither of these functions can be used to +determine if an AST was created from source code via :func:`expr` or +:func:`suite` or from a parse tree via :func:`sequence2ast`. + + +.. function:: isexpr(ast) + + .. index:: builtin: compile + + When *ast* represents an ``'eval'`` form, this function returns true, otherwise + it returns false. This is useful, since code objects normally cannot be queried + for this information using existing built-in functions. Note that the code + objects created by :func:`compileast` cannot be queried like this either, and + are identical to those created by the built-in :func:`compile` function. + + +.. function:: issuite(ast) + + This function mirrors :func:`isexpr` in that it reports whether an AST object + represents an ``'exec'`` form, commonly known as a "suite." It is not safe to + assume that this function is equivalent to ``not isexpr(ast)``, as additional + syntactic fragments may be supported in the future. + + +.. _ast-errors: + +Exceptions and Error Handling +----------------------------- + +The parser module defines a single exception, but may also pass other built-in +exceptions from other portions of the Python runtime environment. See each +function for information about the exceptions it can raise. + + +.. exception:: ParserError + + Exception raised when a failure occurs within the parser module. This is + generally produced for validation failures rather than the built in + :exc:`SyntaxError` thrown during normal parsing. The exception argument is + either a string describing the reason of the failure or a tuple containing a + sequence causing the failure from a parse tree passed to :func:`sequence2ast` + and an explanatory string. 
Calls to :func:`sequence2ast` need to be able to + handle either type of exception argument, while calls to other functions in the module + will only need to be aware of the simple string values. + +Note that the functions :func:`compileast`, :func:`expr`, and :func:`suite` may +raise exceptions which are normally raised by the parsing and compilation +process. These include the built-in exceptions :exc:`MemoryError`, +:exc:`OverflowError`, :exc:`SyntaxError`, and :exc:`SystemError`. In these +cases, these exceptions carry all the meaning normally associated with them. +Refer to the descriptions of each function for detailed information. + + +.. _ast-objects: + +AST Objects +----------- + +Ordered and equality comparisons are supported between AST objects. Pickling of +AST objects (using the :mod:`pickle` module) is also supported. + + +.. data:: ASTType + + The type of the objects returned by :func:`expr`, :func:`suite` and + :func:`sequence2ast`. + +AST objects have the following methods: + + +.. method:: AST.compile([filename]) + + Same as ``compileast(ast, filename)``. + + +.. method:: AST.isexpr() + + Same as ``isexpr(ast)``. + + +.. method:: AST.issuite() + + Same as ``issuite(ast)``. + + +.. method:: AST.tolist([line_info]) + + Same as ``ast2list(ast, line_info)``. + + +.. method:: AST.totuple([line_info]) + + Same as ``ast2tuple(ast, line_info)``. + + +.. _ast-examples: + +Examples +-------- + +.. index:: builtin: compile + +The :mod:`parser` module allows operations to be performed on the parse tree of Python +source code before the bytecode is generated, and provides for inspection of the +parse tree for information-gathering purposes. Two examples are presented. The +simple example demonstrates emulation of the :func:`compile` built-in function +and the complex example shows the use of a parse tree for information discovery.
+ + +Emulation of :func:`compile` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +While many useful operations may take place between parsing and bytecode +generation, the simplest operation is to do nothing. For this purpose, using +the :mod:`parser` module to produce an intermediate data structure is equivalent +to the code :: + + >>> code = compile('a + 5', 'file.py', 'eval') + >>> a = 5 + >>> eval(code) + 10 + +The equivalent operation using the :mod:`parser` module is somewhat longer, and +allows the intermediate internal parse tree to be retained as an AST object:: + + >>> import parser + >>> ast = parser.expr('a + 5') + >>> code = ast.compile('file.py') + >>> a = 5 + >>> eval(code) + 10 + +An application which needs both AST and code objects can package this code into +readily available functions:: + + import parser + + def load_suite(source_string): + ast = parser.suite(source_string) + return ast, ast.compile() + + def load_expression(source_string): + ast = parser.expr(source_string) + return ast, ast.compile() + + +Information Discovery +^^^^^^^^^^^^^^^^^^^^^ + +.. index:: + single: string; documentation + single: docstrings + +Some applications benefit from direct access to the parse tree. The remainder +of this section demonstrates how the parse tree provides access to module +documentation defined in docstrings without requiring that the code being +examined be loaded into a running interpreter via :keyword:`import`. This can +be very useful for performing analyses of untrusted code. + +Generally, the example will demonstrate how the parse tree may be traversed to +distill interesting information. Two functions and a set of classes are +developed which provide programmatic access to high level function and class +definitions provided by a module. 
The classes extract information from the +parse tree and provide access to the information at a useful semantic level, one +function provides a simple low-level pattern matching capability, and the other +function defines a high-level interface to the classes by handling file +operations on behalf of the caller. All source files mentioned here which are +not part of the Python installation are located in the :file:`Demo/parser/` +directory of the distribution. + +The dynamic nature of Python allows the programmer a great deal of flexibility, +but most modules need only a limited measure of this when defining classes, +functions, and methods. In this example, the only definitions that will be +considered are those which are defined in the top level of their context, e.g., +a function defined by a :keyword:`def` statement at column zero of a module, but +not a function defined within a branch of an :keyword:`if` ... :keyword:`else` +construct, though there are some good reasons for doing so in some situations. +Nesting of definitions will be handled by the code developed in the example. + +To construct the upper-level extraction methods, we need to know what the parse +tree structure looks like and how much of it we actually need to be concerned +about. Python uses a moderately deep parse tree so there are a large number of +intermediate nodes. It is important to read and understand the formal grammar +used by Python. This is specified in the file :file:`Grammar/Grammar` in the +distribution. Consider the simplest case of interest when searching for +docstrings: a module consisting of a docstring and nothing else. (See file +:file:`docstring.py`.) :: + + """Some documentation. + """ + +Using the interpreter to take a look at the parse tree, we find a bewildering +mass of numbers and parentheses, with the documentation buried deep in nested +tuples. 
:: + + >>> import parser + >>> import pprint + >>> ast = parser.suite(open('docstring.py').read()) + >>> tup = ast.totuple() + >>> pprint.pprint(tup) + (257, + (264, + (265, + (266, + (267, + (307, + (287, + (288, + (289, + (290, + (292, + (293, + (294, + (295, + (296, + (297, + (298, + (299, + (300, (3, '"""Some documentation.\n"""'))))))))))))))))), + (4, ''))), + (4, ''), + (0, '')) + +The numbers at the first element of each node in the tree are the node types; +they map directly to terminal and non-terminal symbols in the grammar. +Unfortunately, they are represented as integers in the internal representation, +and the Python structures generated do not change that. However, the +:mod:`symbol` and :mod:`token` modules provide symbolic names for the node types +and dictionaries which map from the integers to the symbolic names for the node +types. + +In the output presented above, the outermost tuple contains four elements: the +integer ``257`` and three additional tuples. Node type ``257`` has the symbolic +name :const:`file_input`. Each of these inner tuples contains an integer as the +first element; these integers, ``264``, ``4``, and ``0``, represent the node +types :const:`stmt`, :const:`NEWLINE`, and :const:`ENDMARKER`, respectively. +Note that these values may change depending on the version of Python you are +using; consult :file:`symbol.py` and :file:`token.py` for details of the +mapping. It should be fairly clear that the outermost node is related primarily +to the input source rather than the contents of the file, and may be disregarded +for the moment. The :const:`stmt` node is much more interesting. In +particular, all docstrings are found in subtrees which are formed exactly as +this node is formed, with the only difference being the string itself. 
The +association between the docstring in a similar tree and the defined entity +(class, function, or module) which it describes is given by the position of the +docstring subtree within the tree defining the described structure. + +By replacing the actual docstring with something to signify a variable component +of the tree, we allow a simple pattern matching approach to check any given +subtree for equivalence to the general pattern for docstrings. Since the +example demonstrates information extraction, we can safely require that the tree +be in tuple form rather than list form, allowing a simple variable +representation to be ``['variable_name']``. A simple recursive function can +implement the pattern matching, returning a Boolean and a dictionary of variable +name to value mappings. (See file :file:`example.py`.) :: + + from types import ListType, TupleType + + def match(pattern, data, vars=None): + if vars is None: + vars = {} + if type(pattern) is ListType: + vars[pattern[0]] = data + return 1, vars + if type(pattern) is not TupleType: + return (pattern == data), vars + if len(data) != len(pattern): + return 0, vars + for pattern, data in map(None, pattern, data): + same, vars = match(pattern, data, vars) + if not same: + break + return same, vars + +Using this simple representation for syntactic variables and the symbolic node +types, the pattern for the candidate docstring subtrees becomes fairly readable. +(See file :file:`example.py`.) 
:: + + import symbol + import token + + DOCSTRING_STMT_PATTERN = ( + symbol.stmt, + (symbol.simple_stmt, + (symbol.small_stmt, + (symbol.expr_stmt, + (symbol.testlist, + (symbol.test, + (symbol.and_test, + (symbol.not_test, + (symbol.comparison, + (symbol.expr, + (symbol.xor_expr, + (symbol.and_expr, + (symbol.shift_expr, + (symbol.arith_expr, + (symbol.term, + (symbol.factor, + (symbol.power, + (symbol.atom, + (token.STRING, ['docstring']) + )))))))))))))))), + (token.NEWLINE, '') + )) + +Using the :func:`match` function with this pattern, extracting the module +docstring from the parse tree created previously is easy:: + + >>> found, vars = match(DOCSTRING_STMT_PATTERN, tup[1]) + >>> found + 1 + >>> vars + {'docstring': '"""Some documentation.\n"""'} + +Once specific data can be extracted from a location where it is expected, the +question of where information can be expected needs to be answered. When +dealing with docstrings, the answer is fairly simple: the docstring is the first +:const:`stmt` node in a code block (:const:`file_input` or :const:`suite` node +types). A module consists of a single :const:`file_input` node, and class and +function definitions each contain exactly one :const:`suite` node. Classes and +functions are readily identified as subtrees of code block nodes which start +with ``(stmt, (compound_stmt, (classdef, ...`` or ``(stmt, (compound_stmt, +(funcdef, ...``. Note that these subtrees cannot be matched by :func:`match` +since it does not support multiple sibling nodes to match without regard to +number. A more elaborate matching function could be used to overcome this +limitation, but this is sufficient for the example. 
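The pattern-matching idea described above can be tried without a parse tree at all. The following is a minimal sketch for current Python versions, not the code from :file:`example.py`: it uses ``isinstance`` and ``zip`` in place of the ``types`` constants and ``map(None, ...)``, returns real Booleans, and rejects non-tuple data explicitly. The node-type numbers and the names ``PATTERN`` and ``TREE`` are made up for illustration; real code would use the constants from the :mod:`symbol` and :mod:`token` modules.

```python
def match(pattern, data, bindings=None):
    """Match a nested-tuple pattern against nested-tuple data.

    A one-element list such as ['name'] in the pattern is a variable: it
    matches any subtree and records that subtree under the given name.
    """
    if bindings is None:
        bindings = {}
    if isinstance(pattern, list):
        # Variable component: accept whatever is there and remember it.
        bindings[pattern[0]] = data
        return True, bindings
    if not isinstance(pattern, tuple):
        # Leaf value: must compare equal.
        return pattern == data, bindings
    if not isinstance(data, tuple) or len(data) != len(pattern):
        return False, bindings
    for sub_pattern, sub_data in zip(pattern, data):
        same, bindings = match(sub_pattern, sub_data, bindings)
        if not same:
            return False, bindings
    return True, bindings


# Hypothetical node types standing in for symbol/token constants.
STMT, EXPR, STRING, NEWLINE = 1, 2, 3, 4

PATTERN = (STMT, (EXPR, (STRING, ['docstring'])), (NEWLINE, ''))
TREE = (STMT, (EXPR, (STRING, '"""Some documentation."""')), (NEWLINE, ''))

found, bindings = match(PATTERN, TREE)
```

Because every tuple in the pattern must have the same length as the corresponding tuple in the data, this sketch shares the limitation noted above: it cannot match a node with an arbitrary number of siblings.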
+ +Given the ability to determine whether a statement might be a docstring and +extract the actual string from the statement, some work needs to be performed to +walk the parse tree for an entire module and extract information about the names +defined in each context of the module and associate any docstrings with the +names. The code to perform this work is not complicated, but bears some +explanation. + +The public interface to the classes is straightforward and should probably be +somewhat more flexible. Each "major" block of the module is described by an +object providing several methods for inquiry and a constructor which accepts at +least the subtree of the complete parse tree which it represents. The +:class:`ModuleInfo` constructor accepts an optional *name* parameter since it +cannot otherwise determine the name of the module. + +The public classes include :class:`ClassInfo`, :class:`FunctionInfo`, and +:class:`ModuleInfo`. All objects provide the methods :meth:`get_name`, +:meth:`get_docstring`, :meth:`get_class_names`, and :meth:`get_class_info`. The +:class:`ClassInfo` objects support :meth:`get_method_names` and +:meth:`get_method_info` while the other classes provide +:meth:`get_function_names` and :meth:`get_function_info`. + +Within each of the forms of code block that the public classes represent, most +of the required information is in the same form and is accessed in the same way, +with classes having the distinction that functions defined at the top level are +referred to as "methods." Since the difference in nomenclature reflects a real +semantic distinction from functions defined outside of a class, the +implementation needs to maintain the distinction. Hence, most of the +functionality of the public classes can be implemented in a common base class, +:class:`SuiteInfoBase`, with the accessors for function and method information +provided elsewhere. 
Note that there is only one class which represents function +and method information; this parallels the use of the :keyword:`def` statement +to define both types of elements. + +Most of the accessor functions are declared in :class:`SuiteInfoBase` and do not +need to be overridden by subclasses. More importantly, the extraction of most +information from a parse tree is handled through a method called by the +:class:`SuiteInfoBase` constructor. The example code for most of the classes is +clear when read alongside the formal grammar, but the method which recursively +creates new information objects requires further examination. Here is the +relevant part of the :class:`SuiteInfoBase` definition from :file:`example.py`:: + + class SuiteInfoBase: + _docstring = '' + _name = '' + + def __init__(self, tree = None): + self._class_info = {} + self._function_info = {} + if tree: + self._extract_info(tree) + + def _extract_info(self, tree): + # extract docstring + if len(tree) == 2: + found, vars = match(DOCSTRING_STMT_PATTERN[1], tree[1]) + else: + found, vars = match(DOCSTRING_STMT_PATTERN, tree[3]) + if found: + self._docstring = eval(vars['docstring']) + # discover inner definitions + for node in tree[1:]: + found, vars = match(COMPOUND_STMT_PATTERN, node) + if found: + cstmt = vars['compound'] + if cstmt[0] == symbol.funcdef: + name = cstmt[2][1] + self._function_info[name] = FunctionInfo(cstmt) + elif cstmt[0] == symbol.classdef: + name = cstmt[2][1] + self._class_info[name] = ClassInfo(cstmt) + +After initializing some internal state, the constructor calls the +:meth:`_extract_info` method. This method performs the bulk of the information +extraction which takes place in the entire example. The extraction has two +distinct phases: the location of the docstring for the parse tree passed in, and +the discovery of additional definitions within the code block represented by the +parse tree. 
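As an aside, the same two phases (find the docstring of a block, then discover the definitions directly inside it) can be sketched with the standard :mod:`ast` module, which is a different technique from the :mod:`parser`-based tuples used in this example. The function names ``extract_info`` and ``_describe`` and the dictionary layout are assumptions chosen for illustration; they are not part of :file:`example.py`.

```python
import ast


def extract_info(source, name='<module>'):
    """Return a nested dict of docstrings and top-level definitions."""
    return _describe(ast.parse(source), name)


def _describe(node, name):
    info = {
        'name': name,
        # Phase 1: the docstring is the first statement of the block, if any.
        'docstring': ast.get_docstring(node, clean=False),
        'functions': {},
        'classes': {},
    }
    # Phase 2: only direct children of the block are considered, so a
    # definition nested inside an ``if`` statement is not discovered.
    for child in node.body:
        if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
            info['functions'][child.name] = _describe(child, child.name)
        elif isinstance(child, ast.ClassDef):
            info['classes'][child.name] = _describe(child, child.name)
    return info


SOURCE = '''"""Module docs."""

def square(x): "Square an argument."; return x ** 2

class Shape:
    "A shape."
    def area(self):
        "Return the area."
        return 0

if True:
    def hidden():
        "Not at the top level."
'''

info = extract_info(SOURCE)
```

Like the approach in this section, the sketch never imports or executes the code being examined. The discussion that follows returns to the :meth:`_extract_info` method shown above.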
+ +The initial :keyword:`if` test determines whether the nested suite is of the +"short form" or the "long form." The short form is used when the code block is +on the same line as the definition of the code block, as in :: + + def square(x): "Square an argument."; return x ** 2 + +while the long form uses an indented block and allows nested definitions:: + + def make_power(exp): + "Make a function that raises an argument to the exponent `exp'." + def raiser(x, y=exp): + return x ** y + return raiser + +When the short form is used, the code block may contain a docstring as the +first, and possibly only, :const:`small_stmt` element. The extraction of such a +docstring is slightly different and requires only a portion of the complete +pattern used in the more common case. As implemented, the docstring will only +be found if there is only one :const:`small_stmt` node in the +:const:`simple_stmt` node. Since most functions and methods which use the short +form do not provide a docstring, this may be considered sufficient. The +extraction of the docstring proceeds using the :func:`match` function as +described above, and the value of the docstring is stored as an attribute of the +:class:`SuiteInfoBase` object. + +After docstring extraction, a simple definition discovery algorithm operates on +the :const:`stmt` nodes of the :const:`suite` node. The special case of the +short form is not tested; since there are no :const:`stmt` nodes in the short +form, the algorithm will silently skip the single :const:`simple_stmt` node and +correctly not discover any nested definitions. + +Each statement in the code block is categorized as a class definition, function +or method definition, or something else. For the definition statements, the +name of the element defined is extracted and a representation object appropriate +to the definition is created with the defining subtree passed as an argument to +the constructor. 
The representation objects are stored in instance variables +and may be retrieved by name using the appropriate accessor methods. + +The public classes provide any accessors required which are more specific than +those provided by the :class:`SuiteInfoBase` class, but the real extraction +algorithm remains common to all forms of code blocks. A high-level function can +be used to extract the complete set of information from a source file. (See +file :file:`example.py`.) :: + + def get_docs(fileName): + import os + import parser + + source = open(fileName).read() + basename = os.path.basename(os.path.splitext(fileName)[0]) + ast = parser.suite(source) + return ModuleInfo(ast.totuple(), basename) + +This provides an easy-to-use interface to the documentation of a module. If +information is required which is not extracted by the code of this example, the +code may be extended at clearly defined points to provide additional +capabilities. + diff --git a/Doc/library/pdb.rst b/Doc/library/pdb.rst new file mode 100644 index 0000000..804dd23 --- /dev/null +++ b/Doc/library/pdb.rst @@ -0,0 +1,409 @@ + +.. _debugger: + +******************* +The Python Debugger +******************* + +.. module:: pdb + :synopsis: The Python debugger for interactive interpreters. + + +.. index:: single: debugging + +The module :mod:`pdb` defines an interactive source code debugger for Python +programs. It supports setting (conditional) breakpoints and single stepping at +the source line level, inspection of stack frames, source code listing, and +evaluation of arbitrary Python code in the context of any stack frame. It also +supports post-mortem debugging and can be called under program control. + +.. index:: + single: Pdb (class in pdb) + module: bdb + module: cmd + +The debugger is extensible --- it is actually defined as the class :class:`Pdb`. +This is currently undocumented but easily understood by reading the source. 
The +extension interface uses the modules :mod:`bdb` (undocumented) and :mod:`cmd`. + +The debugger's prompt is ``(Pdb)``. Typical usage to run a program under control +of the debugger is:: + + >>> import pdb + >>> import mymodule + >>> pdb.run('mymodule.test()') + > <string>(0)?() + (Pdb) continue + > <string>(1)?() + (Pdb) continue + NameError: 'spam' + > <string>(1)?() + (Pdb) + +:file:`pdb.py` can also be invoked as a script to debug other scripts. For +example:: + + python -m pdb myscript.py + +When invoked as a script, pdb will automatically enter post-mortem debugging if +the program being debugged exits abnormally. After post-mortem debugging (or +after normal exit of the program), pdb will restart the program. Automatic +restarting preserves pdb's state (such as breakpoints) and in most cases is more +useful than quitting the debugger upon program's exit. + +.. versionadded:: 2.4 + Restarting post-mortem behavior added. + +Typical usage to inspect a crashed program is:: + + >>> import pdb + >>> import mymodule + >>> mymodule.test() + Traceback (most recent call last): + File "<stdin>", line 1, in ? + File "./mymodule.py", line 4, in test + test2() + File "./mymodule.py", line 3, in test2 + print spam + NameError: spam + >>> pdb.pm() + > ./mymodule.py(3)test2() + -> print spam + (Pdb) + +The module defines the following functions; each enters the debugger in a +slightly different way: + + +.. function:: run(statement[, globals[, locals]]) + + Execute the *statement* (given as a string) under debugger control. The + debugger prompt appears before any code is executed; you can set breakpoints and + type ``continue``, or you can step through the statement using ``step`` or + ``next`` (all these commands are explained below). The optional *globals* and + *locals* arguments specify the environment in which the code is executed; by + default the dictionary of the module :mod:`__main__` is used. 
(See the + explanation of the built-in :func:`exec` or :func:`eval` functions.) + + +.. function:: runeval(expression[, globals[, locals]]) + + Evaluate the *expression* (given as a string) under debugger control. When + :func:`runeval` returns, it returns the value of the expression. Otherwise this + function is similar to :func:`run`. + + +.. function:: runcall(function[, argument, ...]) + + Call the *function* (a function or method object, not a string) with the given + arguments. When :func:`runcall` returns, it returns whatever the function call + returned. The debugger prompt appears as soon as the function is entered. + + +.. function:: set_trace() + + Enter the debugger at the calling stack frame. This is useful to hard-code a + breakpoint at a given point in a program, even if the code is not otherwise + being debugged (e.g. when an assertion fails). + + +.. function:: post_mortem(traceback) + + Enter post-mortem debugging of the given *traceback* object. + + +.. function:: pm() + + Enter post-mortem debugging of the traceback found in ``sys.last_traceback``. + + +.. _debugger-commands: + +Debugger Commands +================= + +The debugger recognizes the following commands. Most commands can be +abbreviated to one or two letters; e.g. ``h(elp)`` means that either ``h`` or +``help`` can be used to enter the help command (but not ``he`` or ``hel``, nor +``H`` or ``Help`` or ``HELP``). Arguments to commands must be separated by +whitespace (spaces or tabs). Optional arguments are enclosed in square brackets +(``[]``) in the command syntax; the square brackets must not be typed. +Alternatives in the command syntax are separated by a vertical bar (``|``). + +Entering a blank line repeats the last command entered. Exception: if the last +command was a ``list`` command, the next 11 lines are listed. + +Commands that the debugger doesn't recognize are assumed to be Python statements +and are executed in the context of the program being debugged. 
Python +statements can also be prefixed with an exclamation point (``!``). This is a +powerful way to inspect the program being debugged; it is even possible to +change a variable or call a function. When an exception occurs in such a +statement, the exception name is printed but the debugger's state is not +changed. + +Multiple commands may be entered on a single line, separated by ``;;``. (A +single ``;`` is not used as it is the separator for multiple commands in a line +that is passed to the Python parser.) No intelligence is applied to separating +the commands; the input is split at the first ``;;`` pair, even if it is in the +middle of a quoted string. + +The debugger supports aliases. Aliases can have parameters which allows one a +certain level of adaptability to the context under examination. + +.. index:: + pair: .pdbrc; file + triple: debugger; configuration; file + +If a file :file:`.pdbrc` exists in the user's home directory or in the current +directory, it is read in and executed as if it had been typed at the debugger +prompt. This is particularly useful for aliases. If both files exist, the one +in the home directory is read first and aliases defined there can be overridden +by the local file. + +h(elp) [*command*] + Without argument, print the list of available commands. With a *command* as + argument, print help about that command. ``help pdb`` displays the full + documentation file; if the environment variable :envvar:`PAGER` is defined, the + file is piped through that command instead. Since the *command* argument must + be an identifier, ``help exec`` must be entered to get help on the ``!`` + command. + +w(here) + Print a stack trace, with the most recent frame at the bottom. An arrow + indicates the current frame, which determines the context of most commands. + +d(own) + Move the current frame one level down in the stack trace (to a newer frame). + +u(p) + Move the current frame one level up in the stack trace (to an older frame). 
+ +b(reak) [[*filename*:]*lineno*``|``*function*[, *condition*]] + With a *lineno* argument, set a break there in the current file. With a + *function* argument, set a break at the first executable statement within that + function. The line number may be prefixed with a filename and a colon, to + specify a breakpoint in another file (probably one that hasn't been loaded yet). + The file is searched on ``sys.path``. Note that each breakpoint is assigned a + number to which all the other breakpoint commands refer. + + If a second argument is present, it is an expression which must evaluate to true + before the breakpoint is honored. + + Without argument, list all breaks, including for each breakpoint, the number of + times that breakpoint has been hit, the current ignore count, and the associated + condition if any. + +tbreak [[*filename*:]*lineno*``|``*function*[, *condition*]] + Temporary breakpoint, which is removed automatically when it is first hit. The + arguments are the same as break. + +cl(ear) [*bpnumber* [*bpnumber ...*]] + With a space separated list of breakpoint numbers, clear those breakpoints. + Without argument, clear all breaks (but first ask confirmation). + +disable [*bpnumber* [*bpnumber ...*]] + Disables the breakpoints given as a space separated list of breakpoint numbers. + Disabling a breakpoint means it cannot cause the program to stop execution, but + unlike clearing a breakpoint, it remains in the list of breakpoints and can be + (re-)enabled. + +enable [*bpnumber* [*bpnumber ...*]] + Enables the breakpoints specified. + +ignore *bpnumber* [*count*] + Sets the ignore count for the given breakpoint number. If count is omitted, the + ignore count is set to 0. A breakpoint becomes active when the ignore count is + zero. When non-zero, the count is decremented each time the breakpoint is + reached and the breakpoint is not disabled and any associated condition + evaluates to true. 
+ +condition *bpnumber* [*condition*] + Condition is an expression which must evaluate to true before the breakpoint is + honored. If condition is absent, any existing condition is removed; i.e., the + breakpoint is made unconditional. + +commands [*bpnumber*] + Specify a list of commands for breakpoint number *bpnumber*. The commands + themselves appear on the following lines. Type a line containing just 'end' to + terminate the commands. An example:: + + (Pdb) commands 1 + (com) print some_variable + (com) end + (Pdb) + + To remove all commands from a breakpoint, type commands and follow it + immediately with end; that is, give no commands. + + With no *bpnumber* argument, commands refers to the last breakpoint set. + + You can use breakpoint commands to start your program up again. Simply use the + continue command, or step, or any other command that resumes execution. + + Specifying any command resuming execution (currently continue, step, next, + return, jump, quit and their abbreviations) terminates the command list (as if + that command was immediately followed by end). This is because any time you + resume execution (even with a simple next or step), you may encounter another + breakpoint--which could have its own command list, leading to ambiguities about + which list to execute. + + If you use the 'silent' command in the command list, the usual message about + stopping at a breakpoint is not printed. This may be desirable for breakpoints + that are to print a specific message and then continue. If none of the other + commands print anything, you see no sign that the breakpoint was reached. + + .. versionadded:: 2.5 + +s(tep) + Execute the current line, stop at the first possible occasion (either in a + function that is called or on the next line in the current function). + +n(ext) + Continue execution until the next line in the current function is reached or it + returns.
(The difference between ``next`` and ``step`` is that ``step`` stops + inside a called function, while ``next`` executes called functions at (nearly) + full speed, only stopping at the next line in the current function.) + +r(eturn) + Continue execution until the current function returns. + +c(ont(inue)) + Continue execution, only stop when a breakpoint is encountered. + +j(ump) *lineno* + Set the next line that will be executed. Only available in the bottom-most + frame. This lets you jump back and execute code again, or jump forward to skip + code that you don't want to run. + + It should be noted that not all jumps are allowed --- for instance it is not + possible to jump into the middle of a :keyword:`for` loop or out of a + :keyword:`finally` clause. + +l(ist) [*first*[, *last*]] + List source code for the current file. Without arguments, list 11 lines around + the current line or continue the previous listing. With one argument, list 11 + lines around that line. With two arguments, list the given range; if the + second argument is less than the first, it is interpreted as a count. + +a(rgs) + Print the argument list of the current function. + +p *expression* + Evaluate the *expression* in the current context and print its value. + + .. note:: + + ``print`` can also be used, but is not a debugger command --- this executes the + Python :keyword:`print` statement. + +pp *expression* + Like the ``p`` command, except the value of the expression is pretty-printed + using the :mod:`pprint` module. + +alias [*name* [*command*]] + Creates an alias called *name* that executes *command*. The command must *not* + be enclosed in quotes. Replaceable parameters can be indicated by ``%1``, + ``%2``, and so on, while ``%*`` is replaced by all the parameters. If no + command is given, the current alias for *name* is shown. If no arguments are + given, all aliases are listed. + + Aliases may be nested and can contain anything that can be legally typed at the + pdb prompt.
Note that internal pdb commands *can* be overridden by aliases. + Such a command is then hidden until the alias is removed. Aliasing is + recursively applied to the first word of the command line; all other words in + the line are left alone. + + As an example, here are two useful aliases (especially when placed in the + :file:`.pdbrc` file):: + + #Print instance variables (usage "pi classInst") + alias pi for k in %1.__dict__.keys(): print "%1.",k,"=",%1.__dict__[k] + #Print instance variables in self + alias ps pi self + +unalias *name* + Deletes the specified alias. + +[!]*statement* + Execute the (one-line) *statement* in the context of the current stack frame. + The exclamation point can be omitted unless the first word of the statement + resembles a debugger command. To set a global variable, you can prefix the + assignment command with a ``global`` command on the same line, e.g.:: + + (Pdb) global list_options; list_options = ['-l'] + (Pdb) + +run [*args* ...] + Restart the debugged Python program. If an argument is supplied, it is split + with :mod:`shlex` and the result is used as the new ``sys.argv``. History, breakpoints, + actions and debugger options are preserved. "restart" is an alias for "run". + + .. versionadded:: 2.6 + +q(uit) + Quit from the debugger. The program being executed is aborted. + + +.. _debugger-hooks: + +How It Works +============ + +Some changes were made to the interpreter: + +* ``sys.settrace(func)`` sets the global trace function + +* there can also be a local trace function (see later) + +Trace functions have three arguments: *frame*, *event*, and *arg*. *frame* is +the current stack frame. *event* is a string: ``'call'``, ``'line'``, +``'return'``, ``'exception'``, ``'c_call'``, ``'c_return'``, or +``'c_exception'``. *arg* depends on the event type.
+ +The global trace function is invoked (with *event* set to ``'call'``) whenever a +new local scope is entered; it should return a reference to the local trace +function to be used for that scope, or ``None`` if the scope shouldn't be traced. + +The local trace function should return a reference to itself (or to another +function for further tracing in that scope), or ``None`` to turn off tracing in +that scope. + +Instance methods are accepted (and very useful!) as trace functions. + +The events have the following meaning: + +``'call'`` + A function is called (or some other code block entered). The global trace + function is called; *arg* is ``None``; the return value specifies the local + trace function. + +``'line'`` + The interpreter is about to execute a new line of code (sometimes multiple line + events occur for one line). The local trace function is called; *arg* is + ``None``; the return value specifies the new local trace function. + +``'return'`` + A function (or other code block) is about to return. The local trace function + is called; *arg* is the value that will be returned. The trace function's + return value is ignored. + +``'exception'`` + An exception has occurred. The local trace function is called; *arg* is a + triple ``(exception, value, traceback)``; the return value specifies the new + local trace function. + +``'c_call'`` + A C function is about to be called. This may be an extension function or a + builtin. *arg* is the C function object. + +``'c_return'`` + A C function has returned. *arg* is ``None``. + +``'c_exception'`` + A C function has raised an exception. *arg* is ``None``. + +Note that as an exception is propagated down the chain of callers, an +``'exception'`` event is generated at each level. + +For more information on code and frame objects, refer to :ref:`types`.
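The interplay between the global and the local trace function can be seen in a small experiment. This is a minimal sketch rather than a description of how :mod:`pdb` itself is implemented; the helper name ``record_events`` and the traced function ``add`` are made up for illustration, and only the Python-level events are recorded.

```python
import sys


def record_events(func, *args):
    """Call func(*args) under a trace function; return (result, events)."""
    events = []

    def local_trace(frame, event, arg):
        # Called for 'line', 'return' and 'exception' events in the scope
        # for which the global trace function returned this function.
        events.append(event)
        return local_trace  # keep tracing this scope

    def global_trace(frame, event, arg):
        # Called with event == 'call' whenever a new local scope is entered.
        if frame.f_code is func.__code__:
            events.append(event)
            return local_trace
        return None  # leave every other scope untraced

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        result = func(*args)
    finally:
        # Always restore whatever trace function was active before.
        sys.settrace(previous)
    return result, events


def add(a, b):
    total = a + b
    return total


result, events = record_events(add, 2, 3)
```

The recorded list begins with the ``'call'`` event delivered to the global trace function and ends with the ``'return'`` event delivered to the local one, with ``'line'`` events in between.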
+ diff --git a/Doc/library/persistence.rst b/Doc/library/persistence.rst new file mode 100644 index 0000000..78e40f6 --- /dev/null +++ b/Doc/library/persistence.rst @@ -0,0 +1,32 @@ + +.. _persistence: + +**************** +Data Persistence +**************** + +The modules described in this chapter support storing Python data in a +persistent form on disk. The :mod:`pickle` and :mod:`marshal` modules can turn +many Python data types into a stream of bytes and then recreate the objects from +the bytes. The various DBM-related modules support a family of hash-based file +formats that store a mapping of strings to other strings. The :mod:`bsddb` +module also provides such disk-based string-to-string mappings based on hashing, +and also supports B-Tree and record-based formats. + +The list of modules described in this chapter is: + + +.. toctree:: + + pickle.rst + copy_reg.rst + shelve.rst + marshal.rst + anydbm.rst + whichdb.rst + dbm.rst + gdbm.rst + dbhash.rst + bsddb.rst + dumbdbm.rst + sqlite3.rst diff --git a/Doc/library/pickle.rst b/Doc/library/pickle.rst new file mode 100644 index 0000000..ab19ff8 --- /dev/null +++ b/Doc/library/pickle.rst @@ -0,0 +1,868 @@ + +:mod:`pickle` --- Python object serialization +============================================= + +.. index:: + single: persistence + pair: persistent; objects + pair: serializing; objects + pair: marshalling; objects + pair: flattening; objects + pair: pickling; objects + +.. module:: pickle + :synopsis: Convert Python objects to streams of bytes and back. + + +.. % Substantial improvements by Jim Kerr <jbkerr@sr.hp.com>. +.. % Rewritten by Barry Warsaw <barry@zope.com> + +The :mod:`pickle` module implements a fundamental, but powerful algorithm for +serializing and de-serializing a Python object structure. "Pickling" is the +process whereby a Python object hierarchy is converted into a byte stream, and +"unpickling" is the inverse operation, whereby a byte stream is converted back +into an object hierarchy. 
Pickling (and unpickling) is alternatively known as +"serialization", "marshalling," [#]_ or "flattening"; however, to avoid +confusion, the terms used here are "pickling" and "unpickling". + +This documentation describes both the :mod:`pickle` module and the +:mod:`cPickle` module. + + +Relationship to other Python modules +------------------------------------ + +The :mod:`pickle` module has an optimized cousin called the :mod:`cPickle` +module. As its name implies, :mod:`cPickle` is written in C, so it can be up to +1000 times faster than :mod:`pickle`. However, it does not support subclassing +of the :func:`Pickler` and :func:`Unpickler` classes, because in :mod:`cPickle` +these are functions, not classes. Most applications have no need for this +functionality, and can benefit from the improved performance of :mod:`cPickle`. +Other than that, the interfaces of the two modules are nearly identical; the +common interface is described in this manual and differences are pointed out +where necessary. In the following discussions, we use the term "pickle" to +collectively describe the :mod:`pickle` and :mod:`cPickle` modules. + +The data streams the two modules produce are guaranteed to be interchangeable. + +Python has a more primitive serialization module called :mod:`marshal`, but in +general :mod:`pickle` should always be the preferred way to serialize Python +objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc` +files. + +The :mod:`pickle` module differs from :mod:`marshal` in several significant ways: + +* The :mod:`pickle` module keeps track of the objects it has already serialized, + so that later references to the same object won't be serialized again. + :mod:`marshal` doesn't do this. + + This has implications both for recursive objects and object sharing. Recursive + objects are objects that contain references to themselves.
These are not + handled by marshal, and in fact, attempting to marshal recursive objects will + crash your Python interpreter. Object sharing happens when there are multiple + references to the same object in different places in the object hierarchy being + serialized. :mod:`pickle` stores such objects only once, and ensures that all + other references point to the master copy. Shared objects remain shared, which + can be very important for mutable objects. + +* :mod:`marshal` cannot be used to serialize user-defined classes and their + instances. :mod:`pickle` can save and restore class instances transparently, + however the class definition must be importable and live in the same module as + when the object was stored. + +* The :mod:`marshal` serialization format is not guaranteed to be portable + across Python versions. Because its primary job in life is to support + :file:`.pyc` files, the Python implementers reserve the right to change the + serialization format in non-backwards compatible ways should the need arise. + The :mod:`pickle` serialization format is guaranteed to be backwards compatible + across Python releases. + +.. warning:: + + The :mod:`pickle` module is not intended to be secure against erroneous or + maliciously constructed data. Never unpickle data received from an untrusted or + unauthenticated source. + +Note that serialization is a more primitive notion than persistence; although +:mod:`pickle` reads and writes file objects, it does not handle the issue of +naming persistent objects, nor the (even more complicated) issue of concurrent +access to persistent objects. The :mod:`pickle` module can transform a complex +object into a byte stream and it can transform the byte stream into an object +with the same internal structure. Perhaps the most obvious thing to do with +these byte streams is to write them onto a file, but it is also conceivable to +send them across a network or store them in a database. 
The module +:mod:`shelve` provides a simple interface to pickle and unpickle objects on +DBM-style database files. + + +Data stream format +------------------ + +.. index:: + single: XDR + single: External Data Representation + +The data format used by :mod:`pickle` is Python-specific. This has the +advantage that there are no restrictions imposed by external standards such as +XDR (which can't represent pointer sharing); however it means that non-Python +programs may not be able to reconstruct pickled Python objects. + +By default, the :mod:`pickle` data format uses a printable ASCII representation. +This is slightly more voluminous than a binary representation. The big +advantage of using printable ASCII (and of some other characteristics of +:mod:`pickle`'s representation) is that for debugging or recovery purposes it is +possible for a human to read the pickled file with a standard text editor. + +There are currently 3 different protocols which can be used for pickling. + +* Protocol version 0 is the original ASCII protocol and is backwards compatible + with earlier versions of Python. + +* Protocol version 1 is the old binary format which is also compatible with + earlier versions of Python. + +* Protocol version 2 was introduced in Python 2.3. It provides much more + efficient pickling of new-style classes. + +Refer to :pep:`307` for more information. + +If a *protocol* is not specified, protocol 0 is used. If *protocol* is specified +as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol version +available will be used. + +.. versionchanged:: 2.3 + Introduced the *protocol* parameter. + +A binary format, which is slightly more efficient, can be chosen by specifying a +*protocol* version >= 1. + + +Usage +----- + +To serialize an object hierarchy, you first create a pickler, then you call the +pickler's :meth:`dump` method. To de-serialize a data stream, you first create +an unpickler, then you call the unpickler's :meth:`load` method. 
The +:mod:`pickle` module provides the following constant: + + +.. data:: HIGHEST_PROTOCOL + + The highest protocol version available. This value can be passed as a + *protocol* value. + + .. versionadded:: 2.3 + +.. note:: + + Be sure to always open pickle files created with protocols >= 1 in binary mode. + For the old ASCII-based pickle protocol 0 you can use either text mode or binary + mode as long as you stay consistent. + + A pickle file written with protocol 0 in binary mode will contain lone linefeeds + as line terminators and therefore will look "funny" when viewed in Notepad or + other editors which do not support this format. + +The :mod:`pickle` module provides the following functions to make the pickling +process more convenient: + + +.. function:: dump(obj, file[, protocol]) + + Write a pickled representation of *obj* to the open file object *file*. This is + equivalent to ``Pickler(file, protocol).dump(obj)``. + + If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is + specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol + version will be used. + + .. versionchanged:: 2.3 + Introduced the *protocol* parameter. + + *file* must have a :meth:`write` method that accepts a single string argument. + It can thus be a file object opened for writing, a :mod:`StringIO` object, or + any other custom object that meets this interface. + + +.. function:: load(file) + + Read a string from the open file object *file* and interpret it as a pickle data + stream, reconstructing and returning the original object hierarchy. This is + equivalent to ``Unpickler(file).load()``. + + *file* must have two methods, a :meth:`read` method that takes an integer + argument, and a :meth:`readline` method that requires no arguments. Both + methods should return a string. Thus *file* can be a file object opened for + reading, a :mod:`StringIO` object, or any other custom object that meets this + interface. 
+ + This function automatically determines whether the data stream was written in + binary mode or not. + + +.. function:: dumps(obj[, protocol]) + + Return the pickled representation of the object as a string, instead of writing + it to a file. + + If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is + specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol + version will be used. + + .. versionchanged:: 2.3 + The *protocol* parameter was added. + + +.. function:: loads(string) + + Read a pickled object hierarchy from a string. Characters in the string past + the pickled object's representation are ignored. + +The :mod:`pickle` module also defines three exceptions: + + +.. exception:: PickleError + + A common base class for the other exceptions defined below. This inherits from + :exc:`Exception`. + + +.. exception:: PicklingError + + This exception is raised when an unpicklable object is passed to the + :meth:`dump` method. + + +.. exception:: UnpicklingError + + This exception is raised when there is a problem unpickling an object. Note that + other exceptions may also be raised during unpickling, including (but not + necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`, + :exc:`ImportError`, and :exc:`IndexError`. + +The :mod:`pickle` module also exports two callables [#]_, :class:`Pickler` and +:class:`Unpickler`: + + +.. class:: Pickler(file[, protocol]) + + This takes a file-like object to which it will write a pickle data stream. + + If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is + specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest + protocol version will be used. + + .. versionchanged:: 2.3 + Introduced the *protocol* parameter. + + *file* must have a :meth:`write` method that accepts a single string argument. + It can thus be an open file object, a :mod:`StringIO` object, or any other + custom object that meets this interface. 
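As a rough illustration of the convenience functions and the :class:`Pickler` class described above, the following sketch round-trips a small object graph through :func:`dumps` and :func:`loads`, then writes two records with one pickler and reads them back with one unpickler. It assumes a Python version in which ``io.BytesIO`` is available to play the role of the in-memory file object (the text above names :mod:`StringIO` for that purpose); all variable names are illustrative only.

```python
import io
import pickle

# Two dictionary entries refer to the same mutable list (a shared object).
shared = [1, 2, 3]
data = {"first": shared, "second": shared}

# dumps()/loads(): round trip through an in-memory byte string, protocol 2.
blob = pickle.dumps(data, 2)
restored = pickle.loads(blob)
print(restored == data)                         # equal contents
print(restored["first"] is restored["second"])  # sharing is preserved

# Pickler accepts any object with a write() method; BytesIO stands in
# for a file opened in binary mode (an assumption of this sketch).
buffer = io.BytesIO()
pickler = pickle.Pickler(buffer, 2)
pickler.dump(data)
pickler.dump("second record")

# Rewind, then read the records back in the order they were written,
# one load() call per earlier dump() call.
buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
first_record = unpickler.load()
second_record = unpickler.load()
```

A binary protocol is used here, so the buffer must be treated as binary data throughout, in line with the note about opening pickle files in binary mode.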
+ +:class:`Pickler` objects define one (or two) public methods: + + +.. method:: Pickler.dump(obj) + + Write a pickled representation of *obj* to the open file object given in the + constructor. Either the binary or ASCII format will be used, depending on the + value of the *protocol* argument passed to the constructor. + + +.. method:: Pickler.clear_memo() + + Clears the pickler's "memo". The memo is the data structure that remembers + which objects the pickler has already seen, so that shared or recursive objects + are pickled by reference and not by value. This method is useful when re-using + picklers. + + .. note:: + + Prior to Python 2.3, :meth:`clear_memo` was only available on the picklers + created by :mod:`cPickle`. In the :mod:`pickle` module, picklers have an + instance variable called :attr:`memo` which is a Python dictionary. So to clear + the memo for a :mod:`pickle` module pickler, you could do the following:: + + mypickler.memo.clear() + + Code that does not need to support older versions of Python should simply use + :meth:`clear_memo`. + +It is possible to make multiple calls to the :meth:`dump` method of the same +:class:`Pickler` instance. These must then be matched to the same number of +calls to the :meth:`load` method of the corresponding :class:`Unpickler` +instance. If the same object is pickled by multiple :meth:`dump` calls, the +:meth:`load` calls will all yield references to the same object. [#]_ + +:class:`Unpickler` objects are defined as: + + +.. class:: Unpickler(file) + + This takes a file-like object from which it will read a pickle data stream. + This class automatically determines whether the data stream was written in + binary mode or not, so it does not need a flag as in the :class:`Pickler` + factory. + + *file* must have two methods, a :meth:`read` method that takes an integer + argument, and a :meth:`readline` method that requires no arguments. Both + methods should return a string.
Thus *file* can be a file object opened for + reading, a :mod:`StringIO` object, or any other custom object that meets this + interface. + +:class:`Unpickler` objects have one (or two) public methods: + + +.. method:: Unpickler.load() + + Read a pickled object representation from the open file object given in the + constructor, and return the reconstituted object hierarchy specified therein. + + This method automatically determines whether the data stream was written in + binary mode or not. + + +.. method:: Unpickler.noload() + + This is just like :meth:`load` except that it doesn't actually create any + objects. This is useful primarily for finding what's called "persistent ids" + that may be referenced in a pickle data stream. See section + :ref:`pickle-protocol` below for more details. + + **Note:** the :meth:`noload` method is currently only available on + :class:`Unpickler` objects created with the :mod:`cPickle` module. + :mod:`pickle` module :class:`Unpickler`\ s do not have the :meth:`noload` + method. + + +What can be pickled and unpickled? +---------------------------------- + +The following types can be pickled: + +* ``None``, ``True``, and ``False`` + +* integers, long integers, floating point numbers, complex numbers + +* normal and Unicode strings + +* tuples, lists, sets, and dictionaries containing only picklable objects + +* functions defined at the top level of a module + +* built-in functions defined at the top level of a module + +* classes that are defined at the top level of a module + +* instances of such classes whose :attr:`__dict__` or :meth:`__setstate__` is + picklable (see section :ref:`pickle-protocol` for details) + +Attempts to pickle unpicklable objects will raise the :exc:`PicklingError` +exception; when this happens, an unspecified number of bytes may have already +been written to the underlying file. 
Trying to pickle a highly recursive data +structure may exceed the maximum recursion depth; a :exc:`RuntimeError` will be +raised in this case. You can carefully raise this limit with +:func:`sys.setrecursionlimit`. + +Note that functions (built-in and user-defined) are pickled by "fully qualified" +name reference, not by value. This means that only the function name is +pickled, along with the name of the module the function is defined in. Neither the +function's code, nor any of its function attributes are pickled. Thus the +defining module must be importable in the unpickling environment, and the module +must contain the named object, otherwise an exception will be raised. [#]_ + +Similarly, classes are pickled by named reference, so the same restrictions in +the unpickling environment apply. Note that none of the class's code or data is +pickled, so in the following example the class attribute ``attr`` is not +restored in the unpickling environment:: + + class Foo: + attr = 'a class attr' + + picklestring = pickle.dumps(Foo) + +These restrictions are why picklable functions and classes must be defined in +the top level of a module. + +Similarly, when class instances are pickled, their class's code and data are not +pickled along with them. Only the instance data are pickled. This is done on +purpose, so you can fix bugs in a class or add methods to the class and still +load objects that were created with an earlier version of the class. If you +plan to have long-lived objects that will see many versions of a class, it may +be worthwhile to put a version number in the objects so that suitable +conversions can be made by the class's :meth:`__setstate__` method. + + +.. _pickle-protocol: + +The pickle protocol +------------------- + +This section describes the "pickling protocol" that defines the interface +between the pickler/unpickler and the objects that are being serialized.
This +protocol provides a standard way for you to define, customize, and control how +your objects are serialized and de-serialized. The description in this section +doesn't cover specific customizations that you can employ to make the unpickling +environment slightly safer from untrusted pickle data streams; see section +:ref:`pickle-sub` for more details. + + +.. _pickle-inst: + +Pickling and unpickling normal class instances +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. index:: + single: __getinitargs__() (copy protocol) + single: __init__() (instance constructor) + +When a pickled class instance is unpickled, its :meth:`__init__` method is +normally *not* invoked. If it is desirable that the :meth:`__init__` method be +called on unpickling, an old-style class can define a method +:meth:`__getinitargs__`, which should return a *tuple* containing the arguments +to be passed to the class constructor (:meth:`__init__` for example). The +:meth:`__getinitargs__` method is called at pickle time; the tuple it returns is +incorporated in the pickle for the instance. + +.. index:: single: __getnewargs__() (copy protocol) + +New-style types can provide a :meth:`__getnewargs__` method that is used for +protocol 2. Implementing this method is needed if the type establishes some +internal invariants when the instance is created, or if the memory allocation is +affected by the values passed to the :meth:`__new__` method for the type (as it +is for tuples and strings). Instances of a new-style type :class:`C` are +created using :: + + obj = C.__new__(C, *args) + + +where *args* is the result of calling :meth:`__getnewargs__` on the original +object; if there is no :meth:`__getnewargs__`, an empty tuple is assumed. + +.. 
index:: + single: __getstate__() (copy protocol) + single: __setstate__() (copy protocol) + single: __dict__ (instance attribute) + +Classes can further influence how their instances are pickled; if the class +defines the method :meth:`__getstate__`, it is called and the return state is +pickled as the contents for the instance, instead of the contents of the +instance's dictionary. If there is no :meth:`__getstate__` method, the +instance's :attr:`__dict__` is pickled. + +Upon unpickling, if the class also defines the method :meth:`__setstate__`, it +is called with the unpickled state. [#]_ If there is no :meth:`__setstate__` +method, the pickled state must be a dictionary and its items are assigned to the +new instance's dictionary. If a class defines both :meth:`__getstate__` and +:meth:`__setstate__`, the state object needn't be a dictionary and these methods +can do what they want. [#]_ + +.. warning:: + + For new-style classes, if :meth:`__getstate__` returns a false value, the + :meth:`__setstate__` method will not be called. + + +Pickling and unpickling extension types +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When the :class:`Pickler` encounters an object of a type it knows nothing about +--- such as an extension type --- it looks in two places for a hint of how to +pickle it. One alternative is for the object to implement a :meth:`__reduce__` +method. If provided, at pickling time :meth:`__reduce__` will be called with no +arguments, and it must return either a string or a tuple. + +If a string is returned, it names a global variable whose contents are pickled +as normal. The string returned by :meth:`__reduce__` should be the object's +local name relative to its module; the pickle module searches the module +namespace to determine the object's module. + +When a tuple is returned, it must be between two and five elements long. +Optional elements can either be omitted, or ``None`` can be provided as their +value. 
The semantics of each element are: + +* A callable object that will be called to create the initial version of the + object. The next element of the tuple will provide arguments for this callable, + and later elements provide additional state information that will subsequently + be used to fully reconstruct the pickled data. + + In the unpickling environment this object must be either a class, a callable + registered as a "safe constructor" (see below), or it must have an attribute + :attr:`__safe_for_unpickling__` with a true value. Otherwise, an + :exc:`UnpicklingError` will be raised in the unpickling environment. Note that + as usual, the callable itself is pickled by name. + +* A tuple of arguments for the callable object. + + .. versionchanged:: 2.5 + Formerly, this argument could also be ``None``. + +* Optionally, the object's state, which will be passed to the object's + :meth:`__setstate__` method as described in section :ref:`pickle-inst`. If the + object has no :meth:`__setstate__` method, then, as above, the value must be a + dictionary and it will be added to the object's :attr:`__dict__`. + +* Optionally, an iterator (and not a sequence) yielding successive list items. + These list items will be pickled, and appended to the object using either + ``obj.append(item)`` or ``obj.extend(list_of_items)``. This is primarily used + for list subclasses, but may be used by other classes as long as they have + :meth:`append` and :meth:`extend` methods with the appropriate signature. + (Whether :meth:`append` or :meth:`extend` is used depends on which pickle + protocol version is used as well as the number of items to append, so both must + be supported.) + +* Optionally, an iterator (not a sequence) yielding successive dictionary items, + which should be tuples of the form ``(key, value)``. These items will be + pickled and stored to the object using ``obj[key] = value``. 
This is primarily + used for dictionary subclasses, but may be used by other classes as long as they + implement :meth:`__setitem__`. + +It is sometimes useful to know the protocol version when implementing +:meth:`__reduce__`. This can be done by implementing a method named +:meth:`__reduce_ex__` instead of :meth:`__reduce__`. :meth:`__reduce_ex__`, when +it exists, is called in preference over :meth:`__reduce__` (you may still +provide :meth:`__reduce__` for backwards compatibility). The +:meth:`__reduce_ex__` method will be called with a single integer argument, the +protocol version. + +The :class:`object` class implements both :meth:`__reduce__` and +:meth:`__reduce_ex__`; however, if a subclass overrides :meth:`__reduce__` but +not :meth:`__reduce_ex__`, the :meth:`__reduce_ex__` implementation detects this +and calls :meth:`__reduce__`. + +An alternative to implementing a :meth:`__reduce__` method on the object to be +pickled, is to register the callable with the :mod:`copy_reg` module. This +module provides a way for programs to register "reduction functions" and +constructors for user-defined types. Reduction functions have the same +semantics and interface as the :meth:`__reduce__` method described above, except +that they are called with a single argument, the object to be pickled. + +The registered constructor is deemed a "safe constructor" for purposes of +unpickling as described above. + + +Pickling and unpickling external objects +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For the benefit of object persistence, the :mod:`pickle` module supports the +notion of a reference to an object outside the pickled data stream. Such +objects are referenced by a "persistent id", which is just an arbitrary string +of printable ASCII characters. The resolution of such names is not defined by +the :mod:`pickle` module; it will delegate this resolution to user defined +functions on the pickler and unpickler. 
[#]_ + +To define external persistent id resolution, you need to set the +:attr:`persistent_id` attribute of the pickler object and the +:attr:`persistent_load` attribute of the unpickler object. + +To pickle objects that have an external persistent id, the pickler must have a +custom :func:`persistent_id` method that takes an object as an argument and +returns either ``None`` or the persistent id for that object. When ``None`` is +returned, the pickler simply pickles the object as normal. When a persistent id +string is returned, the pickler will pickle that string, along with a marker so +that the unpickler will recognize the string as a persistent id. + +To unpickle external objects, the unpickler must have a custom +:func:`persistent_load` function that takes a persistent id string and returns +the referenced object. + +Here's a silly example that *might* shed more light:: + + import pickle + from cStringIO import StringIO + + src = StringIO() + p = pickle.Pickler(src) + + def persistent_id(obj): + if hasattr(obj, 'x'): + return 'the value %d' % obj.x + else: + return None + + p.persistent_id = persistent_id + + class Integer: + def __init__(self, x): + self.x = x + def __str__(self): + return 'My name is integer %d' % self.x + + i = Integer(7) + print i + p.dump(i) + + datastream = src.getvalue() + print repr(datastream) + dst = StringIO(datastream) + + up = pickle.Unpickler(dst) + + class FancyInteger(Integer): + def __str__(self): + return 'I am the integer %d' % self.x + + def persistent_load(persid): + if persid.startswith('the value '): + value = int(persid.split()[2]) + return FancyInteger(value) + else: + raise pickle.UnpicklingError, 'Invalid persistent id' + + up.persistent_load = persistent_load + + j = up.load() + print j + +In the :mod:`cPickle` module, the unpickler's :attr:`persistent_load` attribute +can also be set to a Python list, in which case, when the unpickler reaches a +persistent id, the persistent id string will simply be appended to 
this list. +This functionality exists so that a pickle data stream can be "sniffed" for +object references without actually instantiating all the objects in a pickle. +[#]_ Setting :attr:`persistent_load` to a list is usually used in conjunction +with the :meth:`noload` method on the Unpickler. + +.. % BAW: Both pickle and cPickle support something called +.. % inst_persistent_id() which appears to give unknown types a second +.. % shot at producing a persistent id. Since Jim Fulton can't remember +.. % why it was added or what it's for, I'm leaving it undocumented. + + +.. _pickle-sub: + +Subclassing Unpicklers +---------------------- + +By default, unpickling will import any class that it finds in the pickle data. +You can control exactly what gets unpickled and what gets called by customizing +your unpickler. Unfortunately, exactly how you do this is different depending +on whether you're using :mod:`pickle` or :mod:`cPickle`. [#]_ + +In the :mod:`pickle` module, you need to derive a subclass from +:class:`Unpickler`, overriding the :meth:`load_global` method. +:meth:`load_global` should read two lines from the pickle data stream where the +first line will be the name of the module containing the class and the second line +will be the name of the instance's class. It then looks up the class, possibly +importing the module and digging out the attribute, then it appends what it +finds to the unpickler's stack. Later on, this class will be assigned to the +:attr:`__class__` attribute of an empty class, as a way of magically creating an +instance without calling its class's :meth:`__init__`. Your job (should you +choose to accept it) would be to have :meth:`load_global` push onto the +unpickler's stack a known safe version of any class you deem safe to unpickle. +It is up to you to produce such a class. Or you could raise an error if you +want to disallow all unpickling of instances. If this sounds like a hack, +you're right.
Refer to the source code to make this work. + +Things are a little cleaner with :mod:`cPickle`, but not by much. To control +what gets unpickled, you can set the unpickler's :attr:`find_global` attribute +to a function or ``None``. If it is ``None`` then any attempts to unpickle +instances will raise an :exc:`UnpicklingError`. If it is a function, then it +should accept a module name and a class name, and return the corresponding class +object. It is responsible for looking up the class and performing any necessary +imports, and it may raise an error to prevent instances of the class from being +unpickled. + +The moral of the story is that you should be really careful about the source of +the strings your application unpickles. + + +.. _pickle-example: + +Example +------- + +For the simplest code, use the :func:`dump` and :func:`load` functions. Note +that a self-referencing list is pickled and restored correctly. :: + + import pickle + + data1 = {'a': [1, 2.0, 3, 4+6j], + 'b': ('string', u'Unicode string'), + 'c': None} + + selfref_list = [1, 2, 3] + selfref_list.append(selfref_list) + + output = open('data.pkl', 'wb') + + # Pickle dictionary using protocol 0. + pickle.dump(data1, output) + + # Pickle the list using the highest protocol available. + pickle.dump(selfref_list, output, -1) + + output.close() + +The following example reads the resulting pickled data. When reading a +pickle-containing file, you should open the file in binary mode because you +can't be sure if the ASCII or binary format was used. :: + + import pprint, pickle + + pkl_file = open('data.pkl', 'rb') + + data1 = pickle.load(pkl_file) + pprint.pprint(data1) + + data2 = pickle.load(pkl_file) + pprint.pprint(data2) + + pkl_file.close() + +Here's a larger example that shows how to modify pickling behavior for a class. +The :class:`TextReader` class opens a text file, and returns the line number and +line contents each time its :meth:`readline` method is called. 
If a +:class:`TextReader` instance is pickled, all attributes *except* the file object +member are saved. When the instance is unpickled, the file is reopened, and +reading resumes from the last location. The :meth:`__setstate__` and +:meth:`__getstate__` methods are used to implement this behavior. :: + + #!/usr/local/bin/python + + class TextReader: + """Print and number lines in a text file.""" + def __init__(self, file): + self.file = file + self.fh = open(file) + self.lineno = 0 + + def readline(self): + self.lineno = self.lineno + 1 + line = self.fh.readline() + if not line: + return None + if line.endswith("\n"): + line = line[:-1] + return "%d: %s" % (self.lineno, line) + + def __getstate__(self): + odict = self.__dict__.copy() # copy the dict since we change it + del odict['fh'] # remove filehandle entry + return odict + + def __setstate__(self, dict): + fh = open(dict['file']) # reopen file + count = dict['lineno'] # read from file... + while count: # until line count is restored + fh.readline() + count = count - 1 + self.__dict__.update(dict) # update attributes + self.fh = fh # save the file object + +A sample usage might be something like this:: + + >>> import TextReader + >>> obj = TextReader.TextReader("TextReader.py") + >>> obj.readline() + '1: #!/usr/local/bin/python' + >>> obj.readline() + '2: ' + >>> obj.readline() + '3: class TextReader:' + >>> import pickle + >>> pickle.dump(obj, open('save.p', 'wb')) + +If you want to see that :mod:`pickle` works across Python processes, start +another Python session, before continuing. What follows can happen from either +the same process or a new process. :: + + >>> import pickle + >>> reader = pickle.load(open('save.p', 'rb')) + >>> reader.readline() + '4: """Print and number lines in a text file."""' + + +.. seealso:: + + Module :mod:`copy_reg` + Pickle interface constructor registration for extension types. + + Module :mod:`shelve` + Indexed databases of objects; uses :mod:`pickle`. 
+ + Module :mod:`copy` + Shallow and deep object copying. + + Module :mod:`marshal` + High-performance serialization of built-in types. + + +:mod:`cPickle` --- A faster :mod:`pickle` +========================================= + +.. module:: cPickle + :synopsis: Faster version of pickle, but not subclassable. +.. moduleauthor:: Jim Fulton <jim@zope.com> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. index:: module: pickle + +The :mod:`cPickle` module supports serialization and de-serialization of Python +objects, providing an interface and functionality nearly identical to the +:mod:`pickle` module. There are several differences, the most important being +performance and subclassability. + +First, :mod:`cPickle` can be up to 1000 times faster than :mod:`pickle` because +the former is implemented in C. Second, in the :mod:`cPickle` module the +callables :func:`Pickler` and :func:`Unpickler` are functions, not classes. +This means that you cannot use them to derive custom pickling and unpickling +subclasses. Most applications have no need for this functionality and should +benefit from the greatly improved performance of the :mod:`cPickle` module. + +The pickle data streams produced by :mod:`pickle` and :mod:`cPickle` are +identical, so it is possible to use :mod:`pickle` and :mod:`cPickle` +interchangeably with existing pickles. [#]_ + +There are additional minor differences in API between :mod:`cPickle` and +:mod:`pickle`; however, for most applications they are interchangeable. More +documentation is provided in the :mod:`pickle` module documentation, which +includes a list of the documented differences. + +.. rubric:: Footnotes + +.. [#] Don't confuse this with the :mod:`marshal` module. + +.. [#] In the :mod:`pickle` module these callables are classes, which you could + subclass to customize the behavior. However, in the :mod:`cPickle` module these + callables are factory functions and so cannot be subclassed.
One common reason + to subclass is to control what objects can actually be unpickled. See section + :ref:`pickle-sub` for more details. + +.. [#] *Warning*: this is intended for pickling multiple objects without intervening + modifications to the objects or their parts. If you modify an object and then + pickle it again using the same :class:`Pickler` instance, the object is not + pickled again --- a reference to it is pickled and the :class:`Unpickler` will + return the old value, not the modified one. There are two problems here: (1) + detecting changes, and (2) marshalling a minimal set of changes. Garbage + Collection may also become a problem here. + +.. [#] The exception raised will likely be an :exc:`ImportError` or an + :exc:`AttributeError` but it could be something else. + +.. [#] These methods can also be used to implement copying class instances. + +.. [#] This protocol is also used by the shallow and deep copying operations defined in + the :mod:`copy` module. + +.. [#] The actual mechanism for associating these user defined functions is slightly + different for :mod:`pickle` and :mod:`cPickle`. The description given here + works the same for both implementations. Users of the :mod:`pickle` module + could also use subclassing to effect the same results, overriding the + :meth:`persistent_id` and :meth:`persistent_load` methods in the derived + classes. + +.. [#] We'll leave you with the image of Guido and Jim sitting around sniffing pickles + in their living rooms. + +.. [#] A word of caution: the mechanisms described here use internal attributes and + methods, which are subject to change in future versions of Python. We intend to + someday provide a common interface for controlling this behavior, which will + work in either :mod:`pickle` or :mod:`cPickle`. + +.. 
[#] Since the pickle data format is actually a tiny stack-oriented programming + language, and some freedom is taken in the encodings of certain objects, it is + possible that the two modules produce different data streams for the same input + objects. However it is guaranteed that they will always be able to read each + other's data streams. + diff --git a/Doc/library/pickletools.rst b/Doc/library/pickletools.rst new file mode 100644 index 0000000..ec220d9 --- /dev/null +++ b/Doc/library/pickletools.rst @@ -0,0 +1,37 @@ + +:mod:`pickletools` --- Tools for pickle developers. +=================================================== + +.. module:: pickletools + :synopsis: Contains extensive comments about the pickle protocols and pickle-machine + opcodes, as well as some useful functions. + + +.. versionadded:: 2.3 + +This module contains various constants relating to the intimate details of the +:mod:`pickle` module, some lengthy comments about the implementation, and a few +useful functions for analyzing pickled data. The contents of this module are +useful for Python core developers who are working on the :mod:`pickle` and +:mod:`cPickle` implementations; ordinary users of the :mod:`pickle` module +probably won't find the :mod:`pickletools` module relevant. + + +.. function:: dis(pickle[, out=None, memo=None, indentlevel=4]) + + Outputs a symbolic disassembly of the pickle to the file-like object *out*, + defaulting to ``sys.stdout``. *pickle* can be a string or a file-like object. + *memo* can be a Python dictionary that will be used as the pickle's memo; it can + be used to perform disassemblies across multiple pickles created by the same + pickler. Successive levels, indicated by ``MARK`` opcodes in the stream, are + indented by *indentlevel* spaces. + + +.. function:: genops(pickle) + + Provides an iterator over all of the opcodes in a pickle, returning a sequence + of ``(opcode, arg, pos)`` triples. 
*opcode* is an instance of an + :class:`OpcodeInfo` class; *arg* is the decoded value, as a Python object, of + the opcode's argument; *pos* is the position at which this opcode is located. + *pickle* can be a string or a file-like object. + diff --git a/Doc/library/pipes.rst b/Doc/library/pipes.rst new file mode 100644 index 0000000..1f2b2ff --- /dev/null +++ b/Doc/library/pipes.rst @@ -0,0 +1,92 @@ + +:mod:`pipes` --- Interface to shell pipelines +============================================= + +.. module:: pipes + :platform: Unix + :synopsis: A Python interface to Unix shell pipelines. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`pipes` module defines a class to abstract the concept of a *pipeline* +--- a sequence of converters from one file to another. + +Because the module uses :program:`/bin/sh` command lines, a POSIX or compatible +shell for :func:`os.system` and :func:`os.popen` is required. + +The :mod:`pipes` module defines the following class: + + +.. class:: Template() + + An abstraction of a pipeline. + +Example:: + + >>> import pipes + >>> t=pipes.Template() + >>> t.append('tr a-z A-Z', '--') + >>> f=t.open('/tmp/1', 'w') + >>> f.write('hello world') + >>> f.close() + >>> open('/tmp/1').read() + 'HELLO WORLD' + + +.. _template-objects: + +Template Objects +---------------- + +Template objects have the following methods: + + +.. method:: Template.reset() + + Restore a pipeline template to its initial state. + + +.. method:: Template.clone() + + Return a new, equivalent, pipeline template. + + +.. method:: Template.debug(flag) + + If *flag* is true, turn debugging on. Otherwise, turn debugging off. When + debugging is on, commands to be executed are printed, and the shell is given + the ``set -x`` command to be more verbose. + + +.. method:: Template.append(cmd, kind) + + Append a new action at the end. The *cmd* variable must be a valid Bourne shell + command. The *kind* variable consists of two letters. 
+ + The first letter can be either of ``'-'`` (which means the command reads its + standard input), ``'f'`` (which means the command reads a given file on the + command line) or ``'.'`` (which means the command reads no input, and hence + must be first.) + + Similarly, the second letter can be either of ``'-'`` (which means the command + writes to standard output), ``'f'`` (which means the command writes a file on + the command line) or ``'.'`` (which means the command does not write anything, + and hence must be last.) + + +.. method:: Template.prepend(cmd, kind) + + Add a new action at the beginning. See :meth:`append` for explanations of the + arguments. + + +.. method:: Template.open(file, mode) + + Return a file-like object, open to *file*, but read from or written to by the + pipeline. Note that only one of ``'r'``, ``'w'`` may be given. + + +.. method:: Template.copy(infile, outfile) + + Copy *infile* to *outfile* through the pipe. + diff --git a/Doc/library/pkgutil.rst b/Doc/library/pkgutil.rst new file mode 100644 index 0000000..1fbfb04 --- /dev/null +++ b/Doc/library/pkgutil.rst @@ -0,0 +1,43 @@ + +:mod:`pkgutil` --- Package extension utility +============================================ + +.. module:: pkgutil + :synopsis: Utilities to support extension of packages. + + +.. versionadded:: 2.3 + +This module provides a single function: + + +.. function:: extend_path(path, name) + + Extend the search path for the modules which comprise a package. Intended use is + to place the following code in a package's :file:`__init__.py`:: + + from pkgutil import extend_path + __path__ = extend_path(__path__, __name__) + + This will add to the package's ``__path__`` all subdirectories of directories on + ``sys.path`` named after the package. This is useful if one wants to distribute + different parts of a single logical package as multiple directories. + + It also looks for :file:`\*.pkg` files beginning where ``*`` matches the *name* + argument. 
This feature is similar to :file:`\*.pth` files (see the :mod:`site` + module for more information), except that it doesn't special-case lines starting + with ``import``. A :file:`\*.pkg` file is trusted at face value: apart from + checking for duplicates, all entries found in a :file:`\*.pkg` file are added to + the path, regardless of whether they exist on the filesystem. (This is a + feature.) + + If the input path is not a list (as is the case for frozen packages) it is + returned unchanged. The input path is not modified; an extended copy is + returned. Items are only appended to the copy at the end. + + It is assumed that ``sys.path`` is a sequence. Items of ``sys.path`` that are + not (Unicode or 8-bit) strings referring to existing directories are ignored. + Unicode items on ``sys.path`` that cause errors when used as filenames may cause + this function to raise an exception (in line with :func:`os.path.isdir` + behavior). + diff --git a/Doc/library/platform.rst b/Doc/library/platform.rst new file mode 100644 index 0000000..a4570d2 --- /dev/null +++ b/Doc/library/platform.rst @@ -0,0 +1,256 @@ + +:mod:`platform` --- Access to underlying platform's identifying data. +====================================================================== + +.. module:: platform + :synopsis: Retrieves as much platform identifying data as possible. +.. moduleauthor:: Marc-Andre Lemburg <mal@egenix.com> +.. sectionauthor:: Bjorn Pettersen <bpettersen@corp.fairisaac.com> + + +.. versionadded:: 2.3 + +.. note:: + + Specific platforms listed alphabetically, with Linux included in the Unix + section. + + +Cross Platform +-------------- + + +.. function:: architecture(executable=sys.executable, bits='', linkage='') + + Queries the given executable (defaults to the Python interpreter binary) for + various architecture information. + + Returns a tuple ``(bits, linkage)`` which contain information about the bit + architecture and the linkage format used for the executable. 
Both values are + returned as strings. + + Values that cannot be determined are returned as given by the parameter presets. + If bits is given as ``''``, the :cfunc:`sizeof(pointer)` (or + :cfunc:`sizeof(long)` on Python version < 1.5.2) is used as indicator for the + supported pointer size. + + The function relies on the system's :file:`file` command to do the actual work. + This is available on most if not all Unix platforms and some non-Unix platforms + and then only if the executable points to the Python interpreter. Reasonable + defaults are used when the above needs are not met. + + +.. function:: machine() + + Returns the machine type, e.g. ``'i386'``. An empty string is returned if the + value cannot be determined. + + +.. function:: node() + + Returns the computer's network name (may not be fully qualified!). An empty + string is returned if the value cannot be determined. + + +.. function:: platform(aliased=0, terse=0) + + Returns a single string identifying the underlying platform with as much useful + information as possible. + + The output is intended to be *human readable* rather than machine parseable. It + may look different on different platforms and this is intended. + + If *aliased* is true, the function will use aliases for various platforms that + report system names which differ from their common names, for example SunOS will + be reported as Solaris. The :func:`system_alias` function is used to implement + this. + + Setting *terse* to true causes the function to return only the absolute minimum + information needed to identify the platform. + + +.. function:: processor() + + Returns the (real) processor name, e.g. ``'amdk6'``. + + An empty string is returned if the value cannot be determined. Note that many + platforms do not provide this information or simply return the same value as for + :func:`machine`. NetBSD does this. + + +.. 
function:: python_build() + + Returns a tuple ``(buildno, builddate)`` stating the Python build number and + date as strings. + + +.. function:: python_compiler() + + Returns a string identifying the compiler used for compiling Python. + + +.. function:: python_branch() + + Returns a string identifying the Python implementation SCM branch. + + .. versionadded:: 2.6 + + +.. function:: python_implementation() + + Returns a string identifying the Python implementation. Possible return values + are: 'CPython', 'IronPython', 'Jython'. + + .. versionadded:: 2.6 + + +.. function:: python_revision() + + Returns a string identifying the Python implementation SCM revision. + + .. versionadded:: 2.6 + + +.. function:: python_version() + + Returns the Python version as a string ``'major.minor.patchlevel'``. + + Note that unlike the Python ``sys.version``, the returned value will always + include the patchlevel (it defaults to 0). + + +.. function:: python_version_tuple() + + Returns the Python version as a tuple ``(major, minor, patchlevel)`` of strings. + + Note that unlike the Python ``sys.version``, the returned value will always + include the patchlevel (it defaults to ``'0'``). + + +.. function:: release() + + Returns the system's release, e.g. ``'2.2.0'`` or ``'NT'``. An empty string is + returned if the value cannot be determined. + + +.. function:: system() + + Returns the system/OS name, e.g. ``'Linux'``, ``'Windows'``, or ``'Java'``. An + empty string is returned if the value cannot be determined. + + +.. function:: system_alias(system, release, version) + + Returns ``(system, release, version)`` aliased to common marketing names used + for some systems. It also does some reordering of the information in some cases + where it would otherwise cause confusion. + + +.. function:: version() + + Returns the system's release version, e.g. ``'#3 on degas'``. An empty string is + returned if the value cannot be determined. + + +.. 
function:: uname() + + Fairly portable uname interface. Returns a tuple of strings ``(system, node, + release, version, machine, processor)`` identifying the underlying platform. + + Note that unlike the :func:`os.uname` function this also returns possible + processor information as additional tuple entry. + + Entries which cannot be determined are set to ``''``. + + +Java Platform +------------- + + +.. function:: java_ver(release='', vendor='', vminfo=('','',''), osinfo=('','','')) + + Version interface for JPython. + + Returns a tuple ``(release, vendor, vminfo, osinfo)`` with *vminfo* being a + tuple ``(vm_name, vm_release, vm_vendor)`` and *osinfo* being a tuple + ``(os_name, os_version, os_arch)``. Values which cannot be determined are set to + the defaults given as parameters (which all default to ``''``). + + +Windows Platform +---------------- + + +.. function:: win32_ver(release='', version='', csd='', ptype='') + + Get additional version information from the Windows Registry and return a tuple + ``(version, csd, ptype)`` referring to version number, CSD level and OS type + (multi/single processor). + + As a hint: *ptype* is ``'Uniprocessor Free'`` on single processor NT machines + and ``'Multiprocessor Free'`` on multi processor machines. The *'Free'* refers + to the OS version being free of debugging code. It could also state *'Checked'* + which means the OS version uses debugging code, i.e. code that checks arguments, + ranges, etc. + + .. note:: + + This function only works if Mark Hammond's :mod:`win32all` package is installed + and (obviously) only runs on Win32 compatible platforms. + + +Win95/98 specific +^^^^^^^^^^^^^^^^^ + + +.. function:: popen(cmd, mode='r', bufsize=None) + + Portable :func:`popen` interface. Find a working popen implementation + preferring :func:`win32pipe.popen`. On Windows NT, :func:`win32pipe.popen` + should work; on Windows 9x it hangs due to bugs in the MS C library. + + .. 
% This KnowledgeBase article appears to be missing... + .. % See also \ulink{MS KnowledgeBase article Q150956}{}. + + +Mac OS Platform +--------------- + + +.. function:: mac_ver(release='', versioninfo=('','',''), machine='') + + Get Mac OS version information and return it as a tuple ``(release, versioninfo, + machine)`` with *versioninfo* being a tuple ``(version, dev_stage, + non_release_version)``. + + Entries which cannot be determined are set to ``''``. All tuple entries are + strings. + + Documentation for the underlying :cfunc:`gestalt` API is available online at + http://www.rgaros.nl/gestalt/. + + +Unix Platforms +-------------- + + +.. function:: dist(distname='', version='', id='', supported_dists=('SuSE','debian','redhat','mandrake')) + + Tries to determine the name of the OS distribution. Returns a tuple + ``(distname, version, id)`` which defaults to the args given as parameters. + +.. % Document linux_distribution()? + + +.. function:: libc_ver(executable=sys.executable, lib='', version='', chunksize=2048) + + Tries to determine the libc version against which the file executable (defaults + to the Python interpreter) is linked. Returns a tuple of strings ``(lib, + version)`` which default to the given parameters in case the lookup fails. + + Note that this function has intimate knowledge of how different libc versions + add symbols to the executable and is probably only usable for executables compiled + using :program:`gcc`. + + The file is read and scanned in chunks of *chunksize* bytes. + diff --git a/Doc/library/poplib.rst b/Doc/library/poplib.rst new file mode 100644 index 0000000..5716204 --- /dev/null +++ b/Doc/library/poplib.rst @@ -0,0 +1,202 @@ + +:mod:`poplib` --- POP3 protocol client +====================================== + +.. module:: poplib + :synopsis: POP3 protocol client (requires sockets). + + +.. index:: pair: POP3; protocol + +.. % By Andrew T. Csillag +.. % Even though I put it into LaTeX, I cannot really claim that I wrote +.. 
% it since I just stole most of it from the poplib.py source code and +.. % the imaplib ``chapter''. +.. % Revised by ESR, January 2000 + +This module defines a class, :class:`POP3`, which encapsulates a connection to a +POP3 server and implements the protocol as defined in :rfc:`1725`. The +:class:`POP3` class supports both the minimal and optional command sets. +Additionally, this module provides a class :class:`POP3_SSL`, which provides +support for connecting to POP3 servers that use SSL as an underlying protocol +layer. + +Note that POP3, though widely supported, is obsolescent. The implementation +quality of POP3 servers varies widely, and too many are quite poor. If your +mailserver supports IMAP, you would be better off using the +:class:`imaplib.IMAP4` class, as IMAP servers tend to be better implemented. + +A single class is provided by the :mod:`poplib` module: + + +.. class:: POP3(host[, port[, timeout]]) + + This class implements the actual POP3 protocol. The connection is created when + the instance is initialized. If *port* is omitted, the standard POP3 port (110) + is used. The optional *timeout* parameter specifies a timeout in seconds for the + connection attempt (if not specified, or passed as None, the global default + timeout setting will be used). + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. class:: POP3_SSL(host[, port[, keyfile[, certfile]]]) + + This is a subclass of :class:`POP3` that connects to the server over an SSL + encrypted socket. If *port* is not specified, 995, the standard POP3-over-SSL + port is used. *keyfile* and *certfile* are also optional - they can contain a + PEM formatted private key and certificate chain file for the SSL connection. + + .. versionadded:: 2.4 + +One exception is defined as an attribute of the :mod:`poplib` module: + + +.. exception:: error_proto + + Exception raised on any errors from this module (errors from :mod:`socket` + module are not caught). 
The reason for the exception is passed to the + constructor as a string. + + +.. seealso:: + + Module :mod:`imaplib` + The standard Python IMAP module. + + `Frequently Asked Questions About Fetchmail <http://www.catb.org/~esr/fetchmail/fetchmail-FAQ.html>`_ + The FAQ for the :program:`fetchmail` POP/IMAP client collects information on + POP3 server variations and RFC noncompliance that may be useful if you need to + write an application based on the POP protocol. + + +.. _pop3-objects: + +POP3 Objects +------------ + +All POP3 commands are represented by methods of the same name, in lower-case; +most return the response text sent by the server. + +A :class:`POP3` instance has the following methods: + + +.. method:: POP3.set_debuglevel(level) + + Set the instance's debugging level. This controls the amount of debugging + output printed. The default, ``0``, produces no debugging output. A value of + ``1`` produces a moderate amount of debugging output, generally a single line + per request. A value of ``2`` or higher produces the maximum amount of + debugging output, logging each line sent and received on the control connection. + + +.. method:: POP3.getwelcome() + + Returns the greeting string sent by the POP3 server. + + +.. method:: POP3.user(username) + + Send user command; the response should indicate that a password is required. + + +.. method:: POP3.pass_(password) + + Send password; the response includes message count and mailbox size. Note: the + mailbox on the server is locked until :meth:`quit` is called. + + +.. method:: POP3.apop(user, secret) + + Use the more secure APOP authentication to log into the POP3 server. + + +.. method:: POP3.rpop(user) + + Use RPOP authentication (similar to UNIX r-commands) to log into POP3 server. + + +.. method:: POP3.stat() + + Get mailbox status. The result is a tuple of 2 integers: ``(message count, + mailbox size)``. + + +.. 
method:: POP3.list([which]) + + Request message list; the result is in the form ``(response, ['mesg_num octets', + ...], octets)``. If *which* is set, it is the message to list. + + +.. method:: POP3.retr(which) + + Retrieve whole message number *which*, and set its seen flag. Result is in form + ``(response, ['line', ...], octets)``. + + +.. method:: POP3.dele(which) + + Flag message number *which* for deletion. On most servers deletions are not + actually performed until QUIT (the major exception is Eudora QPOP, which + deliberately violates the RFCs by doing pending deletes on any disconnect). + + +.. method:: POP3.rset() + + Remove any deletion marks for the mailbox. + + +.. method:: POP3.noop() + + Do nothing. Might be used as a keep-alive. + + +.. method:: POP3.quit() + + Signoff: commit changes, unlock mailbox, drop connection. + + +.. method:: POP3.top(which, howmuch) + + Retrieves the message header plus *howmuch* lines of the message after the + header of message number *which*. Result is in form ``(response, ['line', ...], + octets)``. + + The POP3 TOP command this method uses, unlike the RETR command, doesn't set the + message's seen flag; unfortunately, TOP is poorly specified in the RFCs and is + frequently broken in off-brand servers. Test this method by hand against the + POP3 servers you will use before trusting it. + + +.. method:: POP3.uidl([which]) + + Return message digest (unique id) list. If *which* is specified, result contains + the unique id for that message in the form ``'response mesgnum uid'``, otherwise + result is list ``(response, ['mesgnum uid', ...], octets)``. + +Instances of :class:`POP3_SSL` have no additional methods. The interface of this +subclass is identical to its parent. + + +.. 
_pop3-example: + +POP3 Example +------------ + +Here is a minimal example (without error checking) that opens a mailbox and +retrieves and prints all messages:: + + import getpass, poplib + + M = poplib.POP3('localhost') + M.user(getpass.getuser()) + M.pass_(getpass.getpass()) + numMessages = len(M.list()[1]) + for i in range(numMessages): + for j in M.retr(i+1)[1]: + print j + +At the end of the module, there is a test section that contains a more extensive +example of usage. + diff --git a/Doc/library/posix.rst b/Doc/library/posix.rst new file mode 100644 index 0000000..07ecb48 --- /dev/null +++ b/Doc/library/posix.rst @@ -0,0 +1,103 @@ + +:mod:`posix` --- The most common POSIX system calls +=================================================== + +.. module:: posix + :platform: Unix + :synopsis: The most common POSIX system calls (normally used via module os). + + +This module provides access to operating system functionality that is +standardized by the C Standard and the POSIX standard (a thinly disguised Unix +interface). + +.. index:: module: os + +**Do not import this module directly.** Instead, import the module :mod:`os`, +which provides a *portable* version of this interface. On Unix, the :mod:`os` +module provides a superset of the :mod:`posix` interface. On non-Unix operating +systems the :mod:`posix` module is not available, but a subset is always +available through the :mod:`os` interface. Once :mod:`os` is imported, there is +*no* performance penalty in using it instead of :mod:`posix`. In addition, +:mod:`os` provides some additional functionality, such as automatically calling +:func:`putenv` when an entry in ``os.environ`` is changed. + +The descriptions below are very terse; refer to the corresponding Unix manual +(or POSIX documentation) entry for more information. Arguments called *path* +refer to a pathname given as a string. 
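A minimal sketch of the advice above, not part of the original documentation: the code goes through :mod:`os` rather than importing :mod:`posix`, and passes *path* as a string. The helper name ``file_size`` and the temporary file are assumptions made for the illustration; :func:`os.stat`, :func:`os.remove` and :mod:`tempfile` are the only library calls used.

```python
import os
import tempfile


def file_size(path):
    """Return the size of *path* in bytes (hypothetical helper).

    os.stat is the portable spelling; on Unix it is backed by the
    posix module, so posix never has to be imported directly.
    """
    return os.stat(path).st_size


# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    tmp_path = tmp.name  # a pathname given as a string

size = file_size(tmp_path)
os.remove(tmp_path)  # clean up the scratch file
```

Because only :mod:`os` is imported, the same sketch should also run on platforms where :mod:`posix` is not available.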
+ +Errors are reported as exceptions; the usual exceptions are given for type +errors, while errors reported by the system calls raise :exc:`error` (a synonym +for the standard exception :exc:`OSError`), described below. + + +.. _posix-large-files: + +Large File Support +------------------ + +.. index:: + single: large files + single: file; large files + +.. sectionauthor:: Steve Clift <clift@mail.anacapa.net> + + +Several operating systems (including AIX, HPUX, Irix and Solaris) provide +support for files that are larger than 2 Gb from a C programming model where +:ctype:`int` and :ctype:`long` are 32-bit values. This is typically accomplished +by defining the relevant size and offset types as 64-bit values. Such files are +sometimes referred to as :dfn:`large files`. + +Large file support is enabled in Python when the size of an :ctype:`off_t` is +larger than a :ctype:`long` and the :ctype:`long long` type is available and is +at least as large as an :ctype:`off_t`. Python longs are then used to represent +file sizes, offsets and other values that can exceed the range of a Python int. +It may be necessary to configure and compile Python with certain compiler flags +to enable this mode. For example, it is enabled by default with recent versions +of Irix, but with Solaris 2.6 and 2.7 you need to do something like:: + + CFLAGS="`getconf LFS_CFLAGS`" OPT="-g -O2 $CFLAGS" \ + ./configure + +On large-file-capable Linux systems, this might work: + +.. % $ <-- bow to font-lock + +:: + + CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" \ + ./configure + +.. % $ <-- bow to font-lock + + +.. _posix-contents: + +Module Contents +--------------- + +Module :mod:`posix` defines the following data item: + + +.. data:: environ + + A dictionary representing the string environment at the time the interpreter was + started. For example, ``environ['HOME']`` is the pathname of your home + directory, equivalent to ``getenv("HOME")`` in C. 
+ + Modifying this dictionary does not affect the string environment passed on by + :func:`execv`, :func:`popen` or :func:`system`; if you need to change the + environment, pass ``environ`` to :func:`execve` or add variable assignments and + export statements to the command string for :func:`system` or :func:`popen`. + + .. note:: + + The :mod:`os` module provides an alternate implementation of ``environ`` which + updates the environment on modification. Note also that updating ``os.environ`` + will render this dictionary obsolete. Use of the :mod:`os` module version of + this is recommended over direct access to the :mod:`posix` module. + +Additional contents of this module should only be accessed via the :mod:`os` +module; refer to the documentation for that module for further information. + diff --git a/Doc/library/pprint.rst b/Doc/library/pprint.rst new file mode 100644 index 0000000..3630176 --- /dev/null +++ b/Doc/library/pprint.rst @@ -0,0 +1,213 @@ + +:mod:`pprint` --- Data pretty printer +===================================== + +.. module:: pprint + :synopsis: Data pretty printer. +.. moduleauthor:: Fred L. Drake, Jr. <fdrake@acm.org> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +The :mod:`pprint` module provides a capability to "pretty-print" arbitrary +Python data structures in a form which can be used as input to the interpreter. +If the formatted structures include objects which are not fundamental Python +types, the representation may not be loadable. This may be the case if objects +such as files, sockets, classes, or instances are included, as well as many +other builtin objects which are not representable as Python constants. + +The formatted representation keeps objects on a single line if it can, and +breaks them onto multiple lines if they don't fit within the allowed width. +Construct :class:`PrettyPrinter` objects explicitly if you need to adjust the +width constraint. + +.. 
versionchanged:: 2.5 + Dictionaries are sorted by key before the display is computed; before 2.5, a + dictionary was sorted only if its display required more than one line, although + that wasn't documented. + +The :mod:`pprint` module defines one class: + +.. % First the implementation class: + + +.. class:: PrettyPrinter(...) + + Construct a :class:`PrettyPrinter` instance. This constructor understands + several keyword parameters. An output stream may be set using the *stream* + keyword; the only method used on the stream object is the file protocol's + :meth:`write` method. If not specified, the :class:`PrettyPrinter` adopts + ``sys.stdout``. Three additional parameters may be used to control the + formatted representation. The keywords are *indent*, *depth*, and *width*. The + amount of indentation added for each recursive level is specified by *indent*; + the default is one. Other values can cause output to look a little odd, but can + make nesting easier to spot. The number of levels which may be printed is + controlled by *depth*; if the data structure being printed is too deep, the next + contained level is replaced by ``...``. By default, there is no constraint on + the depth of the objects being formatted. The desired output width is + constrained using the *width* parameter; the default is 80 characters. If a + structure cannot be formatted within the constrained width, a best effort will + be made. 
:: + + >>> import pprint, sys + >>> stuff = sys.path[:] + >>> stuff.insert(0, stuff[:]) + >>> pp = pprint.PrettyPrinter(indent=4) + >>> pp.pprint(stuff) + [ [ '', + '/usr/local/lib/python1.5', + '/usr/local/lib/python1.5/test', + '/usr/local/lib/python1.5/sunos5', + '/usr/local/lib/python1.5/sharedmodules', + '/usr/local/lib/python1.5/tkinter'], + '', + '/usr/local/lib/python1.5', + '/usr/local/lib/python1.5/test', + '/usr/local/lib/python1.5/sunos5', + '/usr/local/lib/python1.5/sharedmodules', + '/usr/local/lib/python1.5/tkinter'] + >>> + >>> import parser + >>> tup = parser.ast2tuple( + ... parser.suite(open('pprint.py').read()))[1][1][1] + >>> pp = pprint.PrettyPrinter(depth=6) + >>> pp.pprint(tup) + (266, (267, (307, (287, (288, (...)))))) + +The :class:`PrettyPrinter` class supports several derivative functions: + +.. % Now the derivative functions: + + +.. function:: pformat(object[, indent[, width[, depth]]]) + + Return the formatted representation of *object* as a string. *indent*, *width* + and *depth* will be passed to the :class:`PrettyPrinter` constructor as + formatting parameters. + + .. versionchanged:: 2.4 + The parameters *indent*, *width* and *depth* were added. + + +.. function:: pprint(object[, stream[, indent[, width[, depth]]]]) + + Prints the formatted representation of *object* on *stream*, followed by a + newline. If *stream* is omitted, ``sys.stdout`` is used. This may be used in + the interactive interpreter instead of a :keyword:`print` statement for + inspecting values. *indent*, *width* and *depth* will be passed to the + :class:`PrettyPrinter` constructor as formatting parameters. :: + + >>> stuff = sys.path[:] + >>> stuff.insert(0, stuff) + >>> pprint.pprint(stuff) + [<Recursion on list with id=869440>, + '', + '/usr/local/lib/python1.5', + '/usr/local/lib/python1.5/test', + '/usr/local/lib/python1.5/sunos5', + '/usr/local/lib/python1.5/sharedmodules', + '/usr/local/lib/python1.5/tkinter'] + + .. 
versionchanged:: 2.4 + The parameters *indent*, *width* and *depth* were added. + + +.. function:: isreadable(object) + + .. index:: builtin: eval + + Determine if the formatted representation of *object* is "readable," or can be + used to reconstruct the value using :func:`eval`. This always returns ``False`` + for recursive objects. :: + + >>> pprint.isreadable(stuff) + False + + +.. function:: isrecursive(object) + + Determine if *object* requires a recursive representation. + +One more support function is also defined: + + +.. function:: saferepr(object) + + Return a string representation of *object*, protected against recursive data + structures. If the representation of *object* exposes a recursive entry, the + recursive reference will be represented as ``<Recursion on typename with + id=number>``. The representation is not otherwise formatted. + +.. % This example is outside the {funcdesc} to keep it from running over +.. % the right margin. + +:: + + >>> pprint.saferepr(stuff) + "[<Recursion on list with id=682968>, '', '/usr/local/lib/python1.5', '/usr/loca + l/lib/python1.5/test', '/usr/local/lib/python1.5/sunos5', '/usr/local/lib/python + 1.5/sharedmodules', '/usr/local/lib/python1.5/tkinter']" + + +.. _prettyprinter-objects: + +PrettyPrinter Objects +--------------------- + +:class:`PrettyPrinter` instances have the following methods: + + +.. method:: PrettyPrinter.pformat(object) + + Return the formatted representation of *object*. This takes into account the + options passed to the :class:`PrettyPrinter` constructor. + + +.. method:: PrettyPrinter.pprint(object) + + Print the formatted representation of *object* on the configured stream, + followed by a newline. + +The following methods provide the implementations for the corresponding +functions of the same names. Using these methods on an instance is slightly +more efficient since new :class:`PrettyPrinter` objects don't need to be +created. + + +.. method:: PrettyPrinter.isreadable(object) + + .. 
index:: builtin: eval + + Determine if the formatted representation of the object is "readable," or can be + used to reconstruct the value using :func:`eval`. Note that this returns + ``False`` for recursive objects. If the *depth* parameter of the + :class:`PrettyPrinter` is set and the object is deeper than allowed, this + returns ``False``. + + +.. method:: PrettyPrinter.isrecursive(object) + + Determine if the object requires a recursive representation. + +This method is provided as a hook to allow subclasses to modify the way objects +are converted to strings. The default implementation uses the internals of the +:func:`saferepr` implementation. + + +.. method:: PrettyPrinter.format(object, context, maxlevels, level) + + Returns three values: the formatted version of *object* as a string, a flag + indicating whether the result is readable, and a flag indicating whether + recursion was detected. The first argument is the object to be presented. The + second is a dictionary which contains the :func:`id` of objects that are part of + the current presentation context (direct and indirect containers for *object* + that are affecting the presentation) as the keys; if an object needs to be + presented which is already represented in *context*, the third return value + should be ``True``. Recursive calls to the :meth:`format` method should add + additional entries for containers to this dictionary. The third argument, + *maxlevels*, gives the requested limit to recursion; this will be ``0`` if there + is no requested limit. This argument should be passed unmodified to recursive + calls. The fourth argument, *level*, gives the current level; recursive calls + should be passed a value less than that of the current call. + + .. versionadded:: 2.3 + diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst new file mode 100644 index 0000000..2ab24c5 --- /dev/null +++ b/Doc/library/profile.rst @@ -0,0 +1,682 @@ + +.. 
_profile: + +******************** +The Python Profilers +******************** + +.. sectionauthor:: James Roskind + + +.. index:: single: InfoSeek Corporation + +Copyright © 1994, by InfoSeek Corporation, all rights reserved. + +Written by James Roskind. [#]_ + +Permission to use, copy, modify, and distribute this Python software and its +associated documentation for any purpose (subject to the restriction in the +following sentence) without fee is hereby granted, provided that the above +copyright notice appears in all copies, and that both that copyright notice and +this permission notice appear in supporting documentation, and that the name of +InfoSeek not be used in advertising or publicity pertaining to distribution of +the software without specific, written prior permission. This permission is +explicitly restricted to the copying and modification of the software to remain +in Python, compiled Python, or other languages (such as C) wherein the modified +or derived code is exclusively imported into a Python module. + +INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, +INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT +SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL +DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING +OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +The profiler was written after only programming in Python for 3 weeks. As a +result, it is probably clumsy code, but I don't know for sure yet 'cause I'm a +beginner :-). I did work hard to make the code run fast, so that profiling +would be a reasonable thing to do. I tried not to repeat code fragments, but +I'm sure I did some stuff in really awkward ways at times. Please send +suggestions for improvements to: jar@netscape.com. I won't promise *any* +support. 
...but I'd appreciate the feedback. + + +.. _profiler-introduction: + +Introduction to the profilers +============================= + +.. index:: + single: deterministic profiling + single: profiling, deterministic + +A :dfn:`profiler` is a program that describes the run time performance of a +program, providing a variety of statistics. This documentation describes the +profiler functionality provided in the modules :mod:`profile` and :mod:`pstats`. +This profiler provides :dfn:`deterministic profiling` of any Python program. +It also provides a series of report generation tools to allow users to rapidly +examine the results of a profile operation. + +The Python standard library provides three different profilers: + +#. :mod:`profile`, a pure Python module, described in the sequel. Copyright © + 1994, by InfoSeek Corporation. + + .. versionchanged:: 2.4 + also reports the time spent in calls to built-in functions and methods. + +#. :mod:`cProfile`, a module written in C, with a reasonable overhead that makes + it suitable for profiling long-running programs. Based on :mod:`lsprof`, + contributed by Brett Rosen and Ted Czotter. + + .. versionadded:: 2.5 + +#. :mod:`hotshot`, a C module focusing on minimizing the overhead while + profiling, at the expense of long data post-processing times. + + .. versionchanged:: 2.5 + the results should be more meaningful than in the past: the timing core + contained a critical bug. + +The :mod:`profile` and :mod:`cProfile` modules export the same interface, so +they are mostly interchangeable; :mod:`cProfile` has a much lower overhead but +is not yet as well tested and might not be available on all systems. +:mod:`cProfile` is really a compatibility layer on top of the internal +:mod:`_lsprof` module. The :mod:`hotshot` module is reserved for specialized +uses. + +.. % \section{How Is This Profiler Different From The Old Profiler?} +.. % \nodename{Profiler Changes} +.. % +.. 
% (This section is of historical importance only; the old profiler +.. % discussed here was last seen in Python 1.1.) +.. % +.. % The big changes from old profiling module are that you get more +.. % information, and you pay less CPU time. It's not a trade-off, it's a +.. % trade-up. +.. % +.. % To be specific: +.. % +.. % \begin{description} +.. % +.. % \item[Bugs removed:] +.. % Local stack frame is no longer molested, execution time is now charged +.. % to correct functions. +.. % +.. % \item[Accuracy increased:] +.. % Profiler execution time is no longer charged to user's code, +.. % calibration for platform is supported, file reads are not done \emph{by} +.. % profiler \emph{during} profiling (and charged to user's code!). +.. % +.. % \item[Speed increased:] +.. % Overhead CPU cost was reduced by more than a factor of two (perhaps a +.. % factor of five), lightweight profiler module is all that must be +.. % loaded, and the report generating module (\module{pstats}) is not needed +.. % during profiling. +.. % +.. % \item[Recursive functions support:] +.. % Cumulative times in recursive functions are correctly calculated; +.. % recursive entries are counted. +.. % +.. % \item[Large growth in report generating UI:] +.. % Distinct profiles runs can be added together forming a comprehensive +.. % report; functions that import statistics take arbitrary lists of +.. % files; sorting criteria is now based on keywords (instead of 4 integer +.. % options); reports shows what functions were profiled as well as what +.. % profile file was referenced; output format has been improved. +.. % +.. % \end{description} + + +.. _profile-instant: + +Instant User's Manual +===================== + +This section is provided for users that "don't want to read the manual." It +provides a very brief overview, and allows a user to rapidly perform profiling +on an existing application. 
+ +To profile an application with a main entry point of :func:`foo`, you would add +the following to your module:: + + import cProfile + cProfile.run('foo()') + +(Use :mod:`profile` instead of :mod:`cProfile` if the latter is not available on +your system.) + +The above action would cause :func:`foo` to be run, and a series of informative +lines (the profile) to be printed. The above approach is most useful when +working with the interpreter. If you would like to save the results of a +profile into a file for later examination, you can supply a file name as the +second argument to the :func:`run` function:: + + import cProfile + cProfile.run('foo()', 'fooprof') + +The file :file:`cProfile.py` can also be invoked as a script to profile another +script. For example:: + + python -m cProfile myscript.py + +:file:`cProfile.py` accepts two optional arguments on the command line:: + + cProfile.py [-o output_file] [-s sort_order] + +:option:`-s` only applies to standard output (:option:`-o` is not supplied). +Look in the :class:`Stats` documentation for valid sort values. + +When you wish to review the profile, you should use the methods in the +:mod:`pstats` module. Typically you would load the statistics data as follows:: + + import pstats + p = pstats.Stats('fooprof') + +The class :class:`Stats` (the above code just created an instance of this class) +has a variety of methods for manipulating and printing the data that was just +read into ``p``. When you ran :func:`cProfile.run` above, what was printed was +the result of three method calls:: + + p.strip_dirs().sort_stats(-1).print_stats() + +The first method removed the extraneous path from all the module names. The +second method sorted all the entries according to the standard module/line/name +string that is printed. The third method printed out all the statistics. You +might try the following sort calls: + +.. % (this is to comply with the semantics of the old profiler). 
+ +:: + + p.sort_stats('name') + p.print_stats() + +The first call will actually sort the list by function name, and the second call +will print out the statistics. The following are some interesting calls to +experiment with:: + + p.sort_stats('cumulative').print_stats(10) + +This sorts the profile by cumulative time in a function, and then only prints +the ten most significant lines. If you want to understand what algorithms are +taking time, the above line is what you would use. + +If you were looking to see what functions were looping a lot, and taking a lot +of time, you would do:: + + p.sort_stats('time').print_stats(10) + +to sort according to time spent within each function, and then print the +statistics for the top ten functions. + +You might also try:: + + p.sort_stats('file').print_stats('__init__') + +This will sort all the statistics by file name, and then print out statistics +for only the class init methods (since they are spelled with ``__init__`` in +them). As one final example, you could try:: + + p.sort_stats('time', 'cum').print_stats(.5, 'init') + +This line sorts statistics with a primary key of time, and a secondary key of +cumulative time, and then prints out some of the statistics. To be specific, the +list is first culled down to 50% (re: ``.5``) of its original size, then only +lines containing ``init`` are maintained, and that sub-sub-list is printed. + +If you wondered what functions called the above functions, you could now (``p`` +is still sorted according to the last criteria) do:: + + p.print_callers(.5, 'init') + +and you would get a list of callers for each of the listed functions. + +If you want more functionality, you're going to have to read the manual, or +guess what the following functions do:: + + p.print_callees() + p.add('fooprof') + +Invoked as a script, the :mod:`pstats` module is a statistics browser for +reading and examining profile dumps. 
It has a simple line-oriented interface +(implemented using :mod:`cmd`) and interactive help. + + +.. _deterministic-profiling: + +What Is Deterministic Profiling? +================================ + +:dfn:`Deterministic profiling` is meant to reflect the fact that all *function +call*, *function return*, and *exception* events are monitored, and precise +timings are made for the intervals between these events (during which time the +user's code is executing). In contrast, :dfn:`statistical profiling` (which is +not done by this module) randomly samples the effective instruction pointer, and +deduces where time is being spent. The latter technique traditionally involves +less overhead (as the code does not need to be instrumented), but provides only +relative indications of where time is being spent. + +In Python, since there is an interpreter active during execution, the presence +of instrumented code is not required to do deterministic profiling. Python +automatically provides a :dfn:`hook` (optional callback) for each event. In +addition, the interpreted nature of Python tends to add so much overhead to +execution, that deterministic profiling tends to only add small processing +overhead in typical applications. The result is that deterministic profiling is +not that expensive, yet provides extensive run time statistics about the +execution of a Python program. + +Call count statistics can be used to identify bugs in code (surprising counts), +and to identify possible inline-expansion points (high call counts). Internal +time statistics can be used to identify "hot loops" that should be carefully +optimized. Cumulative time statistics should be used to identify high level +errors in the selection of algorithms. Note that the unusual handling of +cumulative times in this profiler allows statistics for recursive +implementations of algorithms to be directly compared to iterative +implementations. 
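The hook mechanism described above can be illustrated with a minimal sketch. It is not how :mod:`profile` or :mod:`cProfile` is implemented. It only shows that the interpreter reports every *function call* event to a callback installed with :func:`sys.setprofile`, with no instrumentation of the profiled code. The helper ``count_calls`` and the sample function ``fib`` are hypothetical names introduced for this illustration, and the code assumes Python 3 syntax.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func with a profiling hook installed and count Python-level calls."""
    calls = Counter()

    def hook(frame, event, arg):
        # The interpreter invokes this callback for each profiling event.
        # 'call' marks entry into a Python function; other events
        # ('return', 'c_call', ...) are ignored in this sketch.
        if event == "call":
            calls[frame.f_code.co_name] += 1

    sys.setprofile(hook)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always uninstall the hook
    return result, calls


def fib(n):
    # Deliberately naive recursion, so that the call count is "surprising".
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, calls = count_calls(fib, 10)
print(result, calls["fib"])
```

A call count far larger than the size of the input, as here, is the kind of surprising figure that points at an algorithmic problem rather than a slow inner loop.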
+ + +Reference Manual -- :mod:`profile` and :mod:`cProfile` +====================================================== + +.. module:: cProfile + :synopsis: Python profiler + + +The primary entry point for the profiler is the global function +:func:`profile.run` (resp. :func:`cProfile.run`). It is typically used to create +any profile information. The reports are formatted and printed using methods of +the class :class:`pstats.Stats`. The following is a description of all of these +standard entry points and functions. For a more in-depth view of some of the +code, consider reading the later section on Profiler Extensions, which includes +discussion of how to derive "better" profilers from the classes presented, or +reading the source code for these modules. + + +.. function:: run(command[, filename]) + + This function takes a single argument that can be passed to the :func:`exec` + function, and an optional file name. In all cases this routine attempts to + :func:`exec` its first argument, and gather profiling statistics from the + execution. If no file name is present, then this function automatically + prints a simple profiling report, sorted by the standard name string + (file/line/function-name) that is presented in each line. The following is a + typical output from such a call:: + + 2706 function calls (2004 primitive calls) in 4.504 CPU seconds + + Ordered by: standard name + + ncalls tottime percall cumtime percall filename:lineno(function) + 2 0.006 0.003 0.953 0.477 pobject.py:75(save_objects) + 43/3 0.533 0.012 0.749 0.250 pobject.py:99(evaluate) + ... + + The first line indicates that 2706 calls were monitored. Of those calls, 2004 + were :dfn:`primitive`. We define :dfn:`primitive` to mean that the call was not + induced via recursion. The next line: ``Ordered by: standard name``, indicates + that the text string in the far right column was used to sort the output. 
The + column headings include: + + ncalls + for the number of calls, + + tottime + for the total time spent in the given function (and excluding time made in calls + to sub-functions), + + percall + is the quotient of ``tottime`` divided by ``ncalls`` + + cumtime + is the total time spent in this and all subfunctions (from invocation till + exit). This figure is accurate *even* for recursive functions. + + percall + is the quotient of ``cumtime`` divided by primitive calls + + filename:lineno(function) + provides the respective data of each function + + When there are two numbers in the first column (for example, ``43/3``), then the + latter is the number of primitive calls, and the former is the actual number of + calls. Note that when the function does not recurse, these two values are the + same, and only the single figure is printed. + + +.. function:: runctx(command, globals, locals[, filename]) + + This function is similar to :func:`run`, with added arguments to supply the + globals and locals dictionaries for the *command* string. + +Analysis of the profiler data is done using the :class:`Stats` class. + +.. note:: + + The :class:`Stats` class is defined in the :mod:`pstats` module. + + +.. module:: pstats + :synopsis: Statistics object for use with the profiler. + + +.. class:: Stats(filename[, stream=sys.stdout[, ...]]) + + This class constructor creates an instance of a "statistics object" from a + *filename* (or set of filenames). :class:`Stats` objects are manipulated by + methods, in order to print useful reports. You may specify an alternate output + stream by giving the keyword argument, ``stream``. + + The file selected by the above constructor must have been created by the + corresponding version of :mod:`profile` or :mod:`cProfile`. To be specific, + there is *no* file compatibility guaranteed with future versions of this + profiler, and there is no compatibility with files produced by other profilers. 
+ If several files are provided, all the statistics for identical functions will + be coalesced, so that an overall view of several processes can be considered in + a single report. If additional files need to be combined with data in an + existing :class:`Stats` object, the :meth:`add` method can be used. + + .. % (such as the old system profiler). + + .. versionchanged:: 2.5 + The *stream* parameter was added. + + +.. _profile-stats: + +The :class:`Stats` Class +------------------------ + +:class:`Stats` objects have the following methods: + + +.. method:: Stats.strip_dirs() + + This method for the :class:`Stats` class removes all leading path information + from file names. It is very useful in reducing the size of the printout to fit + within (close to) 80 columns. This method modifies the object, and the stripped + information is lost. After performing a strip operation, the object is + considered to have its entries in a "random" order, as it was just after object + initialization and loading. If :meth:`strip_dirs` causes two function names to + be indistinguishable (they are on the same line of the same filename, and have + the same function name), then the statistics for these two entries are + accumulated into a single entry. + + +.. method:: Stats.add(filename[, ...]) + + This method of the :class:`Stats` class accumulates additional profiling + information into the current profiling object. Its arguments should refer to + filenames created by the corresponding version of :func:`profile.run` or + :func:`cProfile.run`. Statistics for identically named (re: file, line, name) + functions are automatically accumulated into single function statistics. + + +.. method:: Stats.dump_stats(filename) + + Save the data loaded into the :class:`Stats` object to a file named *filename*. + The file is created if it does not exist, and is overwritten if it already + exists. 
This is equivalent to the method of the same name on the + :class:`profile.Profile` and :class:`cProfile.Profile` classes. + + .. versionadded:: 2.3 + + +.. method:: Stats.sort_stats(key[, ...]) + + This method modifies the :class:`Stats` object by sorting it according to the + supplied criteria. The argument is typically a string identifying the basis of + a sort (example: ``'time'`` or ``'name'``). + + When more than one key is provided, then additional keys are used as secondary + criteria when there is equality in all keys selected before them. For example, + ``sort_stats('name', 'file')`` will sort all the entries according to their + function name, and resolve all ties (identical function names) by sorting by + file name. + + Abbreviations can be used for any key names, as long as the abbreviation is + unambiguous. The following are the keys currently defined: + + +------------------+----------------------+ + | Valid Arg | Meaning | + +==================+======================+ + | ``'calls'`` | call count | + +------------------+----------------------+ + | ``'cumulative'`` | cumulative time | + +------------------+----------------------+ + | ``'file'`` | file name | + +------------------+----------------------+ + | ``'module'`` | file name | + +------------------+----------------------+ + | ``'pcalls'`` | primitive call count | + +------------------+----------------------+ + | ``'line'`` | line number | + +------------------+----------------------+ + | ``'name'`` | function name | + +------------------+----------------------+ + | ``'nfl'`` | name/file/line | + +------------------+----------------------+ + | ``'stdname'`` | standard name | + +------------------+----------------------+ + | ``'time'`` | internal time | + +------------------+----------------------+ + + Note that all sorts on statistics are in descending order (placing most time + consuming items first), whereas name, file, and line number sorts are in + ascending order (alphabetical). 
The subtle distinction between ``'nfl'`` and + ``'stdname'`` is that the standard name is a sort of the name as printed, which + means that the embedded line numbers get compared in an odd way. For example, + lines 3, 20, and 40 would (if the file names were the same) appear in the string + order 20, 3 and 40. In contrast, ``'nfl'`` does a numeric compare of the line + numbers. In fact, ``sort_stats('nfl')`` is the same as ``sort_stats('name', + 'file', 'line')``. + + For backward-compatibility reasons, the numeric arguments ``-1``, ``0``, ``1``, + and ``2`` are permitted. They are interpreted as ``'stdname'``, ``'calls'``, + ``'time'``, and ``'cumulative'`` respectively. If this old style format + (numeric) is used, only one sort key (the numeric key) will be used, and + additional arguments will be silently ignored. + + .. % For compatibility with the old profiler, + + +.. method:: Stats.reverse_order() + + This method for the :class:`Stats` class reverses the ordering of the basic list + within the object. Note that by default ascending vs descending order is + properly selected based on the sort key of choice. + + .. % This method is provided primarily for + .. % compatibility with the old profiler. + + +.. method:: Stats.print_stats([restriction, ...]) + + This method for the :class:`Stats` class prints out a report as described in the + :func:`profile.run` definition. + + The order of the printing is based on the last :meth:`sort_stats` operation done + on the object (subject to caveats in :meth:`add` and :meth:`strip_dirs`). + + The arguments provided (if any) can be used to limit the list down to the + significant entries. Initially, the list is taken to be the complete set of + profiled functions. 
Each restriction is either an integer (to select a count of + lines), or a decimal fraction between 0.0 and 1.0 inclusive (to select a + percentage of lines), or a regular expression (to pattern match the standard + name that is printed; as of Python 1.5b1, this uses the Perl-style regular + expression syntax defined by the :mod:`re` module). If several restrictions are + provided, then they are applied sequentially. For example:: + + print_stats(.1, 'foo:') + + would first limit the printing to the first 10% of the list, and then only print + functions that were part of filename :file:`.\*foo:`. In contrast, the + command:: + + print_stats('foo:', .1) + + would limit the list to all functions having file names :file:`.\*foo:`, and + then proceed to only print the first 10% of them. + + +.. method:: Stats.print_callers([restriction, ...]) + + This method for the :class:`Stats` class prints a list of all functions that + called each function in the profiled database. The ordering is identical to + that provided by :meth:`print_stats`, and the definition of the restricting + argument is also identical. Each caller is reported on its own line. The + format differs slightly depending on the profiler that produced the stats: + + * With :mod:`profile`, a number is shown in parentheses after each caller to + show how many times this specific call was made. For convenience, a second + non-parenthesized number repeats the cumulative time spent in the function + at the right. + + * With :mod:`cProfile`, each caller is preceded by three numbers: the number of + times this specific call was made, and the total and cumulative times spent in + the current function while it was invoked by this specific caller. + + +.. method:: Stats.print_callees([restriction, ...]) + + This method for the :class:`Stats` class prints a list of all functions that were + called by the indicated function. 
Aside from this reversal of direction of + calls (re: called vs was called by), the arguments and ordering are identical to + the :meth:`print_callers` method. + + +.. _profile-limits: + +Limitations +=========== + +One limitation has to do with accuracy of timing information. There is a +fundamental problem with deterministic profilers involving accuracy. The most +obvious restriction is that the underlying "clock" is only ticking at intervals +(typically) of about .001 seconds. Hence no measurements will be more accurate +than the underlying clock. If enough measurements are taken, then the "error" +will tend to average out. Unfortunately, removing this first error induces a +second source of error. + +The second problem is that it "takes a while" from when an event is dispatched +until the profiler's call to get the time actually *gets* the state of the +clock. Similarly, there is a certain lag when exiting the profiler event +handler from the time that the clock's value was obtained (and then squirreled +away), until the user's code is once again executing. As a result, functions +that are called many times, or call many functions, will typically accumulate +this error. The error that accumulates in this fashion is typically less than +the accuracy of the clock (less than one clock tick), but it *can* accumulate +and become very significant. + +The problem is more important with :mod:`profile` than with the lower-overhead +:mod:`cProfile`. For this reason, :mod:`profile` provides a means of +calibrating itself for a given platform so that this error can be +probabilistically (on the average) removed. After the profiler is calibrated, it +will be more accurate (in a least square sense), but it will sometimes produce +negative numbers (when call counts are exceptionally low, and the gods of +probability work against you :-).) Do *not* be alarmed by negative numbers in +the profile. 
They should *only* appear if you have calibrated your profiler, +and the results are actually better than without calibration. + + +.. _profile-calibration: + +Calibration +=========== + +The profiler of the :mod:`profile` module subtracts a constant from each event +handling time to compensate for the overhead of calling the time function, and +socking away the results. By default, the constant is 0. The following +procedure can be used to obtain a better constant for a given platform (see +discussion in section Limitations above). :: + + import profile + pr = profile.Profile() + for i in range(5): + print(pr.calibrate(10000)) + +The :meth:`calibrate` method executes the number of Python calls given by the argument, directly +and again under the profiler, measuring the time for both. It then computes the +hidden overhead per profiler event, and returns that as a float. For example, +on an 800 MHz Pentium running Windows 2000, and using Python's time.clock() as +the timer, the magical number is about 12.5e-6. + +The object of this exercise is to get a fairly consistent result. If your +computer is *very* fast, or your timer function has poor resolution, you might +have to pass 100000, or even 1000000, to get consistent results. + +When you have a consistent answer, there are three ways you can use it: [#]_ :: + + import profile + + # 1. Apply computed bias to all Profile instances created hereafter. + profile.Profile.bias = your_computed_bias + + # 2. Apply computed bias to a specific Profile instance. + pr = profile.Profile() + pr.bias = your_computed_bias + + # 3. Specify computed bias in instance constructor. + pr = profile.Profile(bias=your_computed_bias) + +If you have a choice, you are better off choosing a smaller constant, and then +your results will "less often" show up as negative in profile statistics. + + +.. 
_profiler-extensions: + +Extensions --- Deriving Better Profilers +======================================== + +The :class:`Profile` classes of both modules, :mod:`profile` and :mod:`cProfile`, +were written so that derived classes could be developed to extend the profiler. +The details are not described here, as doing this successfully requires an +expert understanding of how the :class:`Profile` class works internally. Study +the source code of the module carefully if you want to pursue this. + +If all you want to do is change how current time is determined (for example, to +force use of wall-clock time or elapsed process time), pass the timing function +you want to the :class:`Profile` class constructor:: + + pr = profile.Profile(your_time_func) + +The resulting profiler will then call :func:`your_time_func`. + +:class:`profile.Profile` + :func:`your_time_func` should return a single number, or a list of numbers whose + sum is the current time (like what :func:`os.times` returns). If the function + returns a single time number, or the list of returned numbers has length 2, then + you will get an especially fast version of the dispatch routine. + + Be warned that you should calibrate the profiler class for the timer function + that you choose. For most machines, a timer that returns a lone integer value + will provide the best results in terms of low overhead during profiling. + (:func:`os.times` is *pretty* bad, as it returns a tuple of floating point + values). If you want to substitute a better timer in the cleanest fashion, + derive a class and hardwire a replacement dispatch method that best handles your + timer call, along with the appropriate calibration constant. + +:class:`cProfile.Profile` + :func:`your_time_func` should return a single number. If it returns plain + integers, you can also invoke the class constructor with a second argument + specifying the real duration of one unit of time. 
For example, if + :func:`your_integer_time_func` returns times measured in thousandths of a second, + you would construct the :class:`Profile` instance as follows:: + + pr = cProfile.Profile(your_integer_time_func, 0.001) + + As the :class:`cProfile.Profile` class cannot be calibrated, custom timer + functions should be used with care and should be as fast as possible. For the + best results with a custom timer, it might be necessary to hard-code it in the C + source of the internal :mod:`_lsprof` module. + +.. rubric:: Footnotes + +.. [#] Updated and converted to LaTeX by Guido van Rossum. Further updated by Armin + Rigo to integrate the documentation for the new :mod:`cProfile` module of Python + 2.5. + +.. [#] Prior to Python 2.2, it was necessary to edit the profiler source code to embed + the bias as a literal number. You still can, but that method is no longer + described, because it is no longer needed. + diff --git a/Doc/library/pty.rst b/Doc/library/pty.rst new file mode 100644 index 0000000..5e1da22 --- /dev/null +++ b/Doc/library/pty.rst @@ -0,0 +1,48 @@ + +:mod:`pty` --- Pseudo-terminal utilities +======================================== + +.. module:: pty + :platform: IRIX, Linux + :synopsis: Pseudo-Terminal Handling for SGI and Linux. +.. moduleauthor:: Steen Lumholt +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`pty` module defines operations for handling the pseudo-terminal +concept: starting another process and being able to write to and read from its +controlling terminal programmatically. + +Because pseudo-terminal handling is highly platform dependent, there is code to +do it only for SGI and Linux. (The Linux code is supposed to work on other +platforms, but hasn't been tested yet.) + +The :mod:`pty` module defines the following functions: + + +.. function:: fork() + + Fork. Connect the child's controlling terminal to a pseudo-terminal. Return + value is ``(pid, fd)``. Note that the child gets *pid* 0, and the *fd* is + *invalid*. 
The parent's return value is the *pid* of the child, and *fd* is a + file descriptor connected to the child's controlling terminal (and also to the + child's standard input and output). + + +.. function:: openpty() + + Open a new pseudo-terminal pair, using :func:`os.openpty` if possible, or + emulation code for SGI and generic Unix systems. Return a pair of file + descriptors ``(master, slave)``, for the master and the slave end, respectively. + + +.. function:: spawn(argv[, master_read[, stdin_read]]) + + Spawn a process, and connect its controlling terminal with the current + process's standard io. This is often used to baffle programs which insist on + reading from the controlling terminal. + + The functions *master_read* and *stdin_read* should be functions which read from + a file-descriptor. The defaults try to read 1024 bytes each time they are + called. + diff --git a/Doc/library/pwd.rst b/Doc/library/pwd.rst new file mode 100644 index 0000000..562afd9 --- /dev/null +++ b/Doc/library/pwd.rst @@ -0,0 +1,76 @@ + +:mod:`pwd` --- The password database +==================================== + +.. module:: pwd + :platform: Unix + :synopsis: The password database (getpwnam() and friends). + + +This module provides access to the Unix user account and password database. It +is available on all Unix versions. 
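As a rough sketch of typical use, assuming a Unix system and Python 3 syntax, the following looks up accounts by numeric user ID with :func:`getpwuid` and handles the :exc:`KeyError` raised for unknown IDs. The helper name ``describe_user`` is hypothetical and not part of the module.

```python
import pwd


def describe_user(uid):
    """Return a small dict describing the account with this uid, or None."""
    try:
        entry = pwd.getpwuid(uid)
    except KeyError:
        # No entry in the password database for this uid.
        return None
    return {
        "name": entry.pw_name,
        "uid": entry.pw_uid,
        "home": entry.pw_dir,
        "shell": entry.pw_shell,
    }


# Show at most three entries; their order is arbitrary.
for entry in pwd.getpwall()[:3]:
    print(describe_user(entry.pw_uid))
```

Attribute access (``entry.pw_name``) is used here rather than tuple indexing because it keeps the code readable without referring back to the index table.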
+ +Password database entries are reported as a tuple-like object, whose attributes +correspond to the members of the ``passwd`` structure (Attribute field below, +see ``<pwd.h>``): + ++-------+---------------+-----------------------------+ +| Index | Attribute | Meaning | ++=======+===============+=============================+ +| 0 | ``pw_name`` | Login name | ++-------+---------------+-----------------------------+ +| 1 | ``pw_passwd`` | Optional encrypted password | ++-------+---------------+-----------------------------+ +| 2 | ``pw_uid`` | Numerical user ID | ++-------+---------------+-----------------------------+ +| 3 | ``pw_gid`` | Numerical group ID | ++-------+---------------+-----------------------------+ +| 4 | ``pw_gecos`` | User name or comment field | ++-------+---------------+-----------------------------+ +| 5 | ``pw_dir`` | User home directory | ++-------+---------------+-----------------------------+ +| 6 | ``pw_shell`` | User command interpreter | ++-------+---------------+-----------------------------+ + +The uid and gid items are integers, all others are strings. :exc:`KeyError` is +raised if the entry asked for cannot be found. + +.. note:: + + .. index:: module: crypt + + In traditional Unix the field ``pw_passwd`` usually contains a password + encrypted with a DES derived algorithm (see module :mod:`crypt`). However most + modern unices use a so-called *shadow password* system. On those unices the + *pw_passwd* field only contains an asterisk (``'*'``) or the letter ``'x'`` + where the encrypted password is stored in a file :file:`/etc/shadow` which is + not world readable. Whether the *pw_passwd* field contains anything useful is + system-dependent. If available, the :mod:`spwd` module should be used where + access to the encrypted password is required. + +It defines the following items: + + +.. function:: getpwuid(uid) + + Return the password database entry for the given numeric user ID. + + +.. 
function:: getpwnam(name) + + Return the password database entry for the given user name. + + +.. function:: getpwall() + + Return a list of all available password database entries, in arbitrary order. + + +.. seealso:: + + Module :mod:`grp` + An interface to the group database, similar to this. + + Module :mod:`spwd` + An interface to the shadow password database, similar to this. + diff --git a/Doc/library/py_compile.rst b/Doc/library/py_compile.rst new file mode 100644 index 0000000..c815846 --- /dev/null +++ b/Doc/library/py_compile.rst @@ -0,0 +1,55 @@ +:mod:`py_compile` --- Compile Python source files +================================================= + +.. module:: py_compile + :synopsis: Generate byte-code files from Python source files. + +.. % Documentation based on module docstrings, by Fred L. Drake, Jr. +.. % <fdrake@acm.org> + + + +.. index:: pair: file; byte-code + +The :mod:`py_compile` module provides a function to generate a byte-code file +from a source file, and another function used when the module source file is +invoked as a script. + +Though not often needed, this function can be useful when installing modules for +shared use, especially if some of the users may not have permission to write the +byte-code cache files in the directory containing the source code. + + +.. exception:: PyCompileError + + Exception raised when an error occurs while attempting to compile the file. + + +.. function:: compile(file[, cfile[, dfile[, doraise]]]) + + Compile a source file to byte-code and write out the byte-code cache file. The + source code is loaded from the file name *file*. The byte-code is written to + *cfile*, which defaults to *file* ``+`` ``'c'`` (``'o'`` if optimization is + enabled in the current interpreter). If *dfile* is specified, it is used as the + name of the source file in error messages instead of *file*. If *doraise* is + true, a :exc:`PyCompileError` is raised when an error is encountered while + compiling *file*. 
If *doraise* is false (the default), an error string is + written to ``sys.stderr``, but no exception is raised. + + +.. function:: main([args]) + + Compile several source files. The files named in *args* (or on the command + line, if *args* is not specified) are compiled and the resulting bytecode is + cached in the normal manner. This function does not search a directory + structure to locate source files; it only compiles files named explicitly. + +When this module is run as a script, the :func:`main` is used to compile all the +files named on the command line. + + +.. seealso:: + + Module :mod:`compileall` + Utilities to compile all Python source files in a directory tree. + diff --git a/Doc/library/pyclbr.rst b/Doc/library/pyclbr.rst new file mode 100644 index 0000000..5a77b4e --- /dev/null +++ b/Doc/library/pyclbr.rst @@ -0,0 +1,112 @@ + +:mod:`pyclbr` --- Python class browser support +============================================== + +.. module:: pyclbr + :synopsis: Supports information extraction for a Python class browser. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +The :mod:`pyclbr` can be used to determine some limited information about the +classes, methods and top-level functions defined in a module. The information +provided is sufficient to implement a traditional three-pane class browser. The +information is extracted from the source code rather than by importing the +module, so this module is safe to use with untrusted source code. This +restriction makes it impossible to use this module with modules not implemented +in Python, including many standard and optional extension modules. + + +.. function:: readmodule(module[, path]) + + Read a module and return a dictionary mapping class names to class descriptor + objects. The parameter *module* should be the name of a module as a string; it + may be the name of a module within a package. 
The *path* parameter should be a + sequence, and is used to augment the value of ``sys.path``, which is used to + locate module source code. + + .. % The 'inpackage' parameter appears to be for internal use only.... + + +.. function:: readmodule_ex(module[, path]) + + Like :func:`readmodule`, but the returned dictionary, in addition to mapping + class names to class descriptor objects, also maps top-level function names to + function descriptor objects. Moreover, if the module being read is a package, + the key ``'__path__'`` in the returned dictionary has as its value a list which + contains the package search path. + + .. % The 'inpackage' parameter appears to be for internal use only.... + + +.. _pyclbr-class-objects: + +Class Descriptor Objects +------------------------ + +The class descriptor objects used as values in the dictionary returned by +:func:`readmodule` and :func:`readmodule_ex` provide the following data members: + + +.. attribute:: class_descriptor.module + + The name of the module defining the class described by the class descriptor. + + +.. attribute:: class_descriptor.name + + The name of the class. + + +.. attribute:: class_descriptor.super + + A list of class descriptors which describe the immediate base classes of the + class being described. Classes which are named as superclasses but which are + not discoverable by :func:`readmodule` are listed as a string with the class + name instead of class descriptors. + + +.. attribute:: class_descriptor.methods + + A dictionary mapping method names to line numbers. + + +.. attribute:: class_descriptor.file + + Name of the file containing the ``class`` statement defining the class. + + +.. attribute:: class_descriptor.lineno + + The line number of the ``class`` statement within the file named by + :attr:`file`. + + +.. 
_pyclbr-function-objects: + +Function Descriptor Objects +--------------------------- + +The function descriptor objects used as values in the dictionary returned by +:func:`readmodule_ex` provide the following data members: + + +.. attribute:: function_descriptor.module + + The name of the module defining the function described by the function + descriptor. + + +.. attribute:: function_descriptor.name + + The name of the function. + + +.. attribute:: function_descriptor.file + + Name of the file containing the ``def`` statement defining the function. + + +.. attribute:: function_descriptor.lineno + + The line number of the ``def`` statement within the file named by :attr:`file`. + diff --git a/Doc/library/pydoc.rst b/Doc/library/pydoc.rst new file mode 100644 index 0000000..2df127c --- /dev/null +++ b/Doc/library/pydoc.rst @@ -0,0 +1,65 @@ + +:mod:`pydoc` --- Documentation generator and online help system +=============================================================== + +.. module:: pydoc + :synopsis: Documentation generator and online help system. +.. moduleauthor:: Ka-Ping Yee <ping@lfw.org> +.. sectionauthor:: Ka-Ping Yee <ping@lfw.org> + + +.. versionadded:: 2.1 + +.. index:: + single: documentation; generation + single: documentation; online + single: help; online + +The :mod:`pydoc` module automatically generates documentation from Python +modules. The documentation can be presented as pages of text on the console, +served to a Web browser, or saved to HTML files. + +The built-in function :func:`help` invokes the online help system in the +interactive interpreter, which uses :mod:`pydoc` to generate its documentation +as text on the console. The same text documentation can also be viewed from +outside the Python interpreter by running :program:`pydoc` as a script at the +operating system's command prompt. 
For example, running :: + + pydoc sys + +at a shell prompt will display documentation on the :mod:`sys` module, in a +style similar to the manual pages shown by the Unix :program:`man` command. The +argument to :program:`pydoc` can be the name of a function, module, or package, +or a dotted reference to a class, method, or function within a module or module +in a package. If the argument to :program:`pydoc` looks like a path (that is, +it contains the path separator for your operating system, such as a slash in +Unix), and refers to an existing Python source file, then documentation is +produced for that file. + +Specifying a :option:`-w` flag before the argument will cause HTML documentation +to be written out to a file in the current directory, instead of displaying text +on the console. + +Specifying a :option:`-k` flag before the argument will search the synopsis +lines of all available modules for the keyword given as the argument, again in a +manner similar to the Unix :program:`man` command. The synopsis line of a +module is the first line of its documentation string. + +You can also use :program:`pydoc` to start an HTTP server on the local machine +that will serve documentation to visiting Web browsers. :program:`pydoc` +:option:`-p 1234` will start a HTTP server on port 1234, allowing you to browse +the documentation at ``http://localhost:1234/`` in your preferred Web browser. +:program:`pydoc` :option:`-g` will start the server and additionally bring up a +small :mod:`Tkinter`\ -based graphical interface to help you search for +documentation pages. + +When :program:`pydoc` generates documentation, it uses the current environment +and path to locate modules. Thus, invoking :program:`pydoc` :option:`spam` +documents precisely the version of the module you would get if you started the +Python interpreter and typed ``import spam``. + +Module docs for core modules are assumed to reside in +http://www.python.org/doc/current/lib/. 
This can be overridden by setting the +:envvar:`PYTHONDOCS` environment variable to a different URL or to a local +directory containing the Library Reference Manual pages. + diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst new file mode 100644 index 0000000..87ed501 --- /dev/null +++ b/Doc/library/pyexpat.rst @@ -0,0 +1,873 @@ + +:mod:`xml.parsers.expat` --- Fast XML parsing using Expat +========================================================= + +.. module:: xml.parsers.expat + :synopsis: An interface to the Expat non-validating XML parser. +.. moduleauthor:: Paul Prescod <paul@prescod.net> + + +.. % Markup notes: +.. % +.. % Many of the attributes of the XMLParser objects are callbacks. +.. % Since signature information must be presented, these are described +.. % using the methoddesc environment. Since they are attributes which +.. % are set by client code, in-text references to these attributes +.. % should be marked using the \member macro and should not include the +.. % parentheses used when marking functions and methods. + +.. versionadded:: 2.0 + +.. index:: single: Expat + +The :mod:`xml.parsers.expat` module is a Python interface to the Expat +non-validating XML parser. The module provides a single extension type, +:class:`xmlparser`, that represents the current state of an XML parser. After +an :class:`xmlparser` object has been created, various attributes of the object +can be set to handler functions. When an XML document is then fed to the +parser, the handler functions are called for the character data and markup in +the XML document. + +.. index:: module: pyexpat + +This module uses the :mod:`pyexpat` module to provide access to the Expat +parser. Direct use of the :mod:`pyexpat` module is deprecated. + +This module provides one exception and one type object: + + +.. exception:: ExpatError + + The exception raised when Expat reports an error. See section + :ref:`expaterror-objects` for more information on interpreting Expat errors. 
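A minimal sketch of catching :exc:`ExpatError`, assuming a Python 3 interpreter. The helper name ``check_well_formed`` is hypothetical; the ``code``, ``lineno`` and ``offset`` attributes and the :func:`ErrorString` function are the ones documented for this module.

```python
import xml.parsers.expat


def check_well_formed(text):
    """Return None if *text* parses, else a (line, offset, message) tuple.

    check_well_formed is an illustrative helper, not part of the module.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as exc:
        # Translate the numeric error code into an explanatory string.
        message = xml.parsers.expat.ErrorString(exc.code)
        return exc.lineno, exc.offset, message
    return None


print(check_well_formed("<root><child/></root>"))
print(check_well_formed("<root><child></root>"))
```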
+ + +.. exception:: error + + Alias for :exc:`ExpatError`. + + +.. data:: XMLParserType + + The type of the return values from the :func:`ParserCreate` function. + +The :mod:`xml.parsers.expat` module contains two functions: + + +.. function:: ErrorString(errno) + + Returns an explanatory string for a given error number *errno*. + + +.. function:: ParserCreate([encoding[, namespace_separator]]) + + Creates and returns a new :class:`xmlparser` object. *encoding*, if specified, + must be a string naming the encoding used by the XML data. Expat doesn't + support as many encodings as Python does, and its repertoire of encodings can't + be extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII. If + *encoding* is given it will override the implicit or explicit encoding of the + document. + + Expat can optionally do XML namespace processing for you, enabled by providing a + value for *namespace_separator*. The value must be a one-character string; a + :exc:`ValueError` will be raised if the string has an illegal length (``None`` + is considered the same as omission). When namespace processing is enabled, + element type names and attribute names that belong to a namespace will be + expanded. The element name passed to the element handlers + :attr:`StartElementHandler` and :attr:`EndElementHandler` will be the + concatenation of the namespace URI, the namespace separator character, and the + local part of the name. If the namespace separator is a zero byte (``chr(0)``) + then the namespace URI and the local part will be concatenated without any + separator. 
+ + For example, if *namespace_separator* is set to a space character (``' '``) and + the following document is parsed:: + + <?xml version="1.0"?> + <root xmlns = "http://default-namespace.org/" + xmlns:py = "http://www.python.org/ns/"> + <py:elem1 /> + <elem2 xmlns="" /> + </root> + + :attr:`StartElementHandler` will receive the following strings for each + element:: + + http://default-namespace.org/ root + http://www.python.org/ns/ elem1 + elem2 + + +.. seealso:: + + `The Expat XML Parser <http://www.libexpat.org/>`_ + Home page of the Expat project. + + +.. _xmlparser-objects: + +XMLParser Objects +----------------- + +:class:`xmlparser` objects have the following methods: + + +.. method:: xmlparser.Parse(data[, isfinal]) + + Parses the contents of the string *data*, calling the appropriate handler + functions to process the parsed data. *isfinal* must be true on the final call + to this method. *data* can be the empty string at any time. + + +.. method:: xmlparser.ParseFile(file) + + Parse XML data reading from the object *file*. *file* only needs to provide + the ``read(nbytes)`` method, returning the empty string when there's no more + data. + + +.. method:: xmlparser.SetBase(base) + + Sets the base to be used for resolving relative URIs in system identifiers in + declarations. Resolving relative identifiers is left to the application: this + value will be passed through as the *base* argument to the + :func:`ExternalEntityRefHandler`, :func:`NotationDeclHandler`, and + :func:`UnparsedEntityDeclHandler` functions. + + +.. method:: xmlparser.GetBase() + + Returns a string containing the base set by a previous call to :meth:`SetBase`, + or ``None`` if :meth:`SetBase` hasn't been called. + + +.. method:: xmlparser.GetInputContext() + + Returns the input data that generated the current event as a string. The data is + in the encoding of the entity which contains the text. When called while an + event handler is not active, the return value is ``None``. + + .. 
versionadded:: 2.1 + + +.. method:: xmlparser.ExternalEntityParserCreate(context[, encoding]) + + Create a "child" parser which can be used to parse an external parsed entity + referred to by content parsed by the parent parser. The *context* parameter + should be the string passed to the :meth:`ExternalEntityRefHandler` handler + function, described below. The child parser is created with the + :attr:`ordered_attributes` and :attr:`specified_attributes` set to the values of + this parser. + + +.. method:: xmlparser.UseForeignDTD([flag]) + + Calling this with a true value for *flag* (the default) will cause Expat to call + the :attr:`ExternalEntityRefHandler` with :const:`None` for all arguments to + allow an alternate DTD to be loaded. If the document does not contain a + document type declaration, the :attr:`ExternalEntityRefHandler` will still be + called, but the :attr:`StartDoctypeDeclHandler` and + :attr:`EndDoctypeDeclHandler` will not be called. + + Passing a false value for *flag* will cancel a previous call that passed a true + value, but otherwise has no effect. + + This method can only be called before the :meth:`Parse` or :meth:`ParseFile` + methods are called; calling it after either of those have been called causes + :exc:`ExpatError` to be raised with the :attr:`code` attribute set to + :const:`errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING`. + + .. versionadded:: 2.3 + +:class:`xmlparser` objects have the following attributes: + + +.. attribute:: xmlparser.buffer_size + + The size of the buffer used when :attr:`buffer_text` is true. This value cannot + be changed at this time. + + .. versionadded:: 2.3 + + +.. attribute:: xmlparser.buffer_text + + Setting this to true causes the :class:`xmlparser` object to buffer textual + content returned by Expat to avoid multiple calls to the + :meth:`CharacterDataHandler` callback whenever possible. 
This can improve + performance substantially since Expat normally breaks character data into chunks + at every line ending. This attribute is false by default, and may be changed at + any time. + + .. versionadded:: 2.3 + + +.. attribute:: xmlparser.buffer_used + + If :attr:`buffer_text` is enabled, the number of bytes stored in the buffer. + These bytes represent UTF-8 encoded text. This attribute has no meaningful + interpretation when :attr:`buffer_text` is false. + + .. versionadded:: 2.3 + + +.. attribute:: xmlparser.ordered_attributes + + Setting this attribute to a non-zero integer causes the attributes to be + reported as a list rather than a dictionary. The attributes are presented in + the order found in the document text. For each attribute, two list entries are + presented: the attribute name and the attribute value. (Older versions of this + module also used this format.) By default, this attribute is false; it may be + changed at any time. + + .. versionadded:: 2.1 + + +.. attribute:: xmlparser.specified_attributes + + If set to a non-zero integer, the parser will report only those attributes which + were specified in the document instance and not those which were derived from + attribute declarations. Applications which set this need to be especially + careful to use what additional information is available from the declarations as + needed to comply with the standards for the behavior of XML processors. By + default, this attribute is false; it may be changed at any time. + + .. versionadded:: 2.1 + +The following attributes contain values relating to the most recent error +encountered by an :class:`xmlparser` object, and will only have correct values +once a call to :meth:`Parse` or :meth:`ParseFile` has raised a +:exc:`xml.parsers.expat.ExpatError` exception. + + +.. attribute:: xmlparser.ErrorByteIndex + + Byte index at which an error occurred. + + +.. attribute:: xmlparser.ErrorCode + + Numeric code specifying the problem. 
This value can be passed to the + :func:`ErrorString` function, or compared to one of the constants defined in the + ``errors`` object. + + +.. attribute:: xmlparser.ErrorColumnNumber + + Column number at which an error occurred. + + +.. attribute:: xmlparser.ErrorLineNumber + + Line number at which an error occurred. + +The following attributes contain values relating to the current parse location +in an :class:`xmlparser` object. During a callback reporting a parse event they +indicate the location of the first of the sequence of characters that generated +the event. When called outside of a callback, the position indicated will be +just past the last parse event (regardless of whether there was an associated +callback). + +.. versionadded:: 2.4 + + +.. attribute:: xmlparser.CurrentByteIndex + + Current byte index in the parser input. + + +.. attribute:: xmlparser.CurrentColumnNumber + + Current column number in the parser input. + + +.. attribute:: xmlparser.CurrentLineNumber + + Current line number in the parser input. + +Here is the list of handlers that can be set. To set a handler on an +:class:`xmlparser` object *o*, use ``o.handlername = func``. *handlername* must +be taken from the following list, and *func* must be a callable object accepting +the correct number of arguments. The arguments are all strings, unless +otherwise stated. + + +.. method:: xmlparser.XmlDeclHandler(version, encoding, standalone) + + Called when the XML declaration is parsed. The XML declaration is the + (optional) declaration of the applicable version of the XML recommendation, the + encoding of the document text, and an optional "standalone" declaration. + *version* and *encoding* will be strings, and *standalone* will be ``1`` if the + document is declared standalone, ``0`` if it is declared not to be standalone, + or ``-1`` if the standalone clause was omitted. This is only available with + Expat version 1.95.0 or newer. + + .. versionadded:: 2.1 + + +.. 
method:: xmlparser.StartDoctypeDeclHandler(doctypeName, systemId, publicId, has_internal_subset) + + Called when Expat begins parsing the document type declaration (``<!DOCTYPE + ...``). The *doctypeName* is provided exactly as presented. The *systemId* and + *publicId* parameters give the system and public identifiers if specified, or + ``None`` if omitted. *has_internal_subset* will be true if the document + contains an internal document declaration subset. This requires Expat version + 1.2 or newer. + + +.. method:: xmlparser.EndDoctypeDeclHandler() + + Called when Expat is done parsing the document type declaration. This requires + Expat version 1.2 or newer. + + +.. method:: xmlparser.ElementDeclHandler(name, model) + + Called once for each element type declaration. *name* is the name of the + element type, and *model* is a representation of the content model. + + +.. method:: xmlparser.AttlistDeclHandler(elname, attname, type, default, required) + + Called for each declared attribute for an element type. If an attribute list + declaration declares three attributes, this handler is called three times, once + for each attribute. *elname* is the name of the element to which the + declaration applies and *attname* is the name of the attribute declared. The + attribute type is a string passed as *type*; the possible values are + ``'CDATA'``, ``'ID'``, ``'IDREF'``, ... *default* gives the default value for + the attribute used when the attribute is not specified by the document instance, + or ``None`` if there is no default value (``#IMPLIED`` values). If the + attribute is required to be given in the document instance, *required* will be + true. This requires Expat version 1.95.0 or newer. + + +.. method:: xmlparser.StartElementHandler(name, attributes) + + Called for the start of every element. *name* is a string containing the + element name, and *attributes* is a dictionary mapping attribute names to their + values. + + +..
method:: xmlparser.EndElementHandler(name) + + Called for the end of every element. + + +.. method:: xmlparser.ProcessingInstructionHandler(target, data) + + Called for every processing instruction. + + +.. method:: xmlparser.CharacterDataHandler(data) + + Called for character data. This will be called for normal character data, CDATA + marked content, and ignorable whitespace. Applications which must distinguish + these cases can use the :attr:`StartCdataSectionHandler`, + :attr:`EndCdataSectionHandler`, and :attr:`ElementDeclHandler` callbacks to + collect the required information. + + +.. method:: xmlparser.UnparsedEntityDeclHandler(entityName, base, systemId, publicId, notationName) + + Called for unparsed (NDATA) entity declarations. This is only present for + version 1.2 of the Expat library; for more recent versions, use + :attr:`EntityDeclHandler` instead. (The underlying function in the Expat + library has been declared obsolete.) + + +.. method:: xmlparser.EntityDeclHandler(entityName, is_parameter_entity, value, base, systemId, publicId, notationName) + + Called for all entity declarations. For parameter and internal entities, + *value* will be a string giving the declared contents of the entity; this will + be ``None`` for external entities. The *notationName* parameter will be + ``None`` for parsed entities, and the name of the notation for unparsed + entities. *is_parameter_entity* will be true if the entity is a parameter entity + or false for general entities (most applications only need to be concerned with + general entities). This is only available starting with version 1.95.0 of the + Expat library. + + .. versionadded:: 2.1 + + +.. method:: xmlparser.NotationDeclHandler(notationName, base, systemId, publicId) + + Called for notation declarations. *notationName*, *base*, *systemId*, and + *publicId* are strings if given. If the public identifier is omitted, + *publicId* will be ``None``. + + +..
method:: xmlparser.StartNamespaceDeclHandler(prefix, uri) + + Called when an element contains a namespace declaration. Namespace declarations + are processed before the :attr:`StartElementHandler` is called for the element + on which declarations are placed. + + +.. method:: xmlparser.EndNamespaceDeclHandler(prefix) + + Called when the closing tag is reached for an element that contained a + namespace declaration. This is called once for each namespace declaration on + the element in the reverse of the order for which the + :attr:`StartNamespaceDeclHandler` was called to indicate the start of each + namespace declaration's scope. Calls to this handler are made after the + corresponding :attr:`EndElementHandler` for the end of the element. + + +.. method:: xmlparser.CommentHandler(data) + + Called for comments. *data* is the text of the comment, excluding the leading + '``<!-``\ ``-``' and trailing '``-``\ ``->``'. + + +.. method:: xmlparser.StartCdataSectionHandler() + + Called at the start of a CDATA section. This and :attr:`EndCdataSectionHandler` + are needed to be able to identify the syntactical start and end for CDATA + sections. + + +.. method:: xmlparser.EndCdataSectionHandler() + + Called at the end of a CDATA section. + + +.. method:: xmlparser.DefaultHandler(data) + + Called for any characters in the XML document for which no applicable handler + has been specified. This means characters that are part of a construct which + could be reported, but for which no handler has been supplied. + + +.. method:: xmlparser.DefaultHandlerExpand(data) + + This is the same as the :func:`DefaultHandler`, but doesn't inhibit expansion + of internal entities. The entity reference will not be passed to the default + handler. + + +.. method:: xmlparser.NotStandaloneHandler() + + Called if the XML document hasn't been declared as being a standalone document. 
+ This happens when there is an external subset or a reference to a parameter + entity, but the XML declaration does not set standalone to ``yes``. If this + handler returns ``0``, then the parser will throw an + :const:`XML_ERROR_NOT_STANDALONE` error. If this handler is not set, no + exception is raised by the parser for this condition. + + +.. method:: xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId) + + Called for references to external entities. *base* is the current base, as set + by a previous call to :meth:`SetBase`. The system and public identifiers, + *systemId* and *publicId*, are strings if given; if the public identifier is not + given, *publicId* will be ``None``. The *context* value is opaque and should + only be used as described below. + + For external entities to be parsed, this handler must be implemented. It is + responsible for creating the sub-parser using + ``ExternalEntityParserCreate(context)``, initializing it with the appropriate + callbacks, and parsing the entity. This handler should return an integer; if it + returns ``0``, the parser will throw an + :const:`XML_ERROR_EXTERNAL_ENTITY_HANDLING` error, otherwise parsing will + continue. + + If this handler is not provided, external entities are reported by the + :attr:`DefaultHandler` callback, if provided. + + +.. _expaterror-objects: + +ExpatError Exceptions +--------------------- + +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +:exc:`ExpatError` exceptions have a number of interesting attributes: + + +.. attribute:: ExpatError.code + + Expat's internal error number for the specific error. This will match one of + the constants defined in the ``errors`` object from this module. + + .. versionadded:: 2.1 + + +.. attribute:: ExpatError.lineno + + Line number on which the error was detected. The first line is numbered ``1``. + + .. versionadded:: 2.1 + + +..
attribute:: ExpatError.offset + + Character offset into the line where the error occurred. The first column is + numbered ``0``. + + .. versionadded:: 2.1 + + +.. _expat-example: + +Example +------- + +The following program defines three handlers that just print out their +arguments. :: + + import xml.parsers.expat + + # 3 handler functions + def start_element(name, attrs): + print('Start element:', name, attrs) + def end_element(name): + print('End element:', name) + def char_data(data): + print('Character data:', repr(data)) + + p = xml.parsers.expat.ParserCreate() + + p.StartElementHandler = start_element + p.EndElementHandler = end_element + p.CharacterDataHandler = char_data + + p.Parse("""<?xml version="1.0"?> + <parent id="top"><child1 name="paul">Text goes here</child1> + <child2 name="fred">More text</child2> + </parent>""", 1) + +The output from this program is:: + + Start element: parent {'id': 'top'} + Start element: child1 {'name': 'paul'} + Character data: 'Text goes here' + End element: child1 + Character data: '\n' + Start element: child2 {'name': 'fred'} + Character data: 'More text' + End element: child2 + Character data: '\n' + End element: parent + + +.. _expat-content-models: + +Content Model Descriptions +-------------------------- + +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +Content models are described using nested tuples. Each tuple contains four +values: the type, the quantifier, the name, and a tuple of children. Children +are simply additional content model descriptions. + +The values of the first two fields are constants defined in the ``model`` object +of the :mod:`xml.parsers.expat` module. These constants can be collected in two +groups: the model type group and the quantifier group. + +The constants in the model type group are: + + +.. data:: XML_CTYPE_ANY + :noindex: + + The element named by the model name was declared to have a content model of + ``ANY``. + + +..
data:: XML_CTYPE_CHOICE + :noindex: + + The named element allows a choice from a number of options; this is used for + content models such as ``(A | B | C)``. + + +.. data:: XML_CTYPE_EMPTY + :noindex: + + Elements which are declared to be ``EMPTY`` have this model type. + + +.. data:: XML_CTYPE_MIXED + :noindex: + + +.. data:: XML_CTYPE_NAME + :noindex: + + +.. data:: XML_CTYPE_SEQ + :noindex: + + Models which represent a series of models which follow one after the other are + indicated with this model type. This is used for models such as ``(A, B, C)``. + +The constants in the quantifier group are: + + +.. data:: XML_CQUANT_NONE + :noindex: + + No modifier is given, so it can appear exactly once, as for ``A``. + + +.. data:: XML_CQUANT_OPT + :noindex: + + The model is optional: it can appear once or not at all, as for ``A?``. + + +.. data:: XML_CQUANT_PLUS + :noindex: + + The model must occur one or more times (like ``A+``). + + +.. data:: XML_CQUANT_REP + :noindex: + + The model must occur zero or more times, as for ``A*``. + + +.. _expat-errors: + +Expat error constants +--------------------- + +The following constants are provided in the ``errors`` object of the +:mod:`xml.parsers.expat` module. These constants are useful in interpreting +some of the attributes of the :exc:`ExpatError` exception objects raised when an +error has occurred. + +The ``errors`` object has the following attributes: + + +.. data:: XML_ERROR_ASYNC_ENTITY + :noindex: + + +.. data:: XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF + :noindex: + + An entity reference in an attribute value referred to an external entity instead + of an internal entity. + + +.. data:: XML_ERROR_BAD_CHAR_REF + :noindex: + + A character reference referred to a character which is illegal in XML (for + example, character ``0``, or '``&#0;``'). + + +.. data:: XML_ERROR_BINARY_ENTITY_REF + :noindex: + + An entity reference referred to an entity which was declared with a notation, so + cannot be parsed. + + +.. 
data:: XML_ERROR_DUPLICATE_ATTRIBUTE + :noindex: + + An attribute was used more than once in a start tag. + + +.. data:: XML_ERROR_INCORRECT_ENCODING + :noindex: + + +.. data:: XML_ERROR_INVALID_TOKEN + :noindex: + + Raised when an input byte could not properly be assigned to a character; for + example, a NUL byte (value ``0``) in a UTF-8 input stream. + + +.. data:: XML_ERROR_JUNK_AFTER_DOC_ELEMENT + :noindex: + + Something other than whitespace occurred after the document element. + + +.. data:: XML_ERROR_MISPLACED_XML_PI + :noindex: + + An XML declaration was found somewhere other than the start of the input data. + + +.. data:: XML_ERROR_NO_ELEMENTS + :noindex: + + The document contains no elements (XML requires all documents to contain exactly + one top-level element). + + +.. data:: XML_ERROR_NO_MEMORY + :noindex: + + Expat was not able to allocate memory internally. + + +.. data:: XML_ERROR_PARAM_ENTITY_REF + :noindex: + + A parameter entity reference was found where it was not allowed. + + +.. data:: XML_ERROR_PARTIAL_CHAR + :noindex: + + An incomplete character was found in the input. + + +.. data:: XML_ERROR_RECURSIVE_ENTITY_REF + :noindex: + + An entity reference contained another reference to the same entity; possibly via + a different name, and possibly indirectly. + + +.. data:: XML_ERROR_SYNTAX + :noindex: + + Some unspecified syntax error was encountered. + + +.. data:: XML_ERROR_TAG_MISMATCH + :noindex: + + An end tag did not match the innermost open start tag. + + +.. data:: XML_ERROR_UNCLOSED_TOKEN + :noindex: + + Some token (such as a start tag) was not closed before the end of the stream or + the next token was encountered. + + +.. data:: XML_ERROR_UNDEFINED_ENTITY + :noindex: + + A reference was made to an entity which was not defined. + + +.. data:: XML_ERROR_UNKNOWN_ENCODING + :noindex: + + The document encoding is not supported by Expat. + + +.. data:: XML_ERROR_UNCLOSED_CDATA_SECTION + :noindex: + + A CDATA marked section was not closed. 
+ + +.. data:: XML_ERROR_EXTERNAL_ENTITY_HANDLING + :noindex: + + +.. data:: XML_ERROR_NOT_STANDALONE + :noindex: + + The parser determined that the document was not "standalone" though it declared + itself to be in the XML declaration, and the :attr:`NotStandaloneHandler` was + set and returned ``0``. + + +.. data:: XML_ERROR_UNEXPECTED_STATE + :noindex: + + +.. data:: XML_ERROR_ENTITY_DECLARED_IN_PE + :noindex: + + +.. data:: XML_ERROR_FEATURE_REQUIRES_XML_DTD + :noindex: + + An operation was requested that requires DTD support to be compiled in, but + Expat was configured without DTD support. This should never be reported by a + standard build of the :mod:`xml.parsers.expat` module. + + +.. data:: XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING + :noindex: + + A behavioral change was requested after parsing started that can only be changed + before parsing has started. This is (currently) only raised by + :meth:`UseForeignDTD`. + + +.. data:: XML_ERROR_UNBOUND_PREFIX + :noindex: + + An undeclared prefix was found when namespace processing was enabled. + + +.. data:: XML_ERROR_UNDECLARING_PREFIX + :noindex: + + The document attempted to remove the namespace declaration associated with a + prefix. + + +.. data:: XML_ERROR_INCOMPLETE_PE + :noindex: + + A parameter entity contained incomplete markup. + + +.. data:: XML_ERROR_XML_DECL + :noindex: + + The document contained no document element at all. + + +.. data:: XML_ERROR_TEXT_DECL + :noindex: + + There was an error parsing a text declaration in an external entity. + + +.. data:: XML_ERROR_PUBLICID + :noindex: + + Characters were found in the public id that are not allowed. + + +.. data:: XML_ERROR_SUSPENDED + :noindex: + + The requested operation was made on a suspended parser, but isn't allowed. This + includes attempts to provide additional input or to stop the parser. + + +.. data:: XML_ERROR_NOT_SUSPENDED + :noindex: + + An attempt to resume the parser was made when the parser had not been suspended. + + +.. 
data:: XML_ERROR_ABORTED + :noindex: + + This should not be reported to Python applications. + + +.. data:: XML_ERROR_FINISHED + :noindex: + + The requested operation was made on a parser which was finished parsing input, + but isn't allowed. This includes attempts to provide additional input or to + stop the parser. + + +.. data:: XML_ERROR_SUSPEND_PE + :noindex: + diff --git a/Doc/library/python.rst b/Doc/library/python.rst new file mode 100644 index 0000000..3b58eee --- /dev/null +++ b/Doc/library/python.rst @@ -0,0 +1,27 @@ + +.. _python: + +*********************** +Python Runtime Services +*********************** + +The modules described in this chapter provide a wide range of services related +to the Python interpreter and its interaction with its environment. Here's an +overview: + + +.. toctree:: + + sys.rst + __builtin__.rst + __main__.rst + warnings.rst + contextlib.rst + atexit.rst + traceback.rst + __future__.rst + gc.rst + inspect.rst + site.rst + user.rst + fpectl.rst diff --git a/Doc/library/queue.rst b/Doc/library/queue.rst new file mode 100644 index 0000000..c7b65fd --- /dev/null +++ b/Doc/library/queue.rst @@ -0,0 +1,152 @@ + +:mod:`Queue` --- A synchronized queue class +=========================================== + +.. module:: Queue + :synopsis: A synchronized queue class. + + +The :mod:`Queue` module implements a multi-producer, multi-consumer FIFO queue. +It is especially useful in threads programming when information must be +exchanged safely between multiple threads. The :class:`Queue` class in this +module implements all the required locking semantics. It depends on the +availability of thread support in Python. + +The :mod:`Queue` module defines the following class and exception: + + +.. class:: Queue(maxsize) + + Constructor for the class. *maxsize* is an integer that sets the upperbound + limit on the number of items that can be placed in the queue. Insertion will + block once this size has been reached, until queue items are consumed. 
If + *maxsize* is less than or equal to zero, the queue size is infinite. + + +.. exception:: Empty + + Exception raised when non-blocking :meth:`get` (or :meth:`get_nowait`) is called + on a :class:`Queue` object which is empty. + + +.. exception:: Full + + Exception raised when non-blocking :meth:`put` (or :meth:`put_nowait`) is called + on a :class:`Queue` object which is full. + + +.. _queueobjects: + +Queue Objects +------------- + +Class :class:`Queue` implements queue objects and has the methods described +below. This class can be derived from in order to implement other queue +organizations (e.g. stack) but the inheritable interface is not described here. +See the source code for details. The public methods are: + + +.. method:: Queue.qsize() + + Return the approximate size of the queue. Because of multithreading semantics, + this number is not reliable. + + +.. method:: Queue.empty() + + Return ``True`` if the queue is empty, ``False`` otherwise. Because of + multithreading semantics, this is not reliable. + + +.. method:: Queue.full() + + Return ``True`` if the queue is full, ``False`` otherwise. Because of + multithreading semantics, this is not reliable. + + +.. method:: Queue.put(item[, block[, timeout]]) + + Put *item* into the queue. If optional args *block* is true and *timeout* is + None (the default), block if necessary until a free slot is available. If + *timeout* is a positive number, it blocks at most *timeout* seconds and raises + the :exc:`Full` exception if no free slot was available within that time. + Otherwise (*block* is false), put an item on the queue if a free slot is + immediately available, else raise the :exc:`Full` exception (*timeout* is + ignored in that case). + + .. versionadded:: 2.3 + The *timeout* parameter. + + +.. method:: Queue.put_nowait(item) + + Equivalent to ``put(item, False)``. + + +.. method:: Queue.get([block[, timeout]]) + + Remove and return an item from the queue. 
If optional args *block* is true and + *timeout* is None (the default), block if necessary until an item is available. + If *timeout* is a positive number, it blocks at most *timeout* seconds and + raises the :exc:`Empty` exception if no item was available within that time. + Otherwise (*block* is false), return an item if one is immediately available, + else raise the :exc:`Empty` exception (*timeout* is ignored in that case). + + .. versionadded:: 2.3 + The *timeout* parameter. + + +.. method:: Queue.get_nowait() + + Equivalent to ``get(False)``. + +Two methods are offered to support tracking whether enqueued tasks have been +fully processed by daemon consumer threads. + + +.. method:: Queue.task_done() + + Indicate that a formerly enqueued task is complete. Used by queue consumer + threads. For each :meth:`get` used to fetch a task, a subsequent call to + :meth:`task_done` tells the queue that the processing on the task is complete. + + If a :meth:`join` is currently blocking, it will resume when all items have been + processed (meaning that a :meth:`task_done` call was received for every item + that had been :meth:`put` into the queue). + + Raises a :exc:`ValueError` if called more times than there were items placed in + the queue. + + .. versionadded:: 2.5 + + +.. method:: Queue.join() + + Blocks until all items in the queue have been gotten and processed. + + The count of unfinished tasks goes up whenever an item is added to the queue. + The count goes down whenever a consumer thread calls :meth:`task_done` to + indicate that the item was retrieved and all work on it is complete. When the + count of unfinished tasks drops to zero, join() unblocks. + + .. 
versionadded:: 2.5 + +Example of how to wait for enqueued tasks to be completed:: + + def worker(): + while True: + item = q.get() + do_work(item) + q.task_done() + + q = Queue() + for i in range(num_worker_threads): + t = Thread(target=worker) + t.setDaemon(True) + t.start() + + for item in source(): + q.put(item) + + q.join() # block until all tasks are done + diff --git a/Doc/library/quopri.rst b/Doc/library/quopri.rst new file mode 100644 index 0000000..8f525ef --- /dev/null +++ b/Doc/library/quopri.rst @@ -0,0 +1,61 @@ + +:mod:`quopri` --- Encode and decode MIME quoted-printable data +============================================================== + +.. module:: quopri + :synopsis: Encode and decode files using the MIME quoted-printable encoding. + + +.. index:: + pair: quoted-printable; encoding + single: MIME; quoted-printable encoding + +This module performs quoted-printable transport encoding and decoding, as +defined in :rfc:`1521`: "MIME (Multipurpose Internet Mail Extensions) Part One: +Mechanisms for Specifying and Describing the Format of Internet Message Bodies". +The quoted-printable encoding is designed for data where there are relatively +few nonprintable characters; the base64 encoding scheme available via the +:mod:`base64` module is more compact if there are many such characters, as when +sending a graphics file. + + +.. function:: decode(input, output[,header]) + + Decode the contents of the *input* file and write the resulting decoded binary + data to the *output* file. *input* and *output* must either be file objects or + objects that mimic the file object interface. *input* will be read until + ``input.readline()`` returns an empty string. If the optional argument *header* + is present and true, underscore will be decoded as space. This is used to decode + "Q"-encoded headers as described in :rfc:`1522`: "MIME (Multipurpose Internet + Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text". + + +.. 
function:: encode(input, output, quotetabs) + + Encode the contents of the *input* file and write the resulting quoted-printable + data to the *output* file. *input* and *output* must either be file objects or + objects that mimic the file object interface. *input* will be read until + ``input.readline()`` returns an empty string. *quotetabs* is a flag which + controls whether to encode embedded spaces and tabs; when true it encodes such + embedded whitespace, and when false it leaves them unencoded. Note that spaces + and tabs appearing at the end of lines are always encoded, as per :rfc:`1521`. + + +.. function:: decodestring(s[,header]) + + Like :func:`decode`, except that it accepts a source string and returns the + corresponding decoded string. + + +.. function:: encodestring(s[, quotetabs]) + + Like :func:`encode`, except that it accepts a source string and returns the + corresponding encoded string. *quotetabs* is optional (defaulting to 0), and is + passed straight through to :func:`encode`. + + +.. seealso:: + + Module :mod:`base64` + Encode and decode MIME base64 data + diff --git a/Doc/library/random.rst b/Doc/library/random.rst new file mode 100644 index 0000000..c5d289c --- /dev/null +++ b/Doc/library/random.rst @@ -0,0 +1,315 @@ + +:mod:`random` --- Generate pseudo-random numbers +================================================ + +.. module:: random + :synopsis: Generate pseudo-random numbers with various common distributions. + + +This module implements pseudo-random number generators for various +distributions. + +For integers, uniform selection from a range. For sequences, uniform selection +of a random element, a function to generate a random permutation of a list +in-place, and a function for random sampling without replacement. + +On the real line, there are functions to compute uniform, normal (Gaussian), +lognormal, negative exponential, gamma, and beta distributions. 
For generating +distributions of angles, the von Mises distribution is available. + +Almost all module functions depend on the basic function :func:`random`, which +generates a random float uniformly in the semi-open range [0.0, 1.0). Python +uses the Mersenne Twister as the core generator. It produces 53-bit precision +floats and has a period of 2\*\*19937-1. The underlying implementation in C is +both fast and threadsafe. The Mersenne Twister is one of the most extensively +tested random number generators in existence. However, being completely +deterministic, it is not suitable for all purposes, and is completely unsuitable +for cryptographic purposes. + +The functions supplied by this module are actually bound methods of a hidden +instance of the :class:`random.Random` class. You can instantiate your own +instances of :class:`Random` to get generators that don't share state. This is +especially useful for multi-threaded programs, creating a different instance of +:class:`Random` for each thread, and using the :meth:`jumpahead` method to make +it likely that the generated sequences seen by each thread don't overlap. + +Class :class:`Random` can also be subclassed if you want to use a different +basic generator of your own devising: in that case, override the :meth:`random`, +:meth:`seed`, :meth:`getstate`, :meth:`setstate` and :meth:`jumpahead` methods. +Optionally, a new generator can supply a :meth:`getrandbits` method --- this +allows :meth:`randrange` to produce selections over an arbitrarily large range. + +.. versionadded:: 2.4 + the :meth:`getrandbits` method. + +As an example of subclassing, the :mod:`random` module provides the +:class:`WichmannHill` class that implements an alternative generator in pure +Python. The class provides a backward compatible way to reproduce results from +earlier versions of Python, which used the Wichmann-Hill algorithm as the core +generator. 
Note that this Wichmann-Hill generator can no longer be recommended: +its period is too short by contemporary standards, and the sequence generated is +known to fail some stringent randomness tests. See the references below for a +recent variant that repairs these flaws. + +.. versionchanged:: 2.3 + Substituted MersenneTwister for Wichmann-Hill. + +Bookkeeping functions: + + +.. function:: seed([x]) + + Initialize the basic random number generator. Optional argument *x* can be any + hashable object. If *x* is omitted or ``None``, current system time is used; + current system time is also used to initialize the generator when the module is + first imported. If randomness sources are provided by the operating system, + they are used instead of the system time (see the :func:`os.urandom` function + for details on availability). + + .. versionchanged:: 2.4 + formerly, operating system resources were not used. + + If *x* is not ``None`` or an int or long, ``hash(x)`` is used instead. If *x* is + an int or long, *x* is used directly. + + +.. function:: getstate() + + Return an object capturing the current internal state of the generator. This + object can be passed to :func:`setstate` to restore the state. + + .. versionadded:: 2.1 + + +.. function:: setstate(state) + + *state* should have been obtained from a previous call to :func:`getstate`, and + :func:`setstate` restores the internal state of the generator to what it was at + the time :func:`getstate` was called. + + .. versionadded:: 2.1 + + +.. function:: jumpahead(n) + + Change the internal state to one different from and likely far away from the + current state. *n* is a non-negative integer which is used to scramble the + current state vector. 
This is most useful in multi-threaded programs, in + conjunction with multiple instances of the :class:`Random` class: + :meth:`setstate` or :meth:`seed` can be used to force all instances into the + same internal state, and then :meth:`jumpahead` can be used to force the + instances' states far apart. + + .. versionadded:: 2.1 + + .. versionchanged:: 2.3 + Instead of jumping to a specific state, *n* steps ahead, ``jumpahead(n)`` + jumps to another state likely to be separated by many steps. + + +.. function:: getrandbits(k) + + Returns a Python :class:`long` int with *k* random bits. This method is supplied + with the MersenneTwister generator and some other generators may also provide it + as an optional part of the API. When available, :meth:`getrandbits` enables + :meth:`randrange` to handle arbitrarily large ranges. + + .. versionadded:: 2.4 + +Functions for integers: + + +.. function:: randrange([start,] stop[, step]) + + Return a randomly selected element from ``range(start, stop, step)``. This is + equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a + range object. + + .. versionadded:: 1.5.2 + + +.. function:: randint(a, b) + + Return a random integer *N* such that ``a <= N <= b``. + +Functions for sequences: + + +.. function:: choice(seq) + + Return a random element from the non-empty sequence *seq*. If *seq* is empty, + raises :exc:`IndexError`. + + +.. function:: shuffle(x[, random]) + + Shuffle the sequence *x* in place. The optional argument *random* is a + 0-argument function returning a random float in [0.0, 1.0); by default, this is + the function :func:`random`. + + Note that for even rather small ``len(x)``, the total number of permutations of + *x* is larger than the period of most random number generators; this implies + that most permutations of a long sequence can never be generated. + + +.. function:: sample(population, k) + + Return a *k* length list of unique elements chosen from the population sequence. 
+ Used for random sampling without replacement. + + .. versionadded:: 2.3 + + Returns a new list containing elements from the population while leaving the + original population unchanged. The resulting list is in selection order so that + all sub-slices will also be valid random samples. This allows raffle winners + (the sample) to be partitioned into grand prize and second place winners (the + subslices). + + Members of the population need not be hashable or unique. If the population + contains repeats, then each occurrence is a possible selection in the sample. + + To choose a sample from a range of integers, use a :func:`range` object as an + argument. This is especially fast and space efficient for sampling from a large + population: ``sample(range(10000000), 60)``. + +The following functions generate specific real-valued distributions. Function +parameters are named after the corresponding variables in the distribution's +equation, as used in common mathematical practice; most of these equations can +be found in any statistics text. + + +.. function:: random() + + Return the next random floating point number in the range [0.0, 1.0). + + +.. function:: uniform(a, b) + + Return a random floating point number *N* such that ``a <= N < b``. + + +.. function:: betavariate(alpha, beta) + + Beta distribution. Conditions on the parameters are ``alpha > 0`` and ``beta > + 0``. Returned values range between 0 and 1. + + +.. function:: expovariate(lambd) + + Exponential distribution. *lambd* is 1.0 divided by the desired mean. (The + parameter would be called "lambda", but that is a reserved word in Python.) + Returned values range from 0 to positive infinity. + + +.. function:: gammavariate(alpha, beta) + + Gamma distribution. (*Not* the gamma function!) Conditions on the parameters + are ``alpha > 0`` and ``beta > 0``. + + +.. function:: gauss(mu, sigma) + + Gaussian distribution. *mu* is the mean, and *sigma* is the standard deviation. 
+ This is slightly faster than the :func:`normalvariate` function defined below. + + +.. function:: lognormvariate(mu, sigma) + + Log normal distribution. If you take the natural logarithm of this + distribution, you'll get a normal distribution with mean *mu* and standard + deviation *sigma*. *mu* can have any value, and *sigma* must be greater than + zero. + + +.. function:: normalvariate(mu, sigma) + + Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. + + +.. function:: vonmisesvariate(mu, kappa) + + *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* + is the concentration parameter, which must be greater than or equal to zero. If + *kappa* is equal to zero, this distribution reduces to a uniform random angle + over the range 0 to 2\*\ *pi*. + + +.. function:: paretovariate(alpha) + + Pareto distribution. *alpha* is the shape parameter. + + +.. function:: weibullvariate(alpha, beta) + + Weibull distribution. *alpha* is the scale parameter and *beta* is the shape + parameter. + + +Alternative Generators: + +.. class:: WichmannHill([seed]) + + Class that implements the Wichmann-Hill algorithm as the core generator. Has all + of the same methods as :class:`Random` plus the :meth:`whseed` method described + below. Because this class is implemented in pure Python, it is not threadsafe + and may require locks between calls. The period of the generator is + 6,953,607,871,644 which is small enough to require care that two independent + random sequences do not overlap. + + +.. function:: whseed([x]) + + This is obsolete, supplied for bit-level compatibility with versions of Python + prior to 2.1. See :func:`seed` for details. :func:`whseed` does not guarantee + that distinct integer arguments yield distinct internal states, and can yield no + more than about 2\*\*24 distinct internal states in all. + + +.. 
class:: SystemRandom([seed]) + + Class that uses the :func:`os.urandom` function for generating random numbers + from sources provided by the operating system. Not available on all systems. + Does not rely on software state and sequences are not reproducible. Accordingly, + the :meth:`seed` and :meth:`jumpahead` methods have no effect and are ignored. + The :meth:`getstate` and :meth:`setstate` methods raise + :exc:`NotImplementedError` if called. + + .. versionadded:: 2.4 + +Examples of basic usage:: + + >>> random.random() # Random float x, 0.0 <= x < 1.0 + 0.37444887175646646 + >>> random.uniform(1, 10) # Random float x, 1.0 <= x < 10.0 + 1.1800146073117523 + >>> random.randint(1, 10) # Integer from 1 to 10, endpoints included + 7 + >>> random.randrange(0, 101, 2) # Even integer from 0 to 100 + 26 + >>> random.choice('abcdefghij') # Choose a random element + 'c' + + >>> items = [1, 2, 3, 4, 5, 6, 7] + >>> random.shuffle(items) + >>> items + [7, 3, 2, 5, 6, 4, 1] + + >>> random.sample([1, 2, 3, 4, 5], 3) # Choose 3 elements + [4, 1, 5] + + + +.. seealso:: + + M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally + equidistributed uniform pseudorandom number generator", ACM Transactions on + Modeling and Computer Simulation Vol. 8, No. 1, January pp.3-30 1998. + + Wichmann, B. A. & Hill, I. D., "Algorithm AS 183: An efficient and portable + pseudo-random number generator", Applied Statistics 31 (1982) 188-190. + + http://www.npl.co.uk/ssfm/download/abstracts.html#196 + A modern variation of the Wichmann-Hill generator that greatly increases the + period, and passes now-standard statistical tests that the original generator + failed. + diff --git a/Doc/library/re.rst b/Doc/library/re.rst new file mode 100644 index 0000000..027ff16 --- /dev/null +++ b/Doc/library/re.rst @@ -0,0 +1,921 @@ + +:mod:`re` --- Regular expression operations +=========================================== + +.. module:: re + :synopsis: Regular expression operations. +.. 
moduleauthor:: Fredrik Lundh <fredrik@pythonware.com> +.. sectionauthor:: Andrew M. Kuchling <amk@amk.ca> + + + + +This module provides regular expression matching operations similar to +those found in Perl. Both patterns and strings to be searched can be +Unicode strings as well as 8-bit strings. The :mod:`re` module is +always available. + +Regular expressions use the backslash character (``'\'``) to indicate +special forms or to allow special characters to be used without invoking +their special meaning. This collides with Python's usage of the same +character for the same purpose in string literals; for example, to match +a literal backslash, one might have to write ``'\\\\'`` as the pattern +string, because the regular expression must be ``\\``, and each +backslash must be expressed as ``\\`` inside a regular Python string +literal. + +The solution is to use Python's raw string notation for regular expression +patterns; backslashes are not handled in any special way in a string literal +prefixed with ``'r'``. So ``r"\n"`` is a two-character string containing +``'\'`` and ``'n'``, while ``"\n"`` is a one-character string containing a +newline. Usually patterns will be expressed in Python code using this raw string +notation. + +.. seealso:: + + Mastering Regular Expressions + Book on regular expressions by Jeffrey Friedl, published by O'Reilly. The + second edition of the book no longer covers Python at all, but the first + edition covered writing good regular expression patterns in great detail. + + +.. _re-syntax: + +Regular Expression Syntax +------------------------- + +A regular expression (or RE) specifies a set of strings that matches it; the +functions in this module let you check if a particular string matches a given +regular expression (or if a given regular expression matches a particular +string, which comes down to the same thing). 
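As a minimal sketch of the kind of check described above, the following uses :func:`re.compile` with the ``match`` and ``search`` methods of the compiled pattern. The helper names ``matches_at_start`` and ``found_anywhere`` are hypothetical, introduced only for this illustration, and the pattern ``ab+`` is an arbitrary choice.

```python
import re

# Compile the pattern once; the raw string keeps backslashes literal.
pattern = re.compile(r"ab+")


def matches_at_start(text):
    """Hypothetical helper: True if the pattern matches at the start of text."""
    return pattern.match(text) is not None


def found_anywhere(text):
    """Hypothetical helper: True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


print(matches_at_start("abbb"))  # True: 'a' followed by one or more 'b's
print(matches_at_start("cab"))   # False: match() only tries the start of the string
print(found_anywhere("cab"))     # True: search() scans the whole string
```

Both methods return ``None`` when there is no match, which is why the helpers compare the result against ``None`` rather than relying on a boolean return value.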
+ +Regular expressions can be concatenated to form new regular expressions; if *A* +and *B* are both regular expressions, then *AB* is also a regular expression. +In general, if a string *p* matches *A* and another string *q* matches *B*, the +string *pq* will match AB. This holds unless *A* or *B* contain low precedence +operations; boundary conditions between *A* and *B*; or have numbered group +references. Thus, complex expressions can easily be constructed from simpler +primitive expressions like the ones described here. For details of the theory +and implementation of regular expressions, consult the Friedl book referenced +above, or almost any textbook about compiler construction. + +A brief explanation of the format of regular expressions follows. For further +information and a gentler presentation, consult the Regular Expression HOWTO, +accessible from http://www.python.org/doc/howto/. + +Regular expressions can contain both special and ordinary characters. Most +ordinary characters, like ``'A'``, ``'a'``, or ``'0'``, are the simplest regular +expressions; they simply match themselves. You can concatenate ordinary +characters, so ``last`` matches the string ``'last'``. (In the rest of this +section, we'll write RE's in ``this special style``, usually without quotes, and +strings to be matched ``'in single quotes'``.) + +Some characters, like ``'|'`` or ``'('``, are special. Special +characters either stand for classes of ordinary characters, or affect +how the regular expressions around them are interpreted. Regular +expression pattern strings may not contain null bytes, but can specify +the null byte using the ``\number`` notation, e.g., ``'\x00'``. + + +The special characters are: + +.. % + +``'.'`` + (Dot.) In the default mode, this matches any character except a newline. If + the :const:`DOTALL` flag has been specified, this matches any character + including a newline. + +``'^'`` + (Caret.) 
Matches the start of the string, and in :const:`MULTILINE` mode also + matches immediately after each newline. + +``'$'`` + Matches the end of the string or just before the newline at the end of the + string, and in :const:`MULTILINE` mode also matches before a newline. ``foo`` + matches both 'foo' and 'foobar', while the regular expression ``foo$`` matches + only 'foo'. More interestingly, searching for ``foo.$`` in ``'foo1\nfoo2\n'`` + matches 'foo2' normally, but 'foo1' in :const:`MULTILINE` mode. + +``'*'`` + Causes the resulting RE to match 0 or more repetitions of the preceding RE, as + many repetitions as are possible. ``ab*`` will match 'a', 'ab', or 'a' followed + by any number of 'b's. + +``'+'`` + Causes the resulting RE to match 1 or more repetitions of the preceding RE. + ``ab+`` will match 'a' followed by any non-zero number of 'b's; it will not + match just 'a'. + +``'?'`` + Causes the resulting RE to match 0 or 1 repetitions of the preceding RE. + ``ab?`` will match either 'a' or 'ab'. + +``*?``, ``+?``, ``??`` + The ``'*'``, ``'+'``, and ``'?'`` qualifiers are all :dfn:`greedy`; they match + as much text as possible. Sometimes this behaviour isn't desired; if the RE + ``<.*>`` is matched against ``'<H1>title</H1>'``, it will match the entire + string, and not just ``'<H1>'``. Adding ``'?'`` after the qualifier makes it + perform the match in :dfn:`non-greedy` or :dfn:`minimal` fashion; as *few* + characters as possible will be matched. Using ``.*?`` in the previous + expression will match only ``'<H1>'``. + +``{m}`` + Specifies that exactly *m* copies of the previous RE should be matched; fewer + matches cause the entire RE not to match. For example, ``a{6}`` will match + exactly six ``'a'`` characters, but not five. + +``{m,n}`` + Causes the resulting RE to match from *m* to *n* repetitions of the preceding + RE, attempting to match as many repetitions as possible. For example, + ``a{3,5}`` will match from 3 to 5 ``'a'`` characters. 
Omitting *m* specifies a + lower bound of zero, and omitting *n* specifies an infinite upper bound. As an + example, ``a{4,}b`` will match ``aaaab`` or a thousand ``'a'`` characters + followed by a ``b``, but not ``aaab``. The comma may not be omitted or the + modifier would be confused with the previously described form. + +``{m,n}?`` + Causes the resulting RE to match from *m* to *n* repetitions of the preceding + RE, attempting to match as *few* repetitions as possible. This is the + non-greedy version of the previous qualifier. For example, on the + 6-character string ``'aaaaaa'``, ``a{3,5}`` will match 5 ``'a'`` characters, + while ``a{3,5}?`` will only match 3 characters. + +``'\'`` + Either escapes special characters (permitting you to match characters like + ``'*'``, ``'?'``, and so forth), or signals a special sequence; special + sequences are discussed below. + + If you're not using a raw string to express the pattern, remember that Python + also uses the backslash as an escape sequence in string literals; if the escape + sequence isn't recognized by Python's parser, the backslash and subsequent + character are included in the resulting string. However, if Python would + recognize the resulting sequence, the backslash should be repeated twice. This + is complicated and hard to understand, so it's highly recommended that you use + raw strings for all but the simplest expressions. + +``[]`` + Used to indicate a set of characters. Characters can be listed individually, or + a range of characters can be indicated by giving two characters and separating + them by a ``'-'``. Special characters are not active inside sets. For example, + ``[akm$]`` will match any of the characters ``'a'``, ``'k'``, + ``'m'``, or ``'$'``; ``[a-z]`` will match any lowercase letter, and + ``[a-zA-Z0-9]`` matches any letter or digit. 
Character classes such + as ``\w`` or ``\S`` (defined below) are also acceptable inside a + range, although the characters they match depend on whether :const:`LOCALE` + or :const:`UNICODE` mode is in force. If you want to include a + ``']'`` or a ``'-'`` inside a set, precede it with a backslash, or + place it as the first character. The pattern ``[]]`` will match + ``']'``, for example. + + You can match the characters not within a range by :dfn:`complementing` the set. + This is indicated by including a ``'^'`` as the first character of the set; + ``'^'`` elsewhere will simply match the ``'^'`` character. For example, + ``[^5]`` will match any character except ``'5'``, and ``[^^]`` will match any + character except ``'^'``. + +``'|'`` + ``A|B``, where A and B can be arbitrary REs, creates a regular expression that + will match either A or B. An arbitrary number of REs can be separated by the + ``'|'`` in this way. This can be used inside groups (see below) as well. As + the target string is scanned, REs separated by ``'|'`` are tried from left to + right. When one pattern completely matches, that branch is accepted. This means + that once ``A`` matches, ``B`` will not be tested further, even if it would + produce a longer overall match. In other words, the ``'|'`` operator is never + greedy. To match a literal ``'|'``, use ``\|``, or enclose it inside a + character class, as in ``[|]``. + +``(...)`` + Matches whatever regular expression is inside the parentheses, and indicates the + start and end of a group; the contents of a group can be retrieved after a match + has been performed, and can be matched later in the string with the ``\number`` + special sequence, described below. To match the literals ``'('`` or ``')'``, + use ``\(`` or ``\)``, or enclose them inside a character class: ``[(] [)]``. + +``(?...)`` + This is an extension notation (a ``'?'`` following a ``'('`` is not meaningful + otherwise).
The first character after the ``'?'`` determines what the meaning + and further syntax of the construct is. Extensions usually do not create a new + group; ``(?P<name>...)`` is the only exception to this rule. Following are the + currently supported extensions. + +``(?iLmsux)`` + (One or more letters from the set ``'i'``, ``'L'``, ``'m'``, ``'s'``, + ``'u'``, ``'x'``.) The group matches the empty string; the letters + set the corresponding flags: :const:`re.I` (ignore case), + :const:`re.L` (locale dependent), :const:`re.M` (multi-line), + :const:`re.S` (dot matches all), :const:`re.U` (Unicode dependent), + and :const:`re.X` (verbose), for the entire regular expression. (The + flags are described in :ref:`contents-of-module-re`.) This + is useful if you wish to include the flags as part of the regular + expression, instead of passing a *flag* argument to the + :func:`compile` function. + + Note that the ``(?x)`` flag changes how the expression is parsed. It should be + used first in the expression string, or after one or more whitespace characters. + If there are non-whitespace characters before the flag, the results are + undefined. + +``(?:...)`` + A non-grouping version of regular parentheses. Matches whatever regular + expression is inside the parentheses, but the substring matched by the group + *cannot* be retrieved after performing a match or referenced later in the + pattern. + +``(?P<name>...)`` + Similar to regular parentheses, but the substring matched by the group is + accessible via the symbolic group name *name*. Group names must be valid Python + identifiers, and each group name must be defined only once within a regular + expression. A symbolic group is also a numbered group, just as if the group + were not named. So the group named 'id' in the example below can also be + referenced as the numbered group 1. 
+ + For example, if the pattern is ``(?P<id>[a-zA-Z_]\w*)``, the group can be + referenced by its name in arguments to methods of match objects, such as + ``m.group('id')`` or ``m.end('id')``, and also by name in pattern text (for + example, ``(?P=id)``) and replacement text (such as ``\g<id>``). + +``(?P=name)`` + Matches whatever text was matched by the earlier group named *name*. + +``(?#...)`` + A comment; the contents of the parentheses are simply ignored. + +``(?=...)`` + Matches if ``...`` matches next, but doesn't consume any of the string. This is + called a lookahead assertion. For example, ``Isaac (?=Asimov)`` will match + ``'Isaac '`` only if it's followed by ``'Asimov'``. + +``(?!...)`` + Matches if ``...`` doesn't match next. This is a negative lookahead assertion. + For example, ``Isaac (?!Asimov)`` will match ``'Isaac '`` only if it's *not* + followed by ``'Asimov'``. + +``(?<=...)`` + Matches if the current position in the string is preceded by a match for ``...`` + that ends at the current position. This is called a :dfn:`positive lookbehind + assertion`. ``(?<=abc)def`` will find a match in ``abcdef``, since the + lookbehind will back up 3 characters and check if the contained pattern matches. + The contained pattern must only match strings of some fixed length, meaning that + ``abc`` or ``a|b`` are allowed, but ``a*`` and ``a{3,4}`` are not. Note that + patterns which start with positive lookbehind assertions will never match at the + beginning of the string being searched; you will most likely want to use the + :func:`search` function rather than the :func:`match` function:: + + >>> import re + >>> m = re.search('(?<=abc)def', 'abcdef') + >>> m.group(0) + 'def' + + This example looks for a word following a hyphen:: + + >>> m = re.search('(?<=-)\w+', 'spam-egg') + >>> m.group(0) + 'egg' + +``(?<!...)`` + Matches if the current position in the string is not preceded by a match for + ``...``. This is called a :dfn:`negative lookbehind assertion`. 
Similar to + positive lookbehind assertions, the contained pattern must only match strings of + some fixed length. Patterns which start with negative lookbehind assertions may + match at the beginning of the string being searched. + +``(?(id/name)yes-pattern|no-pattern)`` + Will try to match with ``yes-pattern`` if the group with given *id* or *name* + exists, and with ``no-pattern`` if it doesn't. ``no-pattern`` is optional and + can be omitted. For example, ``(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)`` is a poor email + matching pattern, which will match with ``'<user@host.com>'`` as well as + ``'user@host.com'``, but not with ``'<user@host.com'``. + + .. versionadded:: 2.4 + +The special sequences consist of ``'\'`` and a character from the list below. +If the ordinary character is not on the list, then the resulting RE will match +the second character. For example, ``\$`` matches the character ``'$'``. + +.. % + +``\number`` + Matches the contents of the group of the same number. Groups are numbered + starting from 1. For example, ``(.+) \1`` matches ``'the the'`` or ``'55 55'``, + but not ``'the end'`` (note the space after the group). This special sequence + can only be used to match one of the first 99 groups. If the first digit of + *number* is 0, or *number* is 3 octal digits long, it will not be interpreted as + a group match, but as the character with octal value *number*. Inside the + ``'['`` and ``']'`` of a character class, all numeric escapes are treated as + characters. + +``\A`` + Matches only at the start of the string. + +``\b`` + Matches the empty string, but only at the beginning or end of a word. A word is + defined as a sequence of alphanumeric or underscore characters, so the end of a + word is indicated by whitespace or a non-alphanumeric, non-underscore character. 
+ Note that ``\b`` is defined as the boundary between ``\w`` and ``\W``, so the + precise set of characters deemed to be alphanumeric depends on the values of the + ``UNICODE`` and ``LOCALE`` flags. Inside a character range, ``\b`` represents + the backspace character, for compatibility with Python's string literals. + +``\B`` + Matches the empty string, but only when it is *not* at the beginning or end of a + word. This is just the opposite of ``\b``, so is also subject to the settings + of ``LOCALE`` and ``UNICODE``. + +``\d`` + When the :const:`UNICODE` flag is not specified, matches any decimal digit; this + is equivalent to the set ``[0-9]``. With :const:`UNICODE`, it will match + whatever is classified as a digit in the Unicode character properties database. + +``\D`` + When the :const:`UNICODE` flag is not specified, matches any non-digit + character; this is equivalent to the set ``[^0-9]``. With :const:`UNICODE`, it + will match anything other than characters marked as digits in the Unicode + character properties database. + +``\s`` + When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches + any whitespace character; this is equivalent to the set ``[ \t\n\r\f\v]``. With + :const:`LOCALE`, it will match this set plus whatever characters are defined as + space for the current locale. If :const:`UNICODE` is set, this will match the + characters ``[ \t\n\r\f\v]`` plus whatever is classified as space in the Unicode + character properties database. + +``\S`` + When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches + any non-whitespace character; this is equivalent to the set ``[^ \t\n\r\f\v]``. + With :const:`LOCALE`, it will match any character not in this set, and not + defined as space in the current locale. If :const:`UNICODE` is set, this will + match anything other than ``[ \t\n\r\f\v]`` and characters marked as space in + the Unicode character properties database.
+ +``\w`` + When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches + any alphanumeric character and the underscore; this is equivalent to the set + ``[a-zA-Z0-9_]``. With :const:`LOCALE`, it will match the set ``[0-9_]`` plus + whatever characters are defined as alphanumeric for the current locale. If + :const:`UNICODE` is set, this will match the characters ``[0-9_]`` plus whatever + is classified as alphanumeric in the Unicode character properties database. + +``\W`` + When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches + any non-alphanumeric character; this is equivalent to the set ``[^a-zA-Z0-9_]``. + With :const:`LOCALE`, it will match any character not in the set ``[0-9_]``, and + not defined as alphanumeric for the current locale. If :const:`UNICODE` is set, + this will match anything other than ``[0-9_]`` and characters marked as + alphanumeric in the Unicode character properties database. + +``\Z`` + Matches only at the end of the string. + +Most of the standard escapes supported by Python string literals are also +accepted by the regular expression parser:: + + \a \b \f \n + \r \t \v \x + \\ + +Octal escapes are included in a limited form: If the first digit is a 0, or if +there are three octal digits, it is considered an octal escape. Otherwise, it is +a group reference. As for string literals, octal escapes are always at most +three digits in length. + +.. % Note the lack of a period in the section title; it causes problems +.. % with readers of the GNU info version. See http://www.python.org/sf/581414. + + +.. _matching-searching: + +Matching vs Searching +--------------------- + +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +Python offers two different primitive operations based on regular expressions: +match and search. If you are accustomed to Perl's semantics, the search +operation is what you're looking for. 
See the :func:`search` function and +corresponding method of compiled regular expression objects. + +Note that match may differ from search using a regular expression beginning with +``'^'``: ``'^'`` matches only at the start of the string, or in +:const:`MULTILINE` mode also immediately following a newline. The "match" +operation succeeds only if the pattern matches at the start of the string +regardless of mode, or at the starting position given by the optional *pos* +argument regardless of whether a newline precedes it. + +.. % Examples from Tim Peters: + +:: + + re.compile("a").match("ba", 1) # succeeds + re.compile("^a").search("ba", 1) # fails; 'a' not at start + re.compile("^a").search("\na", 1) # fails; 'a' not at start + re.compile("^a", re.M).search("\na", 1) # succeeds + re.compile("^a", re.M).search("ba", 1) # fails; no preceding \n + + +.. _contents-of-module-re: + +Module Contents +--------------- + +The module defines several functions, constants, and an exception. Some of the +functions are simplified versions of the full featured methods for compiled +regular expressions. Most non-trivial applications always use the compiled +form. + + +.. function:: compile(pattern[, flags]) + + Compile a regular expression pattern into a regular expression object, which can + be used for matching using its :func:`match` and :func:`search` methods, + described below. + + The expression's behaviour can be modified by specifying a *flags* value. + Values can be any of the following variables, combined using bitwise OR (the + ``|`` operator). + + The sequence :: + + prog = re.compile(pat) + result = prog.match(str) + + is equivalent to :: + + result = re.match(pat, str) + + but the version using :func:`compile` is more efficient when the expression will + be used several times in a single program. + + .. % (The compiled version of the last pattern passed to + .. % \function{re.match()} or \function{re.search()} is cached, so + .. 
% programs that use only a single regular expression at a time needn't + .. % worry about compiling regular expressions.) + + +.. data:: I + IGNORECASE + + Perform case-insensitive matching; expressions like ``[A-Z]`` will match + lowercase letters, too. This is not affected by the current locale. + + +.. data:: L + LOCALE + + Make ``\w``, ``\W``, ``\b``, ``\B``, ``\s`` and ``\S`` dependent on the current + locale. + + +.. data:: M + MULTILINE + + When specified, the pattern character ``'^'`` matches at the beginning of the + string and at the beginning of each line (immediately following each newline); + and the pattern character ``'$'`` matches at the end of the string and at the + end of each line (immediately preceding each newline). By default, ``'^'`` + matches only at the beginning of the string, and ``'$'`` only at the end of the + string and immediately before the newline (if any) at the end of the string. + + +.. data:: S + DOTALL + + Make the ``'.'`` special character match any character at all, including a + newline; without this flag, ``'.'`` will match anything *except* a newline. + + +.. data:: U + UNICODE + + Make ``\w``, ``\W``, ``\b``, ``\B``, ``\d``, ``\D``, ``\s`` and ``\S`` dependent + on the Unicode character properties database. + + .. versionadded:: 2.0 + + +.. data:: X + VERBOSE + + This flag allows you to write regular expressions that look nicer. Whitespace + within the pattern is ignored, except when in a character class or preceded by + an unescaped backslash, and, when a line contains a ``'#'`` neither in a + character class nor preceded by an unescaped backslash, all characters from the + leftmost such ``'#'`` through the end of the line are ignored. + + .. % XXX should add an example here + + +.. function:: search(pattern, string[, flags]) + + Scan through *string* looking for a location where the regular expression + *pattern* produces a match, and return a corresponding :class:`MatchObject` + instance.
Return ``None`` if no position in the string matches the pattern; note + that this is different from finding a zero-length match at some point in the + string. + + +.. function:: match(pattern, string[, flags]) + + If zero or more characters at the beginning of *string* match the regular + expression *pattern*, return a corresponding :class:`MatchObject` instance. + Return ``None`` if the string does not match the pattern; note that this is + different from a zero-length match. + + .. note:: + + If you want to locate a match anywhere in *string*, use :meth:`search` instead. + + +.. function:: split(pattern, string[, maxsplit=0]) + + Split *string* by the occurrences of *pattern*. If capturing parentheses are + used in *pattern*, then the text of all groups in the pattern are also returned + as part of the resulting list. If *maxsplit* is nonzero, at most *maxsplit* + splits occur, and the remainder of the string is returned as the final element + of the list. (Incompatibility note: in the original Python 1.5 release, + *maxsplit* was ignored. This has been fixed in later releases.) :: + + >>> re.split('\W+', 'Words, words, words.') + ['Words', 'words', 'words', ''] + >>> re.split('(\W+)', 'Words, words, words.') + ['Words', ', ', 'words', ', ', 'words', '.', ''] + >>> re.split('\W+', 'Words, words, words.', 1) + ['Words', 'words, words.'] + + +.. function:: findall(pattern, string[, flags]) + + Return a list of all non-overlapping matches of *pattern* in *string*. If one + or more groups are present in the pattern, return a list of groups; this will be + a list of tuples if the pattern has more than one group. Empty matches are + included in the result unless they touch the beginning of another match. + + .. versionadded:: 1.5.2 + + .. versionchanged:: 2.4 + Added the optional flags argument. + + +.. function:: finditer(pattern, string[, flags]) + + Return an iterator over all non-overlapping matches for the RE *pattern* in + *string*. 
For each match, the iterator returns a match object. Empty matches + are included in the result unless they touch the beginning of another match. + + .. versionadded:: 2.2 + + .. versionchanged:: 2.4 + Added the optional flags argument. + + +.. function:: sub(pattern, repl, string[, count]) + + Return the string obtained by replacing the leftmost non-overlapping occurrences + of *pattern* in *string* by the replacement *repl*. If the pattern isn't found, + *string* is returned unchanged. *repl* can be a string or a function; if it is + a string, any backslash escapes in it are processed. That is, ``\n`` is + converted to a single newline character, ``\r`` is converted to a carriage return, and + so forth. Unknown escapes such as ``\j`` are left alone. Backreferences, such + as ``\6``, are replaced with the substring matched by group 6 in the pattern. + For example:: + + >>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):', + ... r'static PyObject*\npy_\1(void)\n{', + ... 'def myfunc():') + 'static PyObject*\npy_myfunc(void)\n{' + + If *repl* is a function, it is called for every non-overlapping occurrence of + *pattern*. The function takes a single match object argument, and returns the + replacement string. For example:: + + >>> def dashrepl(matchobj): + ... if matchobj.group(0) == '-': return ' ' + ... else: return '-' + >>> re.sub('-{1,2}', dashrepl, 'pro----gram-files') + 'pro--gram files' + + The pattern may be a string or an RE object; if you need to specify regular + expression flags, you must use an RE object, or use embedded modifiers in a + pattern; for example, ``sub("(?i)b+", "x", "bbbb BBBB")`` returns ``'x x'``. + + The optional argument *count* is the maximum number of pattern occurrences to be + replaced; *count* must be a non-negative integer. If omitted or zero, all + occurrences will be replaced. Empty matches for the pattern are replaced only + when not adjacent to a previous match, so ``sub('x*', '-', 'abc')`` returns + ``'-a-b-c-'``.
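The effect of *count* and of empty matches can be seen in a short sketch. It is an illustration only; the sample string ``'a-b-c-d'`` and the variable names are made up for this example, and *count* is passed by keyword for clarity.

```python
import re

# count limits how many of the leftmost matches are replaced.
one = re.sub(r'-', ' ', 'a-b-c-d', count=1)   # only the first '-' is replaced
two = re.sub(r'-', ' ', 'a-b-c-d', count=2)   # the first two are replaced

# Omitting count (or passing 0) replaces every occurrence.
everything = re.sub(r'-', ' ', 'a-b-c-d')

# 'x*' matches the empty string at every position of 'abc',
# so the replacement is inserted around each character.
dashed = re.sub('x*', '-', 'abc')

print(one, two, everything, dashed)
```

Here ``one`` is ``'a b-c-d'``, ``two`` is ``'a b c-d'``, ``everything`` is ``'a b c d'``, and ``dashed`` is ``'-a-b-c-'``.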
+ + In addition to character escapes and backreferences as described above, + ``\g<name>`` will use the substring matched by the group named ``name``, as + defined by the ``(?P<name>...)`` syntax. ``\g<number>`` uses the corresponding + group number; ``\g<2>`` is therefore equivalent to ``\2``, but isn't ambiguous + in a replacement such as ``\g<2>0``. ``\20`` would be interpreted as a + reference to group 20, not a reference to group 2 followed by the literal + character ``'0'``. The backreference ``\g<0>`` substitutes in the entire + substring matched by the RE. + + +.. function:: subn(pattern, repl, string[, count]) + + Perform the same operation as :func:`sub`, but return a tuple ``(new_string, + number_of_subs_made)``. + + +.. function:: escape(string) + + Return *string* with all non-alphanumerics backslashed; this is useful if you + want to match an arbitrary literal string that may have regular expression + metacharacters in it. + + +.. exception:: error + + Exception raised when a string passed to one of the functions here is not a + valid regular expression (for example, it might contain unmatched parentheses) + or when some other error occurs during compilation or matching. It is never an + error if a string contains no match for a pattern. + + +.. _re-objects: + +Regular Expression Objects +-------------------------- + +Compiled regular expression objects support the following methods and +attributes: + + +.. method:: RegexObject.match(string[, pos[, endpos]]) + + If zero or more characters at the beginning of *string* match this regular + expression, return a corresponding :class:`MatchObject` instance. Return + ``None`` if the string does not match the pattern; note that this is different + from a zero-length match. + + .. note:: + + If you want to locate a match anywhere in *string*, use :meth:`search` instead. + + The optional second parameter *pos* gives an index in the string where the + search is to start; it defaults to ``0``. 
This is not completely equivalent to + slicing the string; the ``'^'`` pattern character matches at the real beginning + of the string and at positions just after a newline, but not necessarily at the + index where the search is to start. + + The optional parameter *endpos* limits how far the string will be searched; it + will be as if the string is *endpos* characters long, so only the characters + from *pos* to ``endpos - 1`` will be searched for a match. If *endpos* is less + than *pos*, no match will be found; otherwise, if *rx* is a compiled regular + expression object, ``rx.match(string, 0, 50)`` is equivalent to + ``rx.match(string[:50], 0)``. + + +.. method:: RegexObject.search(string[, pos[, endpos]]) + + Scan through *string* looking for a location where this regular expression + produces a match, and return a corresponding :class:`MatchObject` instance. + Return ``None`` if no position in the string matches the pattern; note that this + is different from finding a zero-length match at some point in the string. + + The optional *pos* and *endpos* parameters have the same meaning as for the + :meth:`match` method. + + +.. method:: RegexObject.split(string[, maxsplit=0]) + + Identical to the :func:`split` function, using the compiled pattern. + + +.. method:: RegexObject.findall(string[, pos[, endpos]]) + + Identical to the :func:`findall` function, using the compiled pattern. + + +.. method:: RegexObject.finditer(string[, pos[, endpos]]) + + Identical to the :func:`finditer` function, using the compiled pattern. + + +.. method:: RegexObject.sub(repl, string[, count=0]) + + Identical to the :func:`sub` function, using the compiled pattern. + + +.. method:: RegexObject.subn(repl, string[, count=0]) + + Identical to the :func:`subn` function, using the compiled pattern. + + +.. attribute:: RegexObject.flags + + The flags argument used when the RE object was compiled, or ``0`` if no flags + were provided. + + +..
attribute:: RegexObject.groupindex + + A dictionary mapping any symbolic group names defined by ``(?P<id>)`` to group + numbers. The dictionary is empty if no symbolic groups were used in the + pattern. + + +.. attribute:: RegexObject.pattern + + The pattern string from which the RE object was compiled. + + +.. _match-objects: + +Match Objects +------------- + +:class:`MatchObject` instances support the following methods and attributes: + + +.. method:: MatchObject.expand(template) + + Return the string obtained by doing backslash substitution on the template + string *template*, as done by the :meth:`sub` method. Escapes such as ``\n`` are + converted to the appropriate characters, and numeric backreferences (``\1``, + ``\2``) and named backreferences (``\g<1>``, ``\g<name>``) are replaced by the + contents of the corresponding group. + + +.. method:: MatchObject.group([group1, ...]) + + Returns one or more subgroups of the match. If there is a single argument, the + result is a single string; if there are multiple arguments, the result is a + tuple with one item per argument. Without arguments, *group1* defaults to zero + (the whole match is returned). If a *groupN* argument is zero, the corresponding + return value is the entire matching string; if it is in the inclusive range + [1..99], it is the string matching the corresponding parenthesized group. If a + group number is negative or larger than the number of groups defined in the + pattern, an :exc:`IndexError` exception is raised. If a group is contained in a + part of the pattern that did not match, the corresponding result is ``None``. + If a group is contained in a part of the pattern that matched multiple times, + the last match is returned. + + If the regular expression uses the ``(?P<name>...)`` syntax, the *groupN* + arguments may also be strings identifying groups by their group name. If a + string argument is not used as a group name in the pattern, an :exc:`IndexError` + exception is raised. 
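The rules above for ``group()`` can be sketched as follows. The pattern and the input string ``'size=;a;b'`` are invented for illustration and are not taken from the module itself.

```python
import re

# Group 1 is named 'key', group 2 is optional, group 3 sits inside a repetition.
m = re.match(r'(?P<key>\w+)=(\d+)?(?:;(\w))*', 'size=;a;b')

whole = m.group()            # no arguments: the entire match
pair = m.group('key', 3)     # several arguments: a tuple, names and numbers mixed
skipped = m.group(2)         # group 2 did not take part in the match
repeated = m.group(3)        # group 3 matched twice; the last match is kept

# A group number larger than the number of groups raises IndexError.
try:
    m.group(4)
except IndexError:
    out_of_range = True
else:
    out_of_range = False

print(whole, pair, skipped, repeated, out_of_range)
```

In this sketch ``whole`` is ``'size=;a;b'``, ``pair`` is ``('size', 'b')``, ``skipped`` is ``None``, and ``repeated`` is ``'b'``.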
+ + A moderately complicated example:: + + m = re.match(r"(?P<int>\d+)\.(\d*)", '3.14') + + After performing this match, ``m.group(1)`` is ``'3'``, as is + ``m.group('int')``, and ``m.group(2)`` is ``'14'``. + + +.. method:: MatchObject.groups([default]) + + Return a tuple containing all the subgroups of the match, from 1 up to however + many groups are in the pattern. The *default* argument is used for groups that + did not participate in the match; it defaults to ``None``. (Incompatibility + note: in the original Python 1.5 release, if the tuple was one element long, a + string would be returned instead. In later versions (from 1.5.1 on), a + singleton tuple is returned in such cases.) + + +.. method:: MatchObject.groupdict([default]) + + Return a dictionary containing all the *named* subgroups of the match, keyed by + the subgroup name. The *default* argument is used for groups that did not + participate in the match; it defaults to ``None``. + + +.. method:: MatchObject.start([group]) + MatchObject.end([group]) + + Return the indices of the start and end of the substring matched by *group*; + *group* defaults to zero (meaning the whole matched substring). Return ``-1`` if + *group* exists but did not contribute to the match. For a match object *m*, and + a group *g* that did contribute to the match, the substring matched by group *g* + (equivalent to ``m.group(g)``) is :: + + m.string[m.start(g):m.end(g)] + + Note that ``m.start(group)`` will equal ``m.end(group)`` if *group* matched a + null string. For example, after ``m = re.search('b(c?)', 'cba')``, + ``m.start(0)`` is 1, ``m.end(0)`` is 2, ``m.start(1)`` and ``m.end(1)`` are both + 2, and ``m.start(2)`` raises an :exc:`IndexError` exception. + + +.. method:: MatchObject.span([group]) + + For :class:`MatchObject` *m*, return the 2-tuple ``(m.start(group), + m.end(group))``. Note that if *group* did not contribute to the match, this is + ``(-1, -1)``. Again, *group* defaults to zero. + + +.. 
attribute:: MatchObject.pos + + The value of *pos* which was passed to the :func:`search` or :func:`match` + method of the :class:`RegexObject`. This is the index into the string at which + the RE engine started looking for a match. + + +.. attribute:: MatchObject.endpos + + The value of *endpos* which was passed to the :func:`search` or :func:`match` + method of the :class:`RegexObject`. This is the index into the string beyond + which the RE engine will not go. + + +.. attribute:: MatchObject.lastindex + + The integer index of the last matched capturing group, or ``None`` if no group + was matched at all. For example, the expressions ``(a)b``, ``((a)(b))``, and + ``((ab))`` will have ``lastindex == 1`` if applied to the string ``'ab'``, while + the expression ``(a)(b)`` will have ``lastindex == 2``, if applied to the same + string. + + +.. attribute:: MatchObject.lastgroup + + The name of the last matched capturing group, or ``None`` if the group didn't + have a name, or if no group was matched at all. + + +.. attribute:: MatchObject.re + + The regular expression object whose :meth:`match` or :meth:`search` method + produced this :class:`MatchObject` instance. + + +.. attribute:: MatchObject.string + + The string passed to :func:`match` or :func:`search`. + + +Examples +-------- + +**Simulating scanf()** + +.. index:: single: scanf() + +Python does not currently have an equivalent to :cfunc:`scanf`. Regular +expressions are generally more powerful, though also more verbose, than +:cfunc:`scanf` format strings. The table below offers some more-or-less +equivalent mappings between :cfunc:`scanf` format tokens and regular +expressions. 
+ ++--------------------------------+---------------------------------------------+ +| :cfunc:`scanf` Token | Regular Expression | ++================================+=============================================+ +| ``%c`` | ``.`` | ++--------------------------------+---------------------------------------------+ +| ``%5c`` | ``.{5}`` | ++--------------------------------+---------------------------------------------+ +| ``%d`` | ``[-+]?\d+`` | ++--------------------------------+---------------------------------------------+ +| ``%e``, ``%E``, ``%f``, ``%g`` | ``[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?`` | ++--------------------------------+---------------------------------------------+ +| ``%i`` | ``[-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)`` | ++--------------------------------+---------------------------------------------+ +| ``%o`` | ``0[0-7]*`` | ++--------------------------------+---------------------------------------------+ +| ``%s`` | ``\S+`` | ++--------------------------------+---------------------------------------------+ +| ``%u`` | ``\d+`` | ++--------------------------------+---------------------------------------------+ +| ``%x``, ``%X`` | ``0[xX][\dA-Fa-f]+`` | ++--------------------------------+---------------------------------------------+ + +To extract the filename and numbers from a string like :: + + /usr/sbin/sendmail - 0 errors, 4 warnings + +you would use a :cfunc:`scanf` format like :: + + %s - %d errors, %d warnings + +The equivalent regular expression would be :: + + (\S+) - (\d+) errors, (\d+) warnings + +**Avoiding recursion** + +If you create regular expressions that require the engine to perform a lot of +recursion, you may encounter a :exc:`RuntimeError` exception with the message +``maximum recursion limit`` exceeded. For example, :: + + >>> import re + >>> s = 'Begin ' + 1000*'a very long string ' + 'end' + >>> re.match('Begin (\w| )*? end', s).end() + Traceback (most recent call last): + File "<stdin>", line 1, in ? 
+ File "/usr/local/lib/python2.5/re.py", line 132, in match + return _compile(pattern, flags).match(string) + RuntimeError: maximum recursion limit exceeded + +You can often restructure your regular expression to avoid recursion. + +Starting with Python 2.3, simple uses of the ``*?`` pattern are special-cased to +avoid recursion. Thus, the above regular expression can avoid recursion by +being recast as ``Begin [a-zA-Z0-9_ ]*?end``. As a further benefit, such +regular expressions will run faster than their recursive equivalents. + diff --git a/Doc/library/readline.rst b/Doc/library/readline.rst new file mode 100644 index 0000000..9a40747 --- /dev/null +++ b/Doc/library/readline.rst @@ -0,0 +1,222 @@ + +:mod:`readline` --- GNU readline interface +========================================== + +.. module:: readline + :platform: Unix + :synopsis: GNU readline support for Python. +.. sectionauthor:: Skip Montanaro <skip@mojam.com> + + +The :mod:`readline` module defines a number of functions to facilitate +completion and reading/writing of history files from the Python interpreter. +This module can be used directly or via the :mod:`rlcompleter` module. Settings +made using this module affect the behaviour of both the interpreter's +interactive prompt and the prompts offered by the :func:`raw_input` and +:func:`input` built-in functions. + +The :mod:`readline` module defines the following functions: + + +.. function:: parse_and_bind(string) + + Parse and execute a single line of a readline init file. + + +.. function:: get_line_buffer() + + Return the current contents of the line buffer. + + +.. function:: insert_text(string) + + Insert text into the command line. + + +.. function:: read_init_file([filename]) + + Parse a readline initialization file. The default filename is the last filename + used. + + +.. function:: read_history_file([filename]) + + Load a readline history file. The default filename is :file:`~/.history`. + + +.. 
function:: write_history_file([filename]) + + Save a readline history file. The default filename is :file:`~/.history`. + + +.. function:: clear_history() + + Clear the current history. (Note: this function is not available if the + installed version of GNU readline doesn't support it.) + + .. versionadded:: 2.4 + + +.. function:: get_history_length() + + Return the desired length of the history file. Negative values imply unlimited + history file size. + + +.. function:: set_history_length(length) + + Set the number of lines to save in the history file. :func:`write_history_file` + uses this value to truncate the history file when saving. Negative values imply + unlimited history file size. + + +.. function:: get_current_history_length() + + Return the number of lines currently in the history. (This is different from + :func:`get_history_length`, which returns the maximum number of lines that will + be written to a history file.) + + .. versionadded:: 2.3 + + +.. function:: get_history_item(index) + + Return the current contents of history item at *index*. + + .. versionadded:: 2.3 + + +.. function:: remove_history_item(pos) + + Remove history item specified by its position from the history. + + .. versionadded:: 2.4 + + +.. function:: replace_history_item(pos, line) + + Replace history item specified by its position with the given line. + + .. versionadded:: 2.4 + + +.. function:: redisplay() + + Change what's displayed on the screen to reflect the current contents of the + line buffer. + + .. versionadded:: 2.3 + + +.. function:: set_startup_hook([function]) + + Set or remove the startup_hook function. If *function* is specified, it will be + used as the new startup_hook function; if omitted or ``None``, any hook function + already installed is removed. The startup_hook function is called with no + arguments just before readline prints the first prompt. + + +.. function:: set_pre_input_hook([function]) + + Set or remove the pre_input_hook function. 
If *function* is specified, it will + be used as the new pre_input_hook function; if omitted or ``None``, any hook + function already installed is removed. The pre_input_hook function is called + with no arguments after the first prompt has been printed and just before + readline starts reading input characters. + + +.. function:: set_completer([function]) + + Set or remove the completer function. If *function* is specified, it will be + used as the new completer function; if omitted or ``None``, any completer + function already installed is removed. The completer function is called as + ``function(text, state)``, for *state* in ``0``, ``1``, ``2``, ..., until it + returns a non-string value. It should return the next possible completion + starting with *text*. + + +.. function:: get_completer() + + Get the completer function, or ``None`` if no completer function has been set. + + .. versionadded:: 2.3 + + +.. function:: get_begidx() + + Get the beginning index of the readline tab-completion scope. + + +.. function:: get_endidx() + + Get the ending index of the readline tab-completion scope. + + +.. function:: set_completer_delims(string) + + Set the readline word delimiters for tab-completion. + + +.. function:: get_completer_delims() + + Get the readline word delimiters for tab-completion. + + +.. function:: add_history(line) + + Append a line to the history buffer, as if it was the last line typed. + + +.. seealso:: + + Module :mod:`rlcompleter` + Completion of Python identifiers at the interactive prompt. + + +.. _readline-example: + +Example +------- + +The following example demonstrates how to use the :mod:`readline` module's +history reading and writing functions to automatically load and save a history +file named :file:`.pyhist` from the user's home directory. The code below would +normally be executed automatically during interactive sessions from the user's +:envvar:`PYTHONSTARTUP` file. 
:: + + import os + import readline + histfile = os.path.join(os.environ["HOME"], ".pyhist") + try: + readline.read_history_file(histfile) + except IOError: + pass + import atexit + atexit.register(readline.write_history_file, histfile) + del os, histfile + +The following example extends the :class:`code.InteractiveConsole` class to +support history save/restore. :: + + import code + import readline + import atexit + import os + + class HistoryConsole(code.InteractiveConsole): + def __init__(self, locals=None, filename="<console>", + histfile=os.path.expanduser("~/.console-history")): + code.InteractiveConsole.__init__(self, locals, filename) + self.init_history(histfile) + + def init_history(self, histfile): + readline.parse_and_bind("tab: complete") + if hasattr(readline, "read_history_file"): + try: + readline.read_history_file(histfile) + except IOError: + pass + atexit.register(self.save_history, histfile) + + def save_history(self, histfile): + readline.write_history_file(histfile) + diff --git a/Doc/library/repr.rst b/Doc/library/repr.rst new file mode 100644 index 0000000..493e2b3 --- /dev/null +++ b/Doc/library/repr.rst @@ -0,0 +1,136 @@ + +:mod:`repr` --- Alternate :func:`repr` implementation +===================================================== + +.. module:: repr + :synopsis: Alternate repr() implementation with size limits. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +The :mod:`repr` module provides a means for producing object representations +with limits on the size of the resulting strings. This is used in the Python +debugger and may be useful in other contexts as well. + +This module provides a class, an instance, and a function: + + +.. class:: Repr() + + Class which provides formatting services useful in implementing functions + similar to the built-in :func:`repr`; size limits for different object types + are added to avoid the generation of representations which are excessively long. + + +.. 
data:: aRepr + + This is an instance of :class:`Repr` which is used to provide the :func:`repr` + function described below. Changing the attributes of this object will affect + the size limits used by :func:`repr` and the Python debugger. + + +.. function:: repr(obj) + + This is the :meth:`repr` method of ``aRepr``. It returns a string similar to + that returned by the built-in function of the same name, but with limits on + most sizes. + + +.. _repr-objects: + +Repr Objects +------------ + +:class:`Repr` instances provide several members which can be used to provide +size limits for the representations of different object types, and methods +which format specific object types. + + +.. attribute:: Repr.maxlevel + + Depth limit on the creation of recursive representations. The default is ``6``. + + +.. attribute:: Repr.maxdict + Repr.maxlist + Repr.maxtuple + Repr.maxset + Repr.maxfrozenset + Repr.maxdeque + Repr.maxarray + + Limits on the number of entries represented for the named object type. The + default is ``4`` for :attr:`maxdict`, ``5`` for :attr:`maxarray`, and ``6`` for + the others. + + .. versionadded:: 2.4 + :attr:`maxset`, :attr:`maxfrozenset`, and :attr:`set`. + + +.. attribute:: Repr.maxlong + + Maximum number of characters in the representation for a long integer. Digits + are dropped from the middle. The default is ``40``. + + +.. attribute:: Repr.maxstring + + Limit on the number of characters in the representation of the string. Note + that the "normal" representation of the string is used as the character source: + if escape sequences are needed in the representation, these may be mangled when + the representation is shortened. The default is ``30``. + + +.. attribute:: Repr.maxother + + This limit is used to control the size of object types for which no specific + formatting method is available on the :class:`Repr` object. It is applied in a + similar manner as :attr:`maxstring`. The default is ``20``. + + +.. 
method:: Repr.repr(obj) + + The equivalent to the built-in :func:`repr` that uses the formatting imposed by + the instance. + + +.. method:: Repr.repr1(obj, level) + + Recursive implementation used by :meth:`repr`. This uses the type of *obj* to + determine which formatting method to call, passing it *obj* and *level*. The + type-specific methods should call :meth:`repr1` to perform recursive formatting, + with ``level - 1`` for the value of *level* in the recursive call. + + +.. method:: Repr.repr_TYPE(obj, level) + :noindex: + + Formatting methods for specific types are implemented as methods with a name + based on the type name. In the method name, **TYPE** is replaced by + ``'_'.join(type(obj).__name__.split())``. Dispatch to these + methods is handled by :meth:`repr1`. Type-specific methods which need to + recursively format a value should call ``self.repr1(subobj, level - 1)``. + + +.. _subclassing-reprs: + +Subclassing Repr Objects +------------------------ + +The use of dynamic dispatching by :meth:`Repr.repr1` allows subclasses of +:class:`Repr` to add support for additional built-in object types or to modify +the handling of types already supported. This example shows how special support +for file objects could be added:: + + import repr + import sys + + class MyRepr(repr.Repr): + def repr_file(self, obj, level): + if obj.name in ['<stdin>', '<stdout>', '<stderr>']: + return obj.name + else: + return `obj` + + aRepr = MyRepr() + print aRepr.repr(sys.stdin) # prints '<stdin>' + diff --git a/Doc/library/resource.rst b/Doc/library/resource.rst new file mode 100644 index 0000000..834dace --- /dev/null +++ b/Doc/library/resource.rst @@ -0,0 +1,238 @@ + +:mod:`resource` --- Resource usage information +============================================== + +.. module:: resource + :platform: Unix + :synopsis: An interface to provide resource usage information on the current process. +.. moduleauthor:: Jeremy Hylton <jeremy@alum.mit.edu> +.. 
sectionauthor:: Jeremy Hylton <jeremy@alum.mit.edu> + + +This module provides basic mechanisms for measuring and controlling system +resources utilized by a program. + +Symbolic constants are used to specify particular system resources and to +request usage information about either the current process or its children. + +A single exception is defined for errors: + + +.. exception:: error + + The functions described below may raise this error if the underlying system call + fails unexpectedly. + + +Resource Limits +--------------- + +Resource usage can be limited using the :func:`setrlimit` function described +below. Each resource is controlled by a pair of limits: a soft limit and a hard +limit. The soft limit is the current limit, and may be lowered or raised by a +process over time. The soft limit can never exceed the hard limit. The hard +limit can be lowered to any value greater than the soft limit, but not raised. +(Only processes with the effective UID of the super-user can raise a hard +limit.) + +The specific resources that can be limited are system dependent. They are +described in the :manpage:`getrlimit(2)` man page. The resources listed below +are supported when the underlying operating system supports them; resources +which cannot be checked or controlled by the operating system are not defined in +this module for those platforms. + + +.. function:: getrlimit(resource) + + Returns a tuple ``(soft, hard)`` with the current soft and hard limits of + *resource*. Raises :exc:`ValueError` if an invalid resource is specified, or + :exc:`error` if the underlying system call fails unexpectedly. + + +.. function:: setrlimit(resource, limits) + + Sets new limits of consumption of *resource*. The *limits* argument must be a + tuple ``(soft, hard)`` of two integers describing the new limits. A value of + ``-1`` can be used to specify the maximum possible upper limit. 
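A minimal sketch of the soft/hard pair in practice, assuming a Unix platform on which :const:`RLIMIT_CORE` is defined. The choice of :const:`RLIMIT_CORE` and of a zero soft limit is illustrative only; the hard limit is read back and passed through unchanged so that no privilege is needed:

```python
import resource

# Read the current (soft, hard) pair for core file size.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

# Lower only the soft limit; the hard limit is passed back unchanged,
# which an unprivileged process is always allowed to do.
resource.setrlimit(resource.RLIMIT_CORE, (0, hard))

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_CORE)
```

Lowering a soft limit is reversible up to the hard limit, whereas a lowered hard limit cannot be raised again by an unprivileged process, which is why this sketch leaves it alone.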
+ + Raises :exc:`ValueError` if an invalid resource is specified, if the new soft + limit exceeds the hard limit, or if a process tries to raise its hard limit + (unless the process has an effective UID of super-user). Can also raise + :exc:`error` if the underlying system call fails. + +These symbols define resources whose consumption can be controlled using the +:func:`setrlimit` and :func:`getrlimit` functions described above. The values of +these symbols are exactly the constants used by C programs. + +The Unix man page for :manpage:`getrlimit(2)` lists the available resources. +Note that not all systems use the same symbol or same value to denote the same +resource. This module does not attempt to mask platform differences --- symbols +not defined for a platform will not be available from this module on that +platform. + + +.. data:: RLIMIT_CORE + + The maximum size (in bytes) of a core file that the current process can create. + This may result in the creation of a partial core file if a larger core would be + required to contain the entire process image. + + +.. data:: RLIMIT_CPU + + The maximum amount of processor time (in seconds) that a process can use. If + this limit is exceeded, a :const:`SIGXCPU` signal is sent to the process. (See + the :mod:`signal` module documentation for information about how to catch this + signal and do something useful, e.g. flush open files to disk.) + + +.. data:: RLIMIT_FSIZE + + The maximum size of a file which the process may create. + + +.. data:: RLIMIT_DATA + + The maximum size (in bytes) of the process's heap. + + +.. data:: RLIMIT_STACK + + The maximum size (in bytes) of the call stack for the current process. This only + affects the stack of the main thread in a multi-threaded process. + + +.. data:: RLIMIT_RSS + + The maximum resident set size that should be made available to the process. + + +.. data:: RLIMIT_NPROC + + The maximum number of processes the current process may create. + + +.. 
data:: RLIMIT_NOFILE + + The maximum number of open file descriptors for the current process. + + +.. data:: RLIMIT_OFILE + + The BSD name for :const:`RLIMIT_NOFILE`. + + +.. data:: RLIMIT_MEMLOCK + + The maximum address space which may be locked in memory. + + +.. data:: RLIMIT_VMEM + + The largest area of mapped memory which the process may occupy. + + +.. data:: RLIMIT_AS + + The maximum area (in bytes) of address space which may be taken by the process. + + +Resource Usage +-------------- + +These functions are used to retrieve resource usage information: + + +.. function:: getrusage(who) + + This function returns an object that describes the resources consumed by either + the current process or its children, as specified by the *who* parameter. The + *who* parameter should be specified using one of the :const:`RUSAGE_\*` + constants described below. + + The fields of the return value each describe how a particular system resource + has been used, e.g. amount of time spent running in user mode or number of times + the process was swapped out of main memory. Some values are dependent on the + clock tick interval, e.g. the amount of memory the process is using. + + For backward compatibility, the return value is also accessible as a tuple of 16 + elements. + + The fields :attr:`ru_utime` and :attr:`ru_stime` of the return value are + floating point values representing the amount of time spent executing in user + mode and the amount of time spent executing in system mode, respectively. The + remaining values are integers. Consult the :manpage:`getrusage(2)` man page for + detailed information about these values. 
A brief summary is presented here: + + +--------+---------------------+-------------------------------+ + | Index | Field | Resource | + +========+=====================+===============================+ + | ``0`` | :attr:`ru_utime` | time in user mode (float) | + +--------+---------------------+-------------------------------+ + | ``1`` | :attr:`ru_stime` | time in system mode (float) | + +--------+---------------------+-------------------------------+ + | ``2`` | :attr:`ru_maxrss` | maximum resident set size | + +--------+---------------------+-------------------------------+ + | ``3`` | :attr:`ru_ixrss` | shared memory size | + +--------+---------------------+-------------------------------+ + | ``4`` | :attr:`ru_idrss` | unshared memory size | + +--------+---------------------+-------------------------------+ + | ``5`` | :attr:`ru_isrss` | unshared stack size | + +--------+---------------------+-------------------------------+ + | ``6`` | :attr:`ru_minflt` | page faults not requiring I/O | + +--------+---------------------+-------------------------------+ + | ``7`` | :attr:`ru_majflt` | page faults requiring I/O | + +--------+---------------------+-------------------------------+ + | ``8`` | :attr:`ru_nswap` | number of swap outs | + +--------+---------------------+-------------------------------+ + | ``9`` | :attr:`ru_inblock` | block input operations | + +--------+---------------------+-------------------------------+ + | ``10`` | :attr:`ru_oublock` | block output operations | + +--------+---------------------+-------------------------------+ + | ``11`` | :attr:`ru_msgsnd` | messages sent | + +--------+---------------------+-------------------------------+ + | ``12`` | :attr:`ru_msgrcv` | messages received | + +--------+---------------------+-------------------------------+ + | ``13`` | :attr:`ru_nsignals` | signals received | + +--------+---------------------+-------------------------------+ + | ``14`` | :attr:`ru_nvcsw` | voluntary context switches | + 
+--------+---------------------+-------------------------------+ + | ``15`` | :attr:`ru_nivcsw` | involuntary context switches | + +--------+---------------------+-------------------------------+ + + This function will raise a :exc:`ValueError` if an invalid *who* parameter is + specified. It may also raise an :exc:`error` exception in unusual circumstances. + + .. versionchanged:: 2.3 + Added access to values as attributes of the returned object. + + +.. function:: getpagesize() + + Returns the number of bytes in a system page. (This need not be the same as the + hardware page size.) This function is useful for determining the number of bytes + of memory a process is using. The third element of the tuple returned by + :func:`getrusage` describes memory usage in pages; multiplying by page size + produces number of bytes. + +The following :const:`RUSAGE_\*` symbols are passed to the :func:`getrusage` +function to specify which processes information should be provided for. + + +.. data:: RUSAGE_SELF + + :const:`RUSAGE_SELF` should be used to request information pertaining only to + the process itself. + + +.. data:: RUSAGE_CHILDREN + + Pass to :func:`getrusage` to request resource information for child processes of + the calling process. + + +.. data:: RUSAGE_BOTH + + Pass to :func:`getrusage` to request resources consumed by both the current + process and child processes. May not be available on all systems. + diff --git a/Doc/library/rfc822.rst b/Doc/library/rfc822.rst new file mode 100644 index 0000000..fa25ba5 --- /dev/null +++ b/Doc/library/rfc822.rst @@ -0,0 +1,351 @@ + +:mod:`rfc822` --- Parse RFC 2822 mail headers +============================================= + +.. module:: rfc822 + :synopsis: Parse 2822 style mail messages. + + +.. deprecated:: 2.3 + The :mod:`email` package should be used in preference to the :mod:`rfc822` + module. This module is present only to maintain backward compatibility. 
+ +This module defines a class, :class:`Message`, which represents an "email +message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages +consist of a collection of message headers, and a message body. This module +also defines a helper class :class:`AddressList` for parsing :rfc:`2822` +addresses. Please refer to the RFC for information on the specific syntax of +:rfc:`2822` messages. + +.. index:: module: mailbox + +The :mod:`mailbox` module provides classes to read mailboxes produced by +various end-user mail programs. + + +.. class:: Message(file[, seekable]) + + A :class:`Message` instance is instantiated with an input object as parameter. + Message relies only on the input object having a :meth:`readline` method; in + particular, ordinary file objects qualify. Instantiation reads headers from the + input object up to a delimiter line (normally a blank line) and stores them in + the instance. The message body, following the headers, is not consumed. + + This class can work with any input object that supports a :meth:`readline` + method. If the input object has seek and tell capability, the + :meth:`rewindbody` method will work; also, illegal lines will be pushed back + onto the input stream. If the input object lacks seek but has an :meth:`unread` + method that can push back a line of input, :class:`Message` will use that to + push back illegal lines. Thus this class can be used to parse messages coming + from a buffered stream. + + The optional *seekable* argument is provided as a workaround for certain stdio + libraries in which :cfunc:`tell` discards buffered data before discovering that + the :cfunc:`lseek` system call doesn't work. For maximum portability, you + should set the seekable argument to zero to prevent that initial :meth:`tell` + when passing in an unseekable object such as a file object created from a socket + object. 
+ + Input lines as read from the file may either be terminated by CR-LF or by a + single linefeed; a terminating CR-LF is replaced by a single linefeed before the + line is stored. + + All header matching is done independent of upper or lower case; e.g. + ``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result. + + +.. class:: AddressList(field) + + You may instantiate the :class:`AddressList` helper class using a single string + parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The + parameter ``None`` yields an empty list.) + + +.. function:: quote(str) + + Return a new string with backslashes in *str* replaced by two backslashes and + double quotes replaced by backslash-double quote. + + +.. function:: unquote(str) + + Return a new string which is an *unquoted* version of *str*. If *str* ends and + begins with double quotes, they are stripped off. Likewise if *str* ends and + begins with angle brackets, they are stripped off. + + +.. function:: parseaddr(address) + + Parse *address*, which should be the value of some address-containing field such + as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and + "email address" parts. Returns a tuple of that information, unless the parse + fails, in which case a 2-tuple ``(None, None)`` is returned. + + +.. function:: dump_address_pair(pair) + + The inverse of :meth:`parseaddr`, this takes a 2-tuple of the form ``(realname, + email_address)`` and returns the string value suitable for a :mailheader:`To` or + :mailheader:`Cc` header. If the first element of *pair* is false, then the + second element is returned unmodified. + + +.. function:: parsedate(date) + + Attempts to parse a date according to the rules in :rfc:`2822`. However, some + mailers don't follow that format as specified, so :func:`parsedate` tries to + guess correctly in such cases. *date* is a string containing an :rfc:`2822` + date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. 
If it succeeds in parsing + the date, :func:`parsedate` returns a 9-tuple that can be passed directly to + :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6, + 7, and 8 of the result tuple are not usable. + + +.. function:: parsedate_tz(date) + + Performs the same function as :func:`parsedate`, but returns either ``None`` or + a 10-tuple; the first 9 elements make up a tuple that can be passed directly to + :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC + (which is the official term for Greenwich Mean Time). (Note that the sign of + the timezone offset is the opposite of the sign of the ``time.timezone`` + variable for the same timezone; the latter variable follows the POSIX standard + while this module follows :rfc:`2822`.) If the input string has no timezone, + the last element of the tuple returned is ``None``. Note that indexes 6, 7, and + 8 of the result tuple are not usable. + + +.. function:: mktime_tz(tuple) + + Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If + the timezone item in the tuple is ``None``, assume local time. Minor + deficiency: this first interprets the first 8 elements as a local time and then + compensates for the timezone difference; this may yield a slight error around + daylight savings time switch dates. Not enough to worry about for common use. + + +.. seealso:: + + Module :mod:`email` + Comprehensive email handling package; supersedes the :mod:`rfc822` module. + + Module :mod:`mailbox` + Classes to read various mailbox formats produced by end-user mail programs. + + Module :mod:`mimetools` + Subclass of :class:`rfc822.Message` that handles MIME encoded messages. + + +.. _message-objects: + +Message Objects +--------------- + +A :class:`Message` instance has the following methods: + + +.. method:: Message.rewindbody() + + Seek to the start of the message body. This only works if the file object is + seekable. + + +.. 
method:: Message.isheader(line) + + Returns a line's canonicalized fieldname (the dictionary key that will be used + to index it) if the line is a legal :rfc:`2822` header; otherwise returns + ``None`` (implying that parsing should stop here and the line be pushed back on + the input stream). It is sometimes useful to override this method in a + subclass. + + +.. method:: Message.islast(line) + + Return true if the given line is a delimiter on which Message should stop. The + delimiter line is consumed, and the file object's read location positioned + immediately after it. By default this method just checks that the line is + blank, but you can override it in a subclass. + + +.. method:: Message.iscomment(line) + + Return ``True`` if the given line should be ignored entirely, just skipped. By + default this is a stub that always returns ``False``, but you can override it in + a subclass. + + +.. method:: Message.getallmatchingheaders(name) + + Return a list of lines consisting of all headers matching *name*, if any. Each + physical line, whether it is a continuation line or not, is a separate list + item. Return the empty list if no header matches *name*. + + +.. method:: Message.getfirstmatchingheader(name) + + Return a list of lines comprising the first header matching *name*, and its + continuation line(s), if any. Return ``None`` if there is no header matching + *name*. + + +.. method:: Message.getrawheader(name) + + Return a single string consisting of the text after the colon in the first + header matching *name*. This includes leading whitespace, the trailing + linefeed, and internal linefeeds and whitespace if any continuation + line(s) were present. Return ``None`` if there is no header matching *name*. + + +.. method:: Message.getheader(name[, default]) + + Like ``getrawheader(name)``, but strip leading and trailing whitespace. + Internal whitespace is not stripped. 
The optional *default* argument can be + used to specify a different default to be returned when there is no header + matching *name*. + + +.. method:: Message.get(name[, default]) + + An alias for :meth:`getheader`, to make the interface more compatible with + regular dictionaries. + + +.. method:: Message.getaddr(name) + + Return a pair ``(full name, email address)`` parsed from the string returned by + ``getheader(name)``. If no header matching *name* exists, return ``(None, + None)``; otherwise both the full name and the address are (possibly empty) + strings. + + Example: If *m*'s first :mailheader:`From` header contains the string + ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair + ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen + <jack@cwi.nl>'`` instead, it would yield the exact same result. + + +.. method:: Message.getaddrlist(name) + + This is similar to ``getaddr(list)``, but parses a header containing a list of + email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full + name, email address)`` pairs (even if there was only one address in the header). + If there is no header matching *name*, return an empty list. + + If multiple headers exist that match the named header (e.g. if there are several + :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines + the named headers contain are also parsed. + + +.. method:: Message.getdate(name) + + Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible + with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If + there is no header matching *name*, or it is unparsable, return ``None``. + + Date parsing appears to be a black art, and not all mailers adhere to the + standard. While it has been tested and found correct on a large collection of + email from many sources, it is still possible that this function may + occasionally yield an incorrect result. + + +.. 
method:: Message.getdate_tz(name) + + Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the + first 9 elements will make a tuple compatible with :func:`time.mktime`, and the + 10th is a number giving the offset of the date's timezone from UTC. Note that + fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is + no header matching *name*, or it is unparsable, return ``None``. + +:class:`Message` instances also support a limited mapping interface. In +particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError` +if there is no matching header; and ``len(m)``, ``m.get(name[, default])``, +``m.has_key(name)``, ``m.keys()``, ``m.values()``, ``m.items()``, and +``m.setdefault(name[, default])`` act as expected, with the one difference +that :meth:`setdefault` uses an empty string as the default value. +:class:`Message` instances also support the mapping writable interface ``m[name] += value`` and ``del m[name]``. :class:`Message` objects do not support the +:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the +mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only +added in Python 2.2.) + +Finally, :class:`Message` instances have some public instance variables: + + +.. attribute:: Message.headers + + A list containing the entire set of header lines, in the order in which they + were read (except that setitem calls may disturb this order). Each line contains + a trailing newline. The blank line terminating the headers is not contained in + the list. + + +.. attribute:: Message.fp + + The file or file-like object passed at instantiation time. This can be used to + read the message content. + + +.. attribute:: Message.unixfrom + + The Unix ``From`` line, if the message had one, or an empty string. This is + needed to regenerate the message in some contexts, such as an ``mbox``\ -style + mailbox file. + + +.. 
_addresslist-objects: + +AddressList Objects +------------------- + +An :class:`AddressList` instance has the following methods: + + +.. method:: AddressList.__len__() + + Return the number of addresses in the address list. + + +.. method:: AddressList.__str__() + + Return a canonicalized string representation of the address list. Addresses are + rendered in "name" <host@domain> form, comma-separated. + + +.. method:: AddressList.__add__(alist) + + Return a new :class:`AddressList` instance that contains all addresses in both + :class:`AddressList` operands, with duplicates removed (set union). + + +.. method:: AddressList.__iadd__(alist) + + In-place version of :meth:`__add__`; turns this :class:`AddressList` instance + into the union of itself and the right-hand instance, *alist*. + + +.. method:: AddressList.__sub__(alist) + + Return a new :class:`AddressList` instance that contains every address in the + left-hand :class:`AddressList` operand that is not present in the right-hand + address operand (set difference). + + +.. method:: AddressList.__isub__(alist) + + In-place version of :meth:`__sub__`, removing addresses in this list which are + also in *alist*. + +Finally, :class:`AddressList` instances have one public instance variable: + + +.. attribute:: AddressList.addresslist + + A list of tuple string pairs, one per address. In each member, the first is the + canonicalized name part, the second is the actual route-address (``'@'``\ + -separated username-host.domain pair). + +.. rubric:: Footnotes + +.. [#] This module originally conformed to :rfc:`822`, hence the name. Since then, + :rfc:`2822` has been released as an update to :rfc:`822`. This module should be + considered :rfc:`2822`\ -conformant, especially in cases where the syntax or + semantics have changed since :rfc:`822`. 
+ diff --git a/Doc/library/rlcompleter.rst b/Doc/library/rlcompleter.rst new file mode 100644 index 0000000..b882cb0 --- /dev/null +++ b/Doc/library/rlcompleter.rst @@ -0,0 +1,65 @@ + +:mod:`rlcompleter` --- Completion function for GNU readline +=========================================================== + +.. module:: rlcompleter + :synopsis: Python identifier completion, suitable for the GNU readline library. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`rlcompleter` module defines a completion function suitable for the +:mod:`readline` module by completing valid Python identifiers and keywords. + +When this module is imported on a Unix platform with the :mod:`readline` module +available, an instance of the :class:`Completer` class is automatically created +and its :meth:`complete` method is set as the :mod:`readline` completer. + +Example:: + + >>> import rlcompleter + >>> import readline + >>> readline.parse_and_bind("tab: complete") + >>> readline. <TAB PRESSED> + readline.__doc__ readline.get_line_buffer readline.read_init_file + readline.__file__ readline.insert_text readline.set_completer + readline.__name__ readline.parse_and_bind + >>> readline. + +The :mod:`rlcompleter` module is designed for use with Python's interactive +mode. A user can add the following lines to his or her initialization file +(identified by the :envvar:`PYTHONSTARTUP` environment variable) to get +automatic :kbd:`Tab` completion:: + + try: + import readline + except ImportError: + print "Module readline not available." + else: + import rlcompleter + readline.parse_and_bind("tab: complete") + +On platforms without :mod:`readline`, the :class:`Completer` class defined by +this module can still be used for custom purposes. + + +.. _completer-objects: + +Completer Objects +----------------- + +Completer objects have the following method: + + +.. method:: Completer.complete(text, state) + + Return the *state*th completion for *text*. 
+ + If called for *text* that doesn't include a period character (``'.'``), it will + complete from names currently defined in :mod:`__main__`, :mod:`__builtin__` and + keywords (as defined by the :mod:`keyword` module). + + If called for a dotted name, it will try to evaluate anything without obvious + side-effects (functions will not be evaluated, but it can generate calls to + :meth:`__getattr__`) up to the last part, and find matches for the rest via the + :func:`dir` function. + diff --git a/Doc/library/robotparser.rst b/Doc/library/robotparser.rst new file mode 100644 index 0000000..1a66955 --- /dev/null +++ b/Doc/library/robotparser.rst @@ -0,0 +1,71 @@ + +:mod:`robotparser` --- Parser for robots.txt +============================================= + +.. module:: robotparser + :synopsis: Loads a robots.txt file and answers questions about fetchability of other URLs. +.. sectionauthor:: Skip Montanaro <skip@mojam.com> + + +.. index:: + single: WWW + single: World Wide Web + single: URL + single: robots.txt + +This module provides a single class, :class:`RobotFileParser`, which answers +questions about whether or not a particular user agent can fetch a URL on the +Web site that published the :file:`robots.txt` file. For more details on the +structure of :file:`robots.txt` files, see +http://www.robotstxt.org/wc/norobots.html. + + +.. class:: RobotFileParser() + + This class provides a set of methods to read, parse and answer questions about a + single :file:`robots.txt` file. + + + .. method:: RobotFileParser.set_url(url) + + Sets the URL referring to a :file:`robots.txt` file. + + + .. method:: RobotFileParser.read() + + Reads the :file:`robots.txt` URL and feeds it to the parser. + + + .. method:: RobotFileParser.parse(lines) + + Parses the lines argument. + + + .. method:: RobotFileParser.can_fetch(useragent, url) + + Returns ``True`` if the *useragent* is allowed to fetch the *url* according to + the rules contained in the parsed :file:`robots.txt` file. 
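A minimal offline sketch of ``parse`` and ``can_fetch``, assuming a current Python 3 interpreter, where this class is importable from ``urllib.robotparser`` rather than from the ``robotparser`` module named here. The rules text, the ``example.com`` URLs and the ``build_parser`` helper are hypothetical and exist only for illustration.

```python
# Assumption: modern Python 3, where RobotFileParser lives in urllib.robotparser.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied inline so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(text):
    """Return a RobotFileParser loaded from an in-memory robots.txt string."""
    rp = RobotFileParser()
    # parse() takes an iterable of lines, not a single string.
    rp.parse(text.splitlines())
    # Record the fetch time explicitly; some versions answer False from
    # can_fetch() until a fetch time has been set.
    rp.modified()
    return rp


rp = build_parser(ROBOTS_TXT)
allowed_home = rp.can_fetch("*", "http://www.example.com/")
allowed_search = rp.can_fetch(
    "*", "http://www.example.com/cgi-bin/search?city=San+Francisco")
print(allowed_home, allowed_search)
```

Feeding the rules through ``parse`` instead of ``set_url`` plus ``read`` keeps the sketch independent of any live web site.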
+ + + .. method:: RobotFileParser.mtime() + + Returns the time the ``robots.txt`` file was last fetched. This is useful for + long-running web spiders that need to check for new ``robots.txt`` files + periodically. + + + .. method:: RobotFileParser.modified() + + Sets the time the ``robots.txt`` file was last fetched to the current time. + +The following example demonstrates basic use of the RobotFileParser class. :: + + >>> import robotparser + >>> rp = robotparser.RobotFileParser() + >>> rp.set_url("http://www.musi-cal.com/robots.txt") + >>> rp.read() + >>> rp.can_fetch("*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco") + False + >>> rp.can_fetch("*", "http://www.musi-cal.com/") + True + diff --git a/Doc/library/runpy.rst b/Doc/library/runpy.rst new file mode 100644 index 0000000..8846973 --- /dev/null +++ b/Doc/library/runpy.rst @@ -0,0 +1,71 @@ +:mod:`runpy` --- Locating and executing Python modules +====================================================== + +.. module:: runpy + :synopsis: Locate and run Python modules without importing them first. +.. moduleauthor:: Nick Coghlan <ncoghlan@gmail.com> + + +.. versionadded:: 2.5 + +The :mod:`runpy` module is used to locate and run Python modules without +importing them first. Its main use is to implement the :option:`-m` command line +switch that allows scripts to be located using the Python module namespace +rather than the filesystem. + +When executed as a script, the module effectively operates as follows:: + + del sys.argv[0] # Remove the runpy module from the arguments + run_module(sys.argv[0], run_name="__main__", alter_sys=True) + +The :mod:`runpy` module provides a single function: + + +.. function:: run_module(mod_name[, init_globals] [, run_name][, alter_sys]) + + Execute the code of the specified module and return the resulting module globals + dictionary. 
The module's code is first located using the standard import + mechanism (refer to PEP 302 for details) and then executed in a fresh module + namespace. + + The optional dictionary argument *init_globals* may be used to pre-populate the + globals dictionary before the code is executed. The supplied dictionary will not + be modified. If any of the special global variables below are defined in the + supplied dictionary, those definitions are overridden by the ``run_module`` + function. + + The special global variables ``__name__``, ``__file__``, ``__loader__`` and + ``__builtins__`` are set in the globals dictionary before the module code is + executed. + + ``__name__`` is set to *run_name* if this optional argument is supplied, and the + *mod_name* argument otherwise. + + ``__loader__`` is set to the PEP 302 module loader used to retrieve the code for + the module (This loader may be a wrapper around the standard import mechanism). + + ``__file__`` is set to the name provided by the module loader. If the loader + does not make filename information available, this variable is set to ``None``. + + ``__builtins__`` is automatically initialised with a reference to the top level + namespace of the :mod:`__builtin__` module. + + If the argument *alter_sys* is supplied and evaluates to ``True``, then + ``sys.argv[0]`` is updated with the value of ``__file__`` and + ``sys.modules[__name__]`` is updated with a new module object for the module + being executed. Note that neither ``sys.argv[0]`` nor ``sys.modules[__name__]`` + are restored to their original values before the function returns -- if client + code needs these values preserved, it must either save them explicitly or + else avoid enabling the automatic alterations to :mod:`sys`. + + Note that this manipulation of :mod:`sys` is not thread-safe. Other threads may + see the partially initialised module, as well as the altered list of arguments. 
+ It is recommended that the :mod:`sys` module be left alone when invoking this + function from threaded code. + + +.. seealso:: + + :pep:`338` - Executing modules as scripts + PEP written and implemented by Nick Coghlan. + diff --git a/Doc/library/sched.rst b/Doc/library/sched.rst new file mode 100644 index 0000000..bf3efbf --- /dev/null +++ b/Doc/library/sched.rst @@ -0,0 +1,104 @@ + +:mod:`sched` --- Event scheduler +================================ + +.. module:: sched + :synopsis: General purpose event scheduler. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +.. % LaTeXed and enhanced from comments in file + +.. index:: single: event scheduling + +The :mod:`sched` module defines a class which implements a general purpose event +scheduler: + + +.. class:: scheduler(timefunc, delayfunc) + + The :class:`scheduler` class defines a generic interface to scheduling events. + It needs two functions to actually deal with the "outside world" --- *timefunc* + should be callable without arguments, and return a number (the "time", in any + units whatsoever). The *delayfunc* function should be callable with one + argument, compatible with the output of *timefunc*, and should delay that many + time units. *delayfunc* will also be called with the argument ``0`` after each + event is run to allow other threads an opportunity to run in multi-threaded + applications. + +Example:: + + >>> import sched, time + >>> s=sched.scheduler(time.time, time.sleep) + >>> def print_time(): print "From print_time", time.time() + ... + >>> def print_some_times(): + ... print time.time() + ... s.enter(5, 1, print_time, ()) + ... s.enter(10, 1, print_time, ()) + ... s.run() + ... print time.time() + ... + >>> print_some_times() + 930343690.257 + From print_time 930343695.274 + From print_time 930343700.273 + 930343700.276 + + +.. _scheduler-objects: + +Scheduler Objects +----------------- + +:class:`scheduler` instances have the following methods: + + +.. 
method:: scheduler.enterabs(time, priority, action, argument) + + Schedule a new event. The *time* argument should be a numeric type compatible + with the return value of the *timefunc* function passed to the constructor. + Events scheduled for the same *time* will be executed in the order of their + *priority*. + + Executing the event means executing ``action(*argument)``. *argument* must be a + sequence holding the parameters for *action*. + + Return value is an event which may be used for later cancellation of the event + (see :meth:`cancel`). + + +.. method:: scheduler.enter(delay, priority, action, argument) + + Schedule an event for *delay* more time units. Other than the relative time, the + other arguments, the effect and the return value are the same as those for + :meth:`enterabs`. + + +.. method:: scheduler.cancel(event) + + Remove the event from the queue. If *event* is not an event currently in the + queue, this method will raise a :exc:`RuntimeError`. + + +.. method:: scheduler.empty() + + Return true if the event queue is empty. + + +.. method:: scheduler.run() + + Run all scheduled events. This function will wait (using the :func:`delayfunc` + function passed to the constructor) for the next event, then execute it and so + on until there are no more scheduled events. + + Either *action* or *delayfunc* can raise an exception. In either case, the + scheduler will maintain a consistent state and propagate the exception. If an + exception is raised by *action*, the event will not be attempted in future calls + to :meth:`run`. + + If a sequence of events takes longer to run than the time available before the + next event, the scheduler will simply fall behind. No events will be dropped; + the calling code is responsible for canceling events which are no longer + pertinent.
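The methods above can be exercised without real waiting by passing a simulated clock as *timefunc* and *delayfunc*. The following is a rough sketch written for a current Python 3 interpreter; the ``FakeClock`` class, the ``record`` callback and the event labels are hypothetical names introduced for illustration. It relies on events with equal times running in ascending *priority* order, so the lower number runs first.

```python
import sched


class FakeClock:
    """Hypothetical stand-in for time.time/time.sleep that never really waits."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        # Plays the role of timefunc: callable without arguments, returns a number.
        return self.now

    def sleep(self, delay):
        # Plays the role of delayfunc: "delay" by advancing the simulated time.
        self.now += delay


clock = FakeClock()
s = sched.scheduler(clock.time, clock.sleep)
log = []


def record(label):
    """Action run by the scheduler; notes the simulated time it ran at."""
    log.append((clock.time(), label))


s.enter(10, 1, record, ("late",))
s.enter(5, 2, record, ("early, priority 2",))
s.enter(5, 1, record, ("early, priority 1",))
doomed = s.enter(7, 1, record, ("cancelled",))

s.cancel(doomed)   # remove the event from the queue before it runs
s.run()            # blocks (here: advances the fake clock) until the queue is empty

for when, label in log:
    print(when, label)
```

Because *argument* is unpacked as ``action(*argument)``, a single parameter still has to be passed as a one-element tuple such as ``("late",)``.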
+ diff --git a/Doc/library/scrolledtext.rst b/Doc/library/scrolledtext.rst new file mode 100644 index 0000000..85456b9 --- /dev/null +++ b/Doc/library/scrolledtext.rst @@ -0,0 +1,32 @@ +:mod:`ScrolledText` --- Scrolled Text Widget +============================================ + +.. module:: ScrolledText + :platform: Tk + :synopsis: Text widget with a vertical scroll bar. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +The :mod:`ScrolledText` module provides a class of the same name which +implements a basic text widget which has a vertical scroll bar configured to do +the "right thing." Using the :class:`ScrolledText` class is a lot easier than +setting up a text widget and scroll bar directly. The constructor is the same +as that of the :class:`Tkinter.Text` class. + +The text widget and scrollbar are packed together in a :class:`Frame`, and the +methods of the :class:`Grid` and :class:`Pack` geometry managers are acquired +from the :class:`Frame` object. This allows the :class:`ScrolledText` widget to +be used directly to achieve most normal geometry management behavior. + +Should more specific control be necessary, the following attributes are +available: + + +.. attribute:: ScrolledText.frame + + The frame which surrounds the text and scroll bar widgets. + + +.. attribute:: ScrolledText.vbar + + The scroll bar widget. diff --git a/Doc/library/select.rst b/Doc/library/select.rst new file mode 100644 index 0000000..f68a0da --- /dev/null +++ b/Doc/library/select.rst @@ -0,0 +1,141 @@ + +:mod:`select` --- Waiting for I/O completion +============================================ + +.. module:: select + :synopsis: Wait for I/O completion on multiple streams. + + +This module provides access to the :cfunc:`select` and :cfunc:`poll` functions +available in most operating systems. Note that on Windows, it only works for +sockets; on other operating systems, it also works for other file types (in +particular, on Unix, it works on pipes). 
It cannot be used on regular files to +determine whether a file has grown since it was last read. + +The module defines the following: + + +.. exception:: error + + The exception raised when an error occurs. The accompanying value is a pair + containing the numeric error code from :cdata:`errno` and the corresponding + string, as would be printed by the C function :cfunc:`perror`. + + +.. function:: poll() + + (Not supported by all operating systems.) Returns a polling object, which + supports registering and unregistering file descriptors, and then polling them + for I/O events; see section :ref:`poll-objects` below for the methods supported + by polling objects. + + +.. function:: select(iwtd, owtd, ewtd[, timeout]) + + This is a straightforward interface to the Unix :cfunc:`select` system call. + The first three arguments are sequences of 'waitable objects': either + integers representing file descriptors or objects with a parameterless method + named :meth:`fileno` returning such an integer. The three sequences of + waitable objects are for input, output and 'exceptional conditions', + respectively. Empty sequences are allowed, but acceptance of three empty + sequences is platform-dependent. (It is known to work on Unix but not on + Windows.) The optional *timeout* argument specifies a time-out as a floating + point number in seconds. When the *timeout* argument is omitted the function + blocks until at least one file descriptor is ready. A time-out value of zero + specifies a poll and never blocks. + + The return value is a triple of lists of objects that are ready: subsets of the + first three arguments. When the time-out is reached without a file descriptor + becoming ready, three empty lists are returned. + + .. index:: + single: socket() (in module socket) + single: popen() (in module os) + + Among the acceptable object types in the sequences are Python file objects (e.g. 
+ ``sys.stdin``, or objects returned by :func:`open` or :func:`os.popen`), socket + objects returned by :func:`socket.socket`. You may also define a :dfn:`wrapper` + class yourself, as long as it has an appropriate :meth:`fileno` method (that + really returns a file descriptor, not just a random integer). + + .. % + + .. note:: + + .. index:: single: WinSock + + File objects on Windows are not acceptable, but sockets are. On Windows, the + underlying :cfunc:`select` function is provided by the WinSock library, and does + not handle file descriptors that don't originate from WinSock. + + +.. _poll-objects: + +Polling Objects +--------------- + +The :cfunc:`poll` system call, supported on most Unix systems, provides better +scalability for network servers that service many, many clients at the same +time. :cfunc:`poll` scales better because the system call only requires listing +the file descriptors of interest, while :cfunc:`select` builds a bitmap, turns +on bits for the fds of interest, and then afterward the whole bitmap has to be +linearly scanned again. :cfunc:`select` is O(highest file descriptor), while +:cfunc:`poll` is O(number of file descriptors). + + +.. method:: poll.register(fd[, eventmask]) + + Register a file descriptor with the polling object. Future calls to the + :meth:`poll` method will then check whether the file descriptor has any pending + I/O events. *fd* can be either an integer, or an object with a :meth:`fileno` + method that returns an integer. File objects implement :meth:`fileno`, so they + can also be used as the argument. + + *eventmask* is an optional bitmask describing the type of events you want to + check for, and can be a combination of the constants :const:`POLLIN`, + :const:`POLLPRI`, and :const:`POLLOUT`, described in the table below. If not + specified, the default value used will check for all 3 types of events. 
+ + +-------------------+------------------------------------------+ + | Constant | Meaning | + +===================+==========================================+ + | :const:`POLLIN` | There is data to read | + +-------------------+------------------------------------------+ + | :const:`POLLPRI` | There is urgent data to read | + +-------------------+------------------------------------------+ + | :const:`POLLOUT` | Ready for output: writing will not block | + +-------------------+------------------------------------------+ + | :const:`POLLERR` | Error condition of some sort | + +-------------------+------------------------------------------+ + | :const:`POLLHUP` | Hung up | + +-------------------+------------------------------------------+ + | :const:`POLLNVAL` | Invalid request: descriptor not open | + +-------------------+------------------------------------------+ + + Registering a file descriptor that's already registered is not an error, and has + the same effect as registering the descriptor exactly once. + + +.. method:: poll.unregister(fd) + + Remove a file descriptor being tracked by a polling object. Just like the + :meth:`register` method, *fd* can be an integer or an object with a + :meth:`fileno` method that returns an integer. + + Attempting to remove a file descriptor that was never registered causes a + :exc:`KeyError` exception to be raised. + + +.. method:: poll.poll([timeout]) + + Polls the set of registered file descriptors, and returns a possibly-empty list + containing ``(fd, event)`` 2-tuples for the descriptors that have events or + errors to report. *fd* is the file descriptor, and *event* is a bitmask with + bits set for the reported events for that descriptor --- :const:`POLLIN` for + waiting input, :const:`POLLOUT` to indicate that the descriptor can be written + to, and so forth. An empty list indicates that the call timed out and no file + descriptors had any events to report. 
If *timeout* is given, it specifies the + length of time in milliseconds which the system will wait for events before + returning. If *timeout* is omitted, negative, or :const:`None`, the call will + block until there is an event for this poll object. + diff --git a/Doc/library/sgmllib.rst b/Doc/library/sgmllib.rst new file mode 100644 index 0000000..c0ef1a2 --- /dev/null +++ b/Doc/library/sgmllib.rst @@ -0,0 +1,270 @@ + +:mod:`sgmllib` --- Simple SGML parser +===================================== + +.. module:: sgmllib + :synopsis: Only as much of an SGML parser as needed to parse HTML. + + +.. index:: single: SGML + +This module defines a class :class:`SGMLParser` which serves as the basis for +parsing text files formatted in SGML (Standard Generalized Mark-up Language). +In fact, it does not provide a full SGML parser --- it only parses SGML insofar +as it is used by HTML, and the module only exists as a base for the +:mod:`htmllib` module. Another HTML parser which supports XHTML and offers a +somewhat different interface is available in the :mod:`HTMLParser` module. + + +.. class:: SGMLParser() + + The :class:`SGMLParser` class is instantiated without arguments. The parser is + hardcoded to recognize the following constructs: + + * Opening and closing tags of the form ``<tag attr="value" ...>`` and + ``</tag>``, respectively. + + * Numeric character references of the form ``&#name;``. + + * Entity references of the form ``&name;``. + + * SGML comments of the form ``<!--text-->``. Note that spaces, tabs, and + newlines are allowed between the trailing ``>`` and the immediately preceding + ``--``. + +A single exception is defined as well: + + +.. exception:: SGMLParseError + + Exception raised by the :class:`SGMLParser` class when it encounters an error + while parsing. + + .. versionadded:: 2.1 + +:class:`SGMLParser` instances have the following methods: + + +.. method:: SGMLParser.reset() + + Reset the instance. Loses all unprocessed data. 
This is called implicitly at + instantiation time. + + +.. method:: SGMLParser.setnomoretags() + + Stop processing tags. Treat all following input as literal input (CDATA). + (This is only provided so the HTML tag ``<PLAINTEXT>`` can be implemented.) + + +.. method:: SGMLParser.setliteral() + + Enter literal mode (CDATA mode). + + +.. method:: SGMLParser.feed(data) + + Feed some text to the parser. It is processed insofar as it consists of + complete elements; incomplete data is buffered until more data is fed or + :meth:`close` is called. + + +.. method:: SGMLParser.close() + + Force processing of all buffered data as if it were followed by an end-of-file + mark. This method may be redefined by a derived class to define additional + processing at the end of the input, but the redefined version should always call + :meth:`close`. + + +.. method:: SGMLParser.get_starttag_text() + + Return the text of the most recently opened start tag. This should not normally + be needed for structured processing, but may be useful in dealing with HTML "as + deployed" or for re-generating input with minimal changes (whitespace between + attributes can be preserved, etc.). + + +.. method:: SGMLParser.handle_starttag(tag, method, attributes) + + This method is called to handle start tags for which either a :meth:`start_tag` + or :meth:`do_tag` method has been defined. The *tag* argument is the name of + the tag converted to lower case, and the *method* argument is the bound method + which should be used to support semantic interpretation of the start tag. The + *attributes* argument is a list of ``(name, value)`` pairs containing the + attributes found inside the tag's ``<>`` brackets. + + The *name* has been translated to lower case. 
Double quotes and backslashes in + the *value* have been interpreted, as well as known character references and + known entity references terminated by a semicolon (normally, entity references + can be terminated by any non-alphanumerical character, but this would break the + very common case of ``<A HREF="url?spam=1&eggs=2">`` when ``eggs`` is a valid + entity name). + + For instance, for the tag ``<A HREF="http://www.cwi.nl/">``, this method would + be called as ``unknown_starttag('a', [('href', 'http://www.cwi.nl/')])``. The + base implementation simply calls *method* with *attributes* as the only + argument. + + .. versionadded:: 2.5 + Handling of entity and character references within attribute values. + + +.. method:: SGMLParser.handle_endtag(tag, method) + + This method is called to handle endtags for which an :meth:`end_tag` method has + been defined. The *tag* argument is the name of the tag converted to lower + case, and the *method* argument is the bound method which should be used to + support semantic interpretation of the end tag. If no :meth:`end_tag` method is + defined for the closing element, this handler is not called. The base + implementation simply calls *method*. + + +.. method:: SGMLParser.handle_data(data) + + This method is called to process arbitrary data. It is intended to be + overridden by a derived class; the base class implementation does nothing. + + +.. method:: SGMLParser.handle_charref(ref) + + This method is called to process a character reference of the form ``&#ref;``. + The base implementation uses :meth:`convert_charref` to convert the reference to + a string. If that method returns a string, it is passed to :meth:`handle_data`, + otherwise ``unknown_charref(ref)`` is called to handle the error. + + .. versionchanged:: 2.5 + Use :meth:`convert_charref` instead of hard-coding the conversion. + + +.. method:: SGMLParser.convert_charref(ref) + + Convert a character reference to a string, or ``None``. 
*ref* is the reference + passed in as a string. In the base implementation, *ref* must be a decimal + number in the range 0-255. It converts the code point found using the + :meth:`convert_codepoint` method. If *ref* is invalid or out of range, this + method returns ``None``. This method is called by the default + :meth:`handle_charref` implementation and by the attribute value parser. + + .. versionadded:: 2.5 + + +.. method:: SGMLParser.convert_codepoint(codepoint) + + Convert a codepoint to a :class:`str` value. Encodings can be handled here if + appropriate, though the rest of :mod:`sgmllib` is oblivious on this matter. + + .. versionadded:: 2.5 + + +.. method:: SGMLParser.handle_entityref(ref) + + This method is called to process a general entity reference of the form + ``&ref;`` where *ref* is a general entity reference. It converts *ref* by + passing it to :meth:`convert_entityref`. If a translation is returned, it calls + the method :meth:`handle_data` with the translation; otherwise, it calls the + method ``unknown_entityref(ref)``. The default :attr:`entitydefs` defines + translations for ``&amp;``, ``&apos;``, ``&gt;``, ``&lt;``, and ``&quot;``. + + .. versionchanged:: 2.5 + Use :meth:`convert_entityref` instead of hard-coding the conversion. + + +.. method:: SGMLParser.convert_entityref(ref) + + Convert a named entity reference to a :class:`str` value, or ``None``. The + resulting value will not be parsed. *ref* will be only the name of the entity. + The default implementation looks for *ref* in the instance (or class) variable + :attr:`entitydefs` which should be a mapping from entity names to corresponding + translations. If no translation is available for *ref*, this method returns + ``None``. This method is called by the default :meth:`handle_entityref` + implementation and by the attribute value parser. + + .. versionadded:: 2.5 + + +.. method:: SGMLParser.handle_comment(comment) + + This method is called when a comment is encountered.
The *comment* argument is + a string containing the text between the ``<!--`` and ``-->`` delimiters, but + not the delimiters themselves. For example, the comment ``<!--text-->`` will + cause this method to be called with the argument ``'text'``. The default method + does nothing. + + +.. method:: SGMLParser.handle_decl(data) + + Method called when an SGML declaration is read by the parser. In practice, the + ``DOCTYPE`` declaration is the only thing observed in HTML, but the parser does + not discriminate among different (or broken) declarations. Internal subsets in + a ``DOCTYPE`` declaration are not supported. The *data* parameter will be the + entire contents of the declaration inside the ``<!``...\ ``>`` markup. The + default implementation does nothing. + + +.. method:: SGMLParser.report_unbalanced(tag) + + This method is called when an end tag is found which does not correspond to any + open element. + + +.. method:: SGMLParser.unknown_starttag(tag, attributes) + + This method is called to process an unknown start tag. It is intended to be + overridden by a derived class; the base class implementation does nothing. + + +.. method:: SGMLParser.unknown_endtag(tag) + + This method is called to process an unknown end tag. It is intended to be + overridden by a derived class; the base class implementation does nothing. + + +.. method:: SGMLParser.unknown_charref(ref) + + This method is called to process unresolvable numeric character references. + Refer to :meth:`handle_charref` to determine what is handled by default. It is + intended to be overridden by a derived class; the base class implementation does + nothing. + + +.. method:: SGMLParser.unknown_entityref(ref) + + This method is called to process an unknown entity reference. It is intended to + be overridden by a derived class; the base class implementation does nothing. 
+ +Apart from overriding or extending the methods listed above, derived classes may +also define methods of the following form to define processing of specific tags. +Tag names in the input stream are case independent; the *tag* occurring in +method names must be in lower case: + + +.. method:: SGMLParser.start_tag(attributes) + :noindex: + + This method is called to process an opening tag *tag*. It has preference over + :meth:`do_tag`. The *attributes* argument has the same meaning as described for + :meth:`handle_starttag` above. + + +.. method:: SGMLParser.do_tag(attributes) + :noindex: + + This method is called to process an opening tag *tag* for which no + :meth:`start_tag` method is defined. The *attributes* argument has the same + meaning as described for :meth:`handle_starttag` above. + + +.. method:: SGMLParser.end_tag() + :noindex: + + This method is called to process a closing tag *tag*. + +Note that the parser maintains a stack of open elements for which no end tag has +been found yet. Only tags processed by :meth:`start_tag` are pushed on this +stack. Definition of an :meth:`end_tag` method is optional for these tags. For +tags processed by :meth:`do_tag` or by :meth:`unknown_tag`, no :meth:`end_tag` +method must be defined; if defined, it will not be used. If both +:meth:`start_tag` and :meth:`do_tag` methods exist for a tag, the +:meth:`start_tag` method takes precedence. + diff --git a/Doc/library/shelve.rst b/Doc/library/shelve.rst new file mode 100644 index 0000000..1776b7d --- /dev/null +++ b/Doc/library/shelve.rst @@ -0,0 +1,185 @@ + +:mod:`shelve` --- Python object persistence +=========================================== + +.. module:: shelve + :synopsis: Python object persistence. + + +.. index:: module: pickle + +A "shelf" is a persistent, dictionary-like object. The difference with "dbm" +databases is that the values (not the keys!) in a shelf can be essentially +arbitrary Python objects --- anything that the :mod:`pickle` module can handle. 
+This includes most class instances, recursive data types, and objects containing +lots of shared sub-objects. The keys are ordinary strings. + + +.. function:: open(filename[, flag='c'[, protocol=None[, writeback=False]]]) + + Open a persistent dictionary. The filename specified is the base filename for + the underlying database. As a side-effect, an extension may be added to the + filename and more than one file may be created. By default, the underlying + database file is opened for reading and writing. The optional *flag* parameter + has the same interpretation as the *flag* parameter of :func:`anydbm.open`. + + By default, version 0 pickles are used to serialize values. The version of the + pickle protocol can be specified with the *protocol* parameter. + + .. versionchanged:: 2.3 + The *protocol* parameter was added. + + By default, mutations to persistent-dictionary mutable entries are not + automatically written back. If the optional *writeback* parameter is set to + *True*, all entries accessed are cached in memory, and written back at close + time; this can make it handier to mutate mutable entries in the persistent + dictionary, but, if many entries are accessed, it can consume vast amounts of + memory for the cache, and it can make the close operation very slow since all + accessed entries are written back (there is no way to determine which accessed + entries are mutable, nor which ones were actually mutated). + +Shelve objects support all methods supported by dictionaries. This eases the +transition from dictionary based scripts to those requiring persistent storage. + +One additional method is supported: + + +.. method:: Shelf.sync() + + Write back all entries in the cache if the shelf was opened with *writeback* set + to *True*. Also empty the cache and synchronize the persistent dictionary on + disk, if feasible. This is called automatically when the shelf is closed with + :meth:`close`. + + +Restrictions +------------ + + .. 
index:: + module: dbm + module: gdbm + module: bsddb + +* The choice of which database package will be used (such as :mod:`dbm`, + :mod:`gdbm` or :mod:`bsddb`) depends on which interface is available. Therefore + it is not safe to open the database directly using :mod:`dbm`. The database is + also (unfortunately) subject to the limitations of :mod:`dbm`, if it is used --- + this means that (the pickled representation of) the objects stored in the + database should be fairly small, and in rare cases key collisions may cause the + database to refuse updates. + +* Depending on the implementation, closing a persistent dictionary may or may + not be necessary to flush changes to disk. The :meth:`__del__` method of the + :class:`Shelf` class calls the :meth:`close` method, so the programmer generally + need not do this explicitly. + +* The :mod:`shelve` module does not support *concurrent* read/write access to + shelved objects. (Multiple simultaneous read accesses are safe.) When a + program has a shelf open for writing, no other program should have it open for + reading or writing. Unix file locking can be used to solve this, but this + differs across Unix versions and requires knowledge about the database + implementation used. + + +.. class:: Shelf(dict[, protocol=None[, writeback=False]]) + + A subclass of :class:`UserDict.DictMixin` which stores pickled values in the + *dict* object. + + By default, version 0 pickles are used to serialize values. The version of the + pickle protocol can be specified with the *protocol* parameter. See the + :mod:`pickle` documentation for a discussion of the pickle protocols. + + .. versionchanged:: 2.3 + The *protocol* parameter was added. + + If the *writeback* parameter is ``True``, the object will hold a cache of all + entries accessed and write them back to the *dict* at sync and close times. + This allows natural operations on mutable entries, but can consume much more + memory and make sync and close take a long time. + + +.. 
class:: BsdDbShelf(dict[, protocol=None[, writeback=False]]) + + A subclass of :class:`Shelf` which exposes :meth:`first`, :meth:`next`, + :meth:`previous`, :meth:`last` and :meth:`set_location` which are available in + the :mod:`bsddb` module but not in other database modules. The *dict* object + passed to the constructor must support those methods. This is generally + accomplished by calling one of :func:`bsddb.hashopen`, :func:`bsddb.btopen` or + :func:`bsddb.rnopen`. The optional *protocol* and *writeback* parameters have + the same interpretation as for the :class:`Shelf` class. + + +.. class:: DbfilenameShelf(filename[, flag='c'[, protocol=None[, writeback=False]]]) + + A subclass of :class:`Shelf` which accepts a *filename* instead of a dict-like + object. The underlying file will be opened using :func:`anydbm.open`. By + default, the file will be created and opened for both read and write. The + optional *flag* parameter has the same interpretation as for the :func:`open` + function. The optional *protocol* and *writeback* parameters have the same + interpretation as for the :class:`Shelf` class. + + +Example +------- + +To summarize the interface (``key`` is a string, ``data`` is an arbitrary +object):: + + import shelve + + d = shelve.open(filename) # open -- file may get suffix added by low-level + # library + + d[key] = data # store data at key (overwrites old data if + # using an existing key) + data = d[key] # retrieve a COPY of data at key (raise KeyError if no + # such key) + del d[key] # delete data stored at key (raises KeyError + # if no such key) + flag = d.has_key(key) # true if the key exists + klist = d.keys() # a list of all existing keys (slow!) + + # as d was opened WITHOUT writeback=True, beware: + d['xx'] = range(4) # this works as expected, but... + d['xx'].append(5) # *this doesn't!* -- d['xx'] is STILL range(4)!!! 
+ + # having opened d without writeback=True, you need to code carefully: + temp = d['xx'] # extracts the copy + temp.append(5) # mutates the copy + d['xx'] = temp # stores the copy right back, to persist it + + # or, d=shelve.open(filename,writeback=True) would let you just code + # d['xx'].append(5) and have it work as expected, BUT it would also + # consume more memory and make the d.close() operation slower. + + d.close() # close it + + +.. seealso:: + + Module :mod:`anydbm` + Generic interface to ``dbm``\ -style databases. + + Module :mod:`bsddb` + BSD ``db`` database interface. + + Module :mod:`dbhash` + Thin layer around the :mod:`bsddb` which provides an :func:`open` function like + the other database modules. + + Module :mod:`dbm` + Standard Unix database interface. + + Module :mod:`dumbdbm` + Portable implementation of the ``dbm`` interface. + + Module :mod:`gdbm` + GNU database interface, based on the ``dbm`` interface. + + Module :mod:`pickle` + Object serialization used by :mod:`shelve`. + + Module :mod:`cPickle` + High-performance version of :mod:`pickle`. + diff --git a/Doc/library/shlex.rst b/Doc/library/shlex.rst new file mode 100644 index 0000000..0ae77c1 --- /dev/null +++ b/Doc/library/shlex.rst @@ -0,0 +1,307 @@ + +:mod:`shlex` --- Simple lexical analysis +======================================== + +.. module:: shlex + :synopsis: Simple lexical analysis for Unix shell-like languages. +.. moduleauthor:: Eric S. Raymond <esr@snark.thyrsus.com> +.. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com> +.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com> +.. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com> + + +.. versionadded:: 1.5.2 + +The :class:`shlex` class makes it easy to write lexical analyzers for simple +syntaxes resembling that of the Unix shell. This will often be useful for +writing minilanguages, (for example, in run control files for Python +applications) or for parsing quoted strings. + +.. 
note:: + + The :mod:`shlex` module currently does not support Unicode input. + +The :mod:`shlex` module defines the following functions: + + +.. function:: split(s[, comments[, posix]]) + + Split the string *s* using shell-like syntax. If *comments* is :const:`False` + (the default), the parsing of comments in the given string will be disabled + (setting the :attr:`commenters` member of the :class:`shlex` instance to the + empty string). This function operates in POSIX mode by default, but uses + non-POSIX mode if the *posix* argument is false. + + .. versionadded:: 2.3 + + .. versionchanged:: 2.6 + Added the *posix* parameter. + + .. note:: + + Since the :func:`split` function instantiates a :class:`shlex` instance, passing + ``None`` for *s* will read the string to split from standard input. + +The :mod:`shlex` module defines the following class: + + +.. class:: shlex([instream[, infile[, posix]]]) + + A :class:`shlex` instance or subclass instance is a lexical analyzer object. + The initialization argument, if present, specifies where to read characters + from. It must be a file-/stream-like object with :meth:`read` and + :meth:`readline` methods, or a string (strings are accepted since Python 2.3). + If no argument is given, input will be taken from ``sys.stdin``. The second + optional argument is a filename string, which sets the initial value of the + :attr:`infile` member. If the *instream* argument is omitted or equal to + ``sys.stdin``, this second argument defaults to "stdin". The *posix* argument + was introduced in Python 2.3, and defines the operational mode. When *posix* is + not true (default), the :class:`shlex` instance will operate in compatibility + mode. When operating in POSIX mode, :class:`shlex` will try to be as close as + possible to the POSIX shell parsing rules. + + +.. seealso:: + + Module :mod:`ConfigParser` + Parser for configuration files similar to the Windows :file:`.ini` files. + + +.. 
_shlex-objects: + +shlex Objects +------------- + +A :class:`shlex` instance has the following methods: + + +.. method:: shlex.get_token() + + Return a token. If tokens have been stacked using :meth:`push_token`, pop a + token off the stack. Otherwise, read one from the input stream. If reading + encounters an immediate end-of-file, :attr:`self.eof` is returned (the empty + string (``''``) in non-POSIX mode, and ``None`` in POSIX mode). + + +.. method:: shlex.push_token(str) + + Push the argument onto the token stack. + + +.. method:: shlex.read_token() + + Read a raw token. Ignore the pushback stack, and do not interpret source + requests. (This is not ordinarily a useful entry point, and is documented here + only for the sake of completeness.) + + +.. method:: shlex.sourcehook(filename) + + When :class:`shlex` detects a source request (see :attr:`source` below) this + method is given the following token as argument, and expected to return a tuple + consisting of a filename and an open file-like object. + + Normally, this method first strips any quotes off the argument. If the result + is an absolute pathname, or there was no previous source request in effect, or + the previous source was a stream (such as ``sys.stdin``), the result is left + alone. Otherwise, if the result is a relative pathname, the directory part of + the name of the file immediately before it on the source inclusion stack is + prepended (this behavior is like the way the C preprocessor handles ``#include + "file.h"``). + + The result of the manipulations is treated as a filename, and returned as the + first component of the tuple, with :func:`open` called on it to yield the second + component. (Note: this is the reverse of the order of arguments in instance + initialization!) + + This hook is exposed so that you can use it to implement directory search paths, + addition of file extensions, and other namespace hacks. 
There is no + corresponding 'close' hook, but a shlex instance will call the :meth:`close` + method of the sourced input stream when it returns EOF. + + For more explicit control of source stacking, use the :meth:`push_source` and + :meth:`pop_source` methods. + + +.. method:: shlex.push_source(stream[, filename]) + + Push an input source stream onto the input stack. If the filename argument is + specified it will later be available for use in error messages. This is the + same method used internally by the :meth:`sourcehook` method. + + .. versionadded:: 2.1 + + +.. method:: shlex.pop_source() + + Pop the last-pushed input source from the input stack. This is the same method + used internally when the lexer reaches EOF on a stacked input stream. + + .. versionadded:: 2.1 + + +.. method:: shlex.error_leader([file[, line]]) + + This method generates an error message leader in the format of a Unix C compiler + error label; the format is ``'"%s", line %d: '``, where the ``%s`` is replaced + with the name of the current source file and the ``%d`` with the current input + line number (the optional arguments can be used to override these). + + This convenience is provided to encourage :mod:`shlex` users to generate error + messages in the standard, parseable format understood by Emacs and other Unix + tools. + +Instances of :class:`shlex` subclasses have some public instance variables which +either control lexical analysis or can be used for debugging: + + +.. attribute:: shlex.commenters + + The string of characters that are recognized as comment beginners. All + characters from the comment beginner to end of line are ignored. Includes just + ``'#'`` by default. + + +.. attribute:: shlex.wordchars + + The string of characters that will accumulate into multi-character tokens. By + default, includes all ASCII alphanumerics and underscore. + + +.. attribute:: shlex.whitespace + + Characters that will be considered whitespace and skipped. Whitespace bounds + tokens. 
+ By default, includes space, tab, linefeed and carriage-return. + + +.. attribute:: shlex.escape + + Characters that will be considered as escape. This is only used in POSIX + mode, and includes just ``'\'`` by default. + + .. versionadded:: 2.3 + + +.. attribute:: shlex.quotes + + Characters that will be considered string quotes. The token accumulates until + the same quote is encountered again (thus, different quote types protect each + other as in the shell.) By default, includes ASCII single and double quotes. + + +.. attribute:: shlex.escapedquotes + + Characters in :attr:`quotes` that will interpret escape characters defined in + :attr:`escape`. This is only used in POSIX mode, and includes just ``'"'`` by + default. + + .. versionadded:: 2.3 + + +.. attribute:: shlex.whitespace_split + + If ``True``, tokens will only be split on whitespace. This is useful, for + example, for parsing command lines with :class:`shlex`, getting tokens in a + similar way to shell arguments. + + .. versionadded:: 2.3 + + +.. attribute:: shlex.infile + + The name of the current input file, as initially set at class instantiation time + or stacked by later source requests. It may be useful to examine this when + constructing error messages. + + +.. attribute:: shlex.instream + + The input stream from which this :class:`shlex` instance is reading characters. + + +.. attribute:: shlex.source + + This member is ``None`` by default. If you assign a string to it, that string + will be recognized as a lexical-level inclusion request similar to the + ``source`` keyword in various shells. That is, the immediately following token + will be opened as a filename and input taken from that stream until EOF, at which + point the :meth:`close` method of that stream will be called and the input + source will again become the original input stream. Source requests may be + stacked any number of levels deep. + + +.. 
attribute:: shlex.debug + + If this member is numeric and ``1`` or more, a :class:`shlex` instance will + print verbose progress output on its behavior. If you need to use this, you can + read the module source code to learn the details. + + +.. attribute:: shlex.lineno + + Source line number (count of newlines seen so far plus one). + + +.. attribute:: shlex.token + + The token buffer. It may be useful to examine this when catching exceptions. + + +.. attribute:: shlex.eof + + Token used to determine end of file. This will be set to the empty string + (``''``) in non-POSIX mode, and to ``None`` in POSIX mode. + + .. versionadded:: 2.3 + + +.. _shlex-parsing-rules: + +Parsing Rules +------------- + +When operating in non-POSIX mode, :class:`shlex` will try to obey the +following rules. + +* Quote characters are not recognized within words (``Do"Not"Separate`` is + parsed as the single word ``Do"Not"Separate``); + +* Escape characters are not recognized; + +* Enclosing characters in quotes preserve the literal value of all characters + within the quotes; + +* Closing quotes separate words (``"Do"Separate`` is parsed as ``"Do"`` and + ``Separate``); + +* If :attr:`whitespace_split` is ``False``, any character not declared to be a + word character, whitespace, or a quote will be returned as a single-character + token. If it is ``True``, :class:`shlex` will only split words on whitespace; + +* EOF is signaled with an empty string (``''``); + +* It's not possible to parse empty strings, even if quoted. + +When operating in POSIX mode, :class:`shlex` will try to obey the following +parsing rules. + +* Quotes are stripped out, and do not separate words (``"Do"Not"Separate"`` is + parsed as the single word ``DoNotSeparate``); + +* Non-quoted escape characters (e.g. ``'\'``) preserve the literal value of the + next character that follows; + +* Enclosing characters in quotes which are not part of :attr:`escapedquotes` + (e.g. 
``"'"``) preserve the literal value of all characters within the quotes; + +* Enclosing characters in quotes which are part of :attr:`escapedquotes` (e.g. + ``'"'``) preserve the literal value of all characters within the quotes, with + the exception of the characters mentioned in :attr:`escape`. The escape + characters retain their special meaning only when followed by the quote in use, or + the escape character itself. Otherwise the escape character will be considered a + normal character. + +* EOF is signaled with a :const:`None` value; + +* Quoted empty strings (``''``) are allowed. + diff --git a/Doc/library/shutil.rst b/Doc/library/shutil.rst new file mode 100644 index 0000000..ef0758d --- /dev/null +++ b/Doc/library/shutil.rst @@ -0,0 +1,171 @@ + +:mod:`shutil` --- High-level file operations +============================================ + +.. module:: shutil + :synopsis: High-level file operations, including copying. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. % partly based on the docstrings + +.. index:: + single: file; copying + single: copying files + +The :mod:`shutil` module offers a number of high-level operations on files and +collections of files. In particular, functions are provided which support file +copying and removal. + +**Caveat:** On MacOS, the resource fork and other metadata are not used. For +file copies, this means that resources will be lost and file type and creator +codes will not be correct. + + +.. function:: copyfile(src, dst) + + Copy the contents of the file named *src* to a file named *dst*. The + destination location must be writable; otherwise, an :exc:`IOError` exception + will be raised. If *dst* already exists, it will be replaced. Special files + such as character or block devices and pipes cannot be copied with this + function. *src* and *dst* are path names given as strings. + + +.. 
function:: copyfileobj(fsrc, fdst[, length]) + + Copy the contents of the file-like object *fsrc* to the file-like object *fdst*. + The integer *length*, if given, is the buffer size. In particular, a negative + *length* value means to copy the data without looping over the source data in + chunks; by default the data is read in chunks to avoid uncontrolled memory + consumption. Note that if the current file position of the *fsrc* object is not + 0, only the contents from the current file position to the end of the file will + be copied. + + +.. function:: copymode(src, dst) + + Copy the permission bits from *src* to *dst*. The file contents, owner, and + group are unaffected. *src* and *dst* are path names given as strings. + + +.. function:: copystat(src, dst) + + Copy the permission bits, last access time, last modification time, and flags + from *src* to *dst*. The file contents, owner, and group are unaffected. *src* + and *dst* are path names given as strings. + + +.. function:: copy(src, dst) + + Copy the file *src* to the file or directory *dst*. If *dst* is a directory, a + file with the same basename as *src* is created (or overwritten) in the + directory specified. Permission bits are copied. *src* and *dst* are path + names given as strings. + + +.. function:: copy2(src, dst) + + Similar to :func:`copy`, but last access time and last modification time are + copied as well. This is similar to the Unix command :program:`cp -p`. + + +.. function:: copytree(src, dst[, symlinks]) + + Recursively copy an entire directory tree rooted at *src*. The destination + directory, named by *dst*, must not already exist; it will be created as well as + missing parent directories. Permissions and times of directories are copied with + :func:`copystat`, individual files are copied using :func:`copy2`. 
If + *symlinks* is true, symbolic links in the source tree are represented as + symbolic links in the new tree; if false or omitted, the contents of the linked + files are copied to the new tree. If any exceptions occur, an :exc:`Error` is + raised with a list of reasons. + + The source code for this should be considered an example rather than a tool. + + .. versionchanged:: 2.3 + :exc:`Error` is raised if any exceptions occur during copying, rather than + printing a message. + + .. versionchanged:: 2.5 + Create intermediate directories needed to create *dst*, rather than raising an + error. Copy permissions and times of directories using :func:`copystat`. + + +.. function:: rmtree(path[, ignore_errors[, onerror]]) + + .. index:: single: directory; deleting + + Delete an entire directory tree (*path* must point to a directory). If + *ignore_errors* is true, errors resulting from failed removals will be ignored; + if false or omitted, such errors are handled by calling a handler specified by + *onerror* or, if that is omitted, they raise an exception. + + If *onerror* is provided, it must be a callable that accepts three parameters: + *function*, *path*, and *excinfo*. The first parameter, *function*, is the + function which raised the exception; it will be :func:`os.listdir`, + :func:`os.remove` or :func:`os.rmdir`. The second parameter, *path*, will be + the path name passed to *function*. The third parameter, *excinfo*, will be the + exception information returned by :func:`sys.exc_info`. Exceptions raised by + *onerror* will not be caught. + + +.. function:: move(src, dst) + + Recursively move a file or directory to another location. + + If the destination is on the current filesystem, *src* is simply renamed. + Otherwise, *src* is copied to *dst* and then removed. + + .. versionadded:: 2.3 + + +.. exception:: Error + + This exception collects exceptions that are raised during a multi-file operation. 
For + :func:`copytree`, the exception argument is a list of 3-tuples (*srcname*, + *dstname*, *exception*). + + .. versionadded:: 2.3 + + +.. _shutil-example: + +Example +------- + +This example is the implementation of the :func:`copytree` function, described +above, with the docstring omitted. It demonstrates many of the other functions +provided by this module. :: + + def copytree(src, dst, symlinks=False): + names = os.listdir(src) + os.makedirs(dst) + errors = [] + for name in names: + srcname = os.path.join(src, name) + dstname = os.path.join(dst, name) + try: + if symlinks and os.path.islink(srcname): + linkto = os.readlink(srcname) + os.symlink(linkto, dstname) + elif os.path.isdir(srcname): + copytree(srcname, dstname, symlinks) + else: + copy2(srcname, dstname) + # XXX What about devices, sockets etc.? + except (IOError, os.error) as why: + errors.append((srcname, dstname, str(why))) + # catch the Error from the recursive copytree so that we can + # continue with other files + except Error as err: + errors.extend(err.args[0]) + try: + copystat(src, dst) + except WindowsError: + # can't copy file access times on Windows + pass + except OSError as why: + errors.append((src, dst, str(why))) + if errors: + raise Error(errors) + diff --git a/Doc/library/signal.rst b/Doc/library/signal.rst new file mode 100644 index 0000000..54cce53 --- /dev/null +++ b/Doc/library/signal.rst @@ -0,0 +1,157 @@ + +:mod:`signal` --- Set handlers for asynchronous events +====================================================== + +.. module:: signal + :synopsis: Set handlers for asynchronous events. + + +This module provides mechanisms to use signal handlers in Python. 
Some general +rules for working with signals and their handlers: + +* A handler for a particular signal, once set, remains installed until it is + explicitly reset (Python emulates the BSD style interface regardless of the + underlying implementation), with the exception of the handler for + :const:`SIGCHLD`, which follows the underlying implementation. + +* There is no way to "block" signals temporarily from critical sections (since + this is not supported by all Unix flavors). + +* Although Python signal handlers are called asynchronously as far as the Python + user is concerned, they can only occur between the "atomic" instructions of the + Python interpreter. This means that signals arriving during long calculations + implemented purely in C (such as regular expression matches on large bodies of + text) may be delayed for an arbitrary amount of time. + +* When a signal arrives during an I/O operation, it is possible that the I/O + operation raises an exception after the signal handler returns. This is + dependent on the underlying Unix system's semantics regarding interrupted system + calls. + +* Because the C signal handler always returns, it makes little sense to catch + synchronous errors like :const:`SIGFPE` or :const:`SIGSEGV`. + +* Python installs a small number of signal handlers by default: :const:`SIGPIPE` + is ignored (so write errors on pipes and sockets can be reported as ordinary + Python exceptions) and :const:`SIGINT` is translated into a + :exc:`KeyboardInterrupt` exception. All of these can be overridden. + +* Some care must be taken if both signals and threads are used in the same + program. The fundamental thing to remember in using signals and threads + simultaneously is: always perform :func:`signal` operations in the main thread + of execution. 
Any thread can perform an :func:`alarm`, :func:`getsignal`, or + :func:`pause`; only the main thread can set a new signal handler, and the main + thread will be the only one to receive signals (this is enforced by the Python + :mod:`signal` module, even if the underlying thread implementation supports + sending signals to individual threads). This means that signals can't be used + as a means of inter-thread communication. Use locks instead. + +The variables defined in the :mod:`signal` module are: + + +.. data:: SIG_DFL + + This is one of two standard signal handling options; it will simply perform the + default function for the signal. For example, on most systems the default + action for :const:`SIGQUIT` is to dump core and exit, while the default action + for :const:`SIGCLD` is to simply ignore it. + + +.. data:: SIG_IGN + + This is another standard signal handler, which will simply ignore the given + signal. + + +.. data:: SIG* + + All the signal numbers are defined symbolically. For example, the hangup signal + is defined as :const:`signal.SIGHUP`; the variable names are identical to the + names used in C programs, as found in ``<signal.h>``. The Unix man page for + ':cfunc:`signal`' lists the existing signals (on some systems this is + :manpage:`signal(2)`, on others the list is in :manpage:`signal(7)`). Note that + not all systems define the same set of signal names; only those names defined by + the system are defined by this module. + + +.. data:: NSIG + + One more than the number of the highest signal number. + +The :mod:`signal` module defines the following functions: + + +.. function:: alarm(time) + + If *time* is non-zero, this function requests that a :const:`SIGALRM` signal be + sent to the process in *time* seconds. Any previously scheduled alarm is + canceled (only one alarm can be scheduled at any time). The returned value is + then the number of seconds before any previously set alarm was to have been + delivered. 
If *time* is zero, no alarm is scheduled, and any scheduled alarm is + canceled. If the return value is zero, no alarm is currently scheduled. (See + the Unix man page :manpage:`alarm(2)`.) Availability: Unix. + + +.. function:: getsignal(signalnum) + + Return the current signal handler for the signal *signalnum*. The returned value + may be a callable Python object, or one of the special values + :const:`signal.SIG_IGN`, :const:`signal.SIG_DFL` or :const:`None`. Here, + :const:`signal.SIG_IGN` means that the signal was previously ignored, + :const:`signal.SIG_DFL` means that the default way of handling the signal was + previously in use, and ``None`` means that the previous signal handler was not + installed from Python. + + +.. function:: pause() + + Cause the process to sleep until a signal is received; the appropriate handler + will then be called. Returns nothing. Not on Windows. (See the Unix man page + :manpage:`signal(2)`.) + + +.. function:: signal(signalnum, handler) + + Set the handler for signal *signalnum* to the function *handler*. *handler* can + be a callable Python object taking two arguments (see below), or one of the + special values :const:`signal.SIG_IGN` or :const:`signal.SIG_DFL`. The previous + signal handler will be returned (see the description of :func:`getsignal` + above). (See the Unix man page :manpage:`signal(2)`.) + + When threads are enabled, this function can only be called from the main thread; + attempting to call it from other threads will cause a :exc:`ValueError` + exception to be raised. + + The *handler* is called with two arguments: the signal number and the current + stack frame (``None`` or a frame object; for a description of frame objects, see + the reference manual section on the standard type hierarchy or see the attribute + descriptions in the :mod:`inspect` module). + + +.. _signal-example: + +Example +------- + +Here is a minimal example program. 
It uses the :func:`alarm` function to limit +the time spent waiting to open a file; this is useful if the file is for a +serial device that may not be turned on, which would normally cause the +:func:`os.open` to hang indefinitely. The solution is to set a 5-second alarm +before opening the file; if the operation takes too long, the alarm signal will +be sent, and the handler raises an exception. :: + + import signal, os + + def handler(signum, frame): + print('Signal handler called with signal %d' % signum) + raise IOError("Couldn't open device!") + + # Set the signal handler and a 5-second alarm + signal.signal(signal.SIGALRM, handler) + signal.alarm(5) + + # This open() may hang indefinitely + fd = os.open('/dev/ttyS0', os.O_RDWR) + + signal.alarm(0) # Disable the alarm + diff --git a/Doc/library/simplehttpserver.rst b/Doc/library/simplehttpserver.rst new file mode 100644 index 0000000..766253e --- /dev/null +++ b/Doc/library/simplehttpserver.rst @@ -0,0 +1,86 @@ + +:mod:`SimpleHTTPServer` --- Simple HTTP request handler +======================================================= + +.. module:: SimpleHTTPServer + :synopsis: This module provides a basic request handler for HTTP servers. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`SimpleHTTPServer` module defines a request-handler class, +interface-compatible with :class:`BaseHTTPServer.BaseHTTPRequestHandler`, that +serves files only from a base directory. + +The :mod:`SimpleHTTPServer` module defines the following class: + + +.. class:: SimpleHTTPRequestHandler(request, client_address, server) + + This class is used to serve files from the current directory and below, directly + mapping the directory structure to HTTP requests. + + A lot of the work, such as parsing the request, is done by the base class + :class:`BaseHTTPServer.BaseHTTPRequestHandler`. This class implements the + :func:`do_GET` and :func:`do_HEAD` functions. 
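The handler class is normally not instantiated directly; it is passed to a server, which creates one instance per request. The following is a minimal sketch of that wiring, not a definitive recipe. The ``make_server`` helper, the port number and the loopback address are illustrative assumptions, and the ``except ImportError`` branch assumes a Python 3 interpreter, where both classes live in ``http.server``.

```python
try:
    # Module names as used in this document.
    from BaseHTTPServer import HTTPServer
    from SimpleHTTPServer import SimpleHTTPRequestHandler
except ImportError:
    # Assumption: running on Python 3, where both classes moved to http.server.
    from http.server import HTTPServer, SimpleHTTPRequestHandler


def make_server(port=8000, host='127.0.0.1'):
    """Bind a server whose handler serves the current directory and below.

    make_server is a hypothetical helper: it binds the listening socket but
    does not start handling requests.
    """
    return HTTPServer((host, port), SimpleHTTPRequestHandler)
```

Calling ``make_server().serve_forever()`` would then answer ``GET`` and ``HEAD`` requests until the process is interrupted. Binding to the loopback address keeps the sketch from exposing the working directory to other hosts.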
+ +The :class:`SimpleHTTPRequestHandler` defines the following member variables: + + +.. attribute:: SimpleHTTPRequestHandler.server_version + + This will be ``"SimpleHTTP/" + __version__``, where ``__version__`` is defined + in the module. + + +.. attribute:: SimpleHTTPRequestHandler.extensions_map + + A dictionary mapping suffixes into MIME types. The default is signified by an + empty string, and is considered to be ``application/octet-stream``. The mapping + is used case-insensitively, and so should contain only lower-cased keys. + +The :class:`SimpleHTTPRequestHandler` defines the following methods: + + +.. method:: SimpleHTTPRequestHandler.do_HEAD() + + This method serves the ``'HEAD'`` request type: it sends the headers it would + send for the equivalent ``GET`` request. See the :meth:`do_GET` method for a + more complete explanation of the possible headers. + + +.. method:: SimpleHTTPRequestHandler.do_GET() + + The request is mapped to a local file by interpreting the request as a path + relative to the current working directory. + + If the request was mapped to a directory, the directory is checked for a file + named ``index.html`` or ``index.htm`` (in that order). If found, the file's + contents are returned; otherwise a directory listing is generated by calling the + :meth:`list_directory` method. This method uses :func:`os.listdir` to scan the + directory, and returns a ``404`` error response if the :func:`listdir` fails. + + If the request was mapped to a file, it is opened and the contents are returned. + Any :exc:`IOError` exception in opening the requested file is mapped to a + ``404``, ``'File not found'`` error. Otherwise, the content type is guessed by + calling the :meth:`guess_type` method, which in turn uses the *extensions_map* + variable. 
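The suffix lookup described above can be illustrated without starting a server. The function below is a simplified sketch of that behaviour in Python 3 syntax, not the module's own :meth:`guess_type` code; the name ``guess_type_sketch`` and the sample mapping are assumptions made for the example.

```python
import os.path


def guess_type_sketch(path, extensions_map):
    """Return the MIME type for *path* using a suffix -> type mapping.

    The suffix is lower-cased before the lookup, and the empty-string key
    supplies the fallback type.
    """
    suffix = os.path.splitext(path)[1].lower()
    if suffix in extensions_map:
        return extensions_map[suffix]
    return extensions_map[""]


# Hypothetical sample mapping; it contains lower-cased keys only.
sample_map = {
    "": "application/octet-stream",
    ".html": "text/html",
    ".txt": "text/plain",
}

html_type = guess_type_sketch("INDEX.HTML", sample_map)
fallback_type = guess_type_sketch("archive.bin", sample_map)
print(html_type)
print(fallback_type)
```

With this sample mapping the upper-case name still resolves to ``text/html``, because the suffix is lower-cased first, while the unknown ``.bin`` suffix falls back to the empty-string entry.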
+ + A ``'Content-type:'`` header with the guessed content type is output, followed + by a ``'Content-Length:'`` header with the file's size and a + ``'Last-Modified:'`` header with the file's modification time. + + Then follows a blank line signifying the end of the headers, and then the + contents of the file are output. If the file's MIME type starts with ``text/`` + the file is opened in text mode; otherwise binary mode is used. + + For example usage, see the implementation of the :func:`test` function. + + .. versionadded:: 2.5 + The ``'Last-Modified'`` header. + + +.. seealso:: + + Module :mod:`BaseHTTPServer` + Base class implementation for Web server and request handler. + diff --git a/Doc/library/simplexmlrpcserver.rst b/Doc/library/simplexmlrpcserver.rst new file mode 100644 index 0000000..51ce8d8 --- /dev/null +++ b/Doc/library/simplexmlrpcserver.rst @@ -0,0 +1,232 @@ + +:mod:`SimpleXMLRPCServer` --- Basic XML-RPC server +================================================== + +.. module:: SimpleXMLRPCServer + :synopsis: Basic XML-RPC server implementation. +.. moduleauthor:: Brian Quinlan <brianq@activestate.com> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. versionadded:: 2.2 + +The :mod:`SimpleXMLRPCServer` module provides a basic server framework for +XML-RPC servers written in Python. Servers can either be free standing, using +:class:`SimpleXMLRPCServer`, or embedded in a CGI environment, using +:class:`CGIXMLRPCRequestHandler`. + + +.. class:: SimpleXMLRPCServer(addr[, requestHandler[, logRequests[, allow_none[, encoding]]]]) + + Create a new server instance. This class provides methods for registration of + functions that can be called by the XML-RPC protocol. The *requestHandler* + parameter should be a factory for request handler instances; it defaults to + :class:`SimpleXMLRPCRequestHandler`. The *addr* and *requestHandler* parameters + are passed to the :class:`SocketServer.TCPServer` constructor. 
If *logRequests* + is true (the default), requests will be logged; setting this parameter to false + will turn off logging. The *allow_none* and *encoding* parameters are passed + on to :mod:`xmlrpclib` and control the XML-RPC responses that will be returned + from the server. The *bind_and_activate* parameter controls whether + :meth:`server_bind` and :meth:`server_activate` are called immediately by the + constructor; it defaults to true. Setting it to false allows code to manipulate + the *allow_reuse_address* class variable before the address is bound. + + .. versionchanged:: 2.5 + The *allow_none* and *encoding* parameters were added. + + .. versionchanged:: 2.6 + The *bind_and_activate* parameter was added. + + +.. class:: CGIXMLRPCRequestHandler([allow_none[, encoding]]) + + Create a new instance to handle XML-RPC requests in a CGI environment. The + *allow_none* and *encoding* parameters are passed on to :mod:`xmlrpclib` and + control the XML-RPC responses that will be returned from the server. + + .. versionadded:: 2.3 + + .. versionchanged:: 2.5 + The *allow_none* and *encoding* parameters were added. + + +.. class:: SimpleXMLRPCRequestHandler() + + Create a new request handler instance. This request handler supports ``POST`` + requests and modifies logging so that the *logRequests* parameter to the + :class:`SimpleXMLRPCServer` constructor parameter is honored. + + +.. _simple-xmlrpc-servers: + +SimpleXMLRPCServer Objects +-------------------------- + +The :class:`SimpleXMLRPCServer` class is based on +:class:`SocketServer.TCPServer` and provides a means of creating simple, stand +alone XML-RPC servers. + + +.. method:: SimpleXMLRPCServer.register_function(function[, name]) + + Register a function that can respond to XML-RPC requests. If *name* is given, + it will be the method name associated with *function*, otherwise + ``function.__name__`` will be used. 
*name* can be either a normal or Unicode + string, and may contain characters not legal in Python identifiers, including + the period character. + + +.. method:: SimpleXMLRPCServer.register_instance(instance[, allow_dotted_names]) + + Register an object which is used to expose method names which have not been + registered using :meth:`register_function`. If *instance* contains a + :meth:`_dispatch` method, it is called with the requested method name and the + parameters from the request. Its API is ``def _dispatch(self, method, params)`` + (note that *params* does not represent a variable argument list). If it calls + an underlying function to perform its task, that function is called as + ``func(*params)``, expanding the parameter list. The return value from + :meth:`_dispatch` is returned to the client as the result. If *instance* does + not have a :meth:`_dispatch` method, it is searched for an attribute matching + the name of the requested method. + + If the optional *allow_dotted_names* argument is true and the instance does not + have a :meth:`_dispatch` method, then if the requested method name contains + periods, each component of the method name is searched for individually, with + the effect that a simple hierarchical search is performed. The value found from + this search is then called with the parameters from the request, and the return + value is passed back to the client. + + .. warning:: + + Enabling the *allow_dotted_names* option allows intruders to access your + module's global variables and may allow intruders to execute arbitrary code on + your machine. Only use this option on a secure, closed network. + + .. versionchanged:: 2.3.5, 2.4.1 + *allow_dotted_names* was added to plug a security hole; prior versions are + insecure. + + +.. method:: SimpleXMLRPCServer.register_introspection_functions() + + Registers the XML-RPC introspection functions ``system.listMethods``, + ``system.methodHelp`` and ``system.methodSignature``. + + .. 
versionadded:: 2.3 + + +.. method:: SimpleXMLRPCServer.register_multicall_functions() + + Registers the XML-RPC multicall function system.multicall. + + +.. attribute:: SimpleXMLRPCServer.rpc_paths + + An attribute value that must be a tuple listing valid path portions of the URL + for receiving XML-RPC requests. Requests posted to other paths will result in a + 404 "no such page" HTTP error. If this tuple is empty, all paths will be + considered valid. The default value is ``('/', '/RPC2')``. + + .. versionadded:: 2.5 + +Example:: + + from SimpleXMLRPCServer import SimpleXMLRPCServer + + # Create server + server = SimpleXMLRPCServer(("localhost", 8000)) + server.register_introspection_functions() + + # Register pow() function; this will use the value of + # pow.__name__ as the name, which is just 'pow'. + server.register_function(pow) + + # Register a function under a different name + def adder_function(x,y): + return x + y + server.register_function(adder_function, 'add') + + # Register an instance; all the methods of the instance are + # published as XML-RPC methods (in this case, just 'div'). + class MyFuncs: + def div(self, x, y): + return x // y + + server.register_instance(MyFuncs()) + + # Run the server's main loop + server.serve_forever() + +The following client code will call the methods made available by the preceding +server:: + + import xmlrpclib + + s = xmlrpclib.Server('http://localhost:8000') + print s.pow(2,3) # Returns 2**3 = 8 + print s.add(2,3) # Returns 5 + print s.div(5,2) # Returns 5//2 = 2 + + # Print list of available methods + print s.system.listMethods() + + +CGIXMLRPCRequestHandler +----------------------- + +The :class:`CGIXMLRPCRequestHandler` class can be used to handle XML-RPC +requests sent to Python CGI scripts. + + +.. method:: CGIXMLRPCRequestHandler.register_function(function[, name]) + + Register a function that can respond to XML-RPC requests. 
If *name* is given,
+   it will be the method name associated with *function*; otherwise
+   ``function.__name__`` will be used. *name* can be either a normal or Unicode
+   string, and may contain characters not legal in Python identifiers, including
+   the period character.
+
+
+.. method:: CGIXMLRPCRequestHandler.register_instance(instance)
+
+   Register an object which is used to expose method names which have not been
+   registered using :meth:`register_function`. If *instance* contains a
+   :meth:`_dispatch` method, it is called with the requested method name and the
+   parameters from the request; the return value is returned to the client as the
+   result. If *instance* does not have a :meth:`_dispatch` method, it is searched
+   for an attribute matching the name of the requested method; if the requested
+   method name contains periods, each component of the method name is searched for
+   individually, with the effect that a simple hierarchical search is performed.
+   The value found from this search is then called with the parameters from the
+   request, and the return value is passed back to the client.
+
+
+.. method:: CGIXMLRPCRequestHandler.register_introspection_functions()
+
+   Register the XML-RPC introspection functions ``system.listMethods``,
+   ``system.methodHelp`` and ``system.methodSignature``.
+
+
+.. method:: CGIXMLRPCRequestHandler.register_multicall_functions()
+
+   Register the XML-RPC multicall function ``system.multicall``.
+
+
+.. method:: CGIXMLRPCRequestHandler.handle_request([request_text=None])
+
+   Handle an XML-RPC request. If *request_text* is given, it should be the POST
+   data provided by the HTTP server; otherwise the contents of stdin will be used.
+ +Example:: + + class MyFuncs: + def div(self, x, y) : return x // y + + + handler = CGIXMLRPCRequestHandler() + handler.register_function(pow) + handler.register_function(lambda x,y: x+y, 'add') + handler.register_introspection_functions() + handler.register_instance(MyFuncs()) + handler.handle_request() + diff --git a/Doc/library/site.rst b/Doc/library/site.rst new file mode 100644 index 0000000..4e54900 --- /dev/null +++ b/Doc/library/site.rst @@ -0,0 +1,87 @@ + +:mod:`site` --- Site-specific configuration hook +================================================ + +.. module:: site + :synopsis: A standard way to reference site-specific modules. + + +**This module is automatically imported during initialization.** The automatic +import can be suppressed using the interpreter's :option:`-S` option. + +.. index:: triple: module; search; path + +Importing this module will append site-specific paths to the module search path. + +.. index:: + pair: site-python; directory + pair: site-packages; directory + +It starts by constructing up to four directories from a head and a tail part. +For the head part, it uses ``sys.prefix`` and ``sys.exec_prefix``; empty heads +are skipped. For the tail part, it uses the empty string and then +:file:`lib/site-packages` (on Windows) or +:file:`lib/python|version|/site-packages` and then :file:`lib/site-python` (on +Unix and Macintosh). For each of the distinct head-tail combinations, it sees +if it refers to an existing directory, and if so, adds it to ``sys.path`` and +also inspects the newly added path for configuration files. + +A path configuration file is a file whose name has the form :file:`package.pth` +and exists in one of the four directories mentioned above; its contents are +additional items (one per line) to be added to ``sys.path``. Non-existing items +are never added to ``sys.path``, but no check is made that the item refers to a +directory (rather than a file). No item is added to ``sys.path`` more than +once. 
Blank lines and lines beginning with ``#`` are skipped. Lines starting +with ``import`` (followed by space or tab) are executed. + +.. versionchanged:: 2.6 + A space or tab is now required after the import keyword. + +.. index:: + single: package + triple: path; configuration; file + +For example, suppose ``sys.prefix`` and ``sys.exec_prefix`` are set to +:file:`/usr/local`. The Python X.Y library is then installed in +:file:`/usr/local/lib/python{X.Y}` (where only the first three characters of +``sys.version`` are used to form the installation path name). Suppose this has +a subdirectory :file:`/usr/local/lib/python{X.Y}/site-packages` with three +subsubdirectories, :file:`foo`, :file:`bar` and :file:`spam`, and two path +configuration files, :file:`foo.pth` and :file:`bar.pth`. Assume +:file:`foo.pth` contains the following:: + + # foo package configuration + + foo + bar + bletch + +and :file:`bar.pth` contains:: + + # bar package configuration + + bar + +Then the following directories are added to ``sys.path``, in this order:: + + /usr/local/lib/python2.3/site-packages/bar + /usr/local/lib/python2.3/site-packages/foo + +Note that :file:`bletch` is omitted because it doesn't exist; the :file:`bar` +directory precedes the :file:`foo` directory because :file:`bar.pth` comes +alphabetically before :file:`foo.pth`; and :file:`spam` is omitted because it is +not mentioned in either path configuration file. + +.. index:: module: sitecustomize + +After these path manipulations, an attempt is made to import a module named +:mod:`sitecustomize`, which can perform arbitrary site-specific customizations. +If this import fails with an :exc:`ImportError` exception, it is silently +ignored. + +.. index:: module: sitecustomize + +Note that for some non-Unix systems, ``sys.prefix`` and ``sys.exec_prefix`` are +empty, and the path manipulations are skipped; however the import of +:mod:`sitecustomize` is still attempted. 
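The path-configuration rules described above can be tried out in a scratch directory. The following is a minimal sketch in Python 3 syntax, assuming the standard :func:`site.addsitedir` function is used to process the ``.pth`` files; the temporary directory and the helper name ``demo_pth_processing`` are illustrative only and are not part of the :mod:`site` module.

```python
import os
import shutil
import site
import sys
import tempfile


def demo_pth_processing():
    """Recreate the foo/bar/spam layout and return the directory names added."""
    root = tempfile.mkdtemp()
    saved_path = list(sys.path)
    try:
        # Three subdirectories; 'bletch' is deliberately never created.
        for name in ("foo", "bar", "spam"):
            os.mkdir(os.path.join(root, name))
        with open(os.path.join(root, "foo.pth"), "w") as f:
            f.write("# foo package configuration\n\nfoo\nbar\nbletch\n")
        with open(os.path.join(root, "bar.pth"), "w") as f:
            f.write("# bar package configuration\n\nbar\n")

        # Adds root itself, then handles each .pth file in sorted name order.
        site.addsitedir(root)

        prefix = os.path.abspath(root) + os.sep
        return [os.path.basename(p) for p in sys.path
                if p.startswith(prefix) and p not in saved_path]
    finally:
        # Undo the changes so the demonstration leaves no trace.
        sys.path[:] = saved_path
        shutil.rmtree(root)


added = demo_pth_processing()
print(added)
```

The list should contain ``bar`` followed by ``foo``: ``bletch`` does not exist, ``spam`` is not mentioned in either file, and the second mention of ``bar`` is skipped because no item is added twice.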
+ diff --git a/Doc/library/smtpd.rst b/Doc/library/smtpd.rst new file mode 100644 index 0000000..8927a64 --- /dev/null +++ b/Doc/library/smtpd.rst @@ -0,0 +1,72 @@ +:mod:`smtpd` --- SMTP Server +============================ + +.. module:: smtpd + :synopsis: A SMTP server implementation in Python. + +.. moduleauthor:: Barry Warsaw <barry@zope.com> +.. sectionauthor:: Moshe Zadka <moshez@moshez.org> + + + + +This module offers several classes to implement SMTP servers. One is a generic +do-nothing implementation, which can be overridden, while the other two offer +specific mail-sending strategies. + + +SMTPServer Objects +------------------ + + +.. class:: SMTPServer(localaddr, remoteaddr) + + Create a new :class:`SMTPServer` object, which binds to local address + *localaddr*. It will treat *remoteaddr* as an upstream SMTP relayer. It + inherits from :class:`asyncore.dispatcher`, and so will insert itself into + :mod:`asyncore`'s event loop on instantiation. + + +.. method:: SMTPServer.process_message(peer, mailfrom, rcpttos, data) + + Raise :exc:`NotImplementedError` exception. Override this in subclasses to do + something useful with this message. Whatever was passed in the constructor as + *remoteaddr* will be available as the :attr:`_remoteaddr` attribute. *peer* is + the remote host's address, *mailfrom* is the envelope originator, *rcpttos* are + the envelope recipients and *data* is a string containing the contents of the + e-mail (which should be in :rfc:`2822` format). + + +DebuggingServer Objects +----------------------- + + +.. class:: DebuggingServer(localaddr, remoteaddr) + + Create a new debugging server. Arguments are as per :class:`SMTPServer`. + Messages will be discarded, and printed on stdout. + + +PureProxy Objects +----------------- + + +.. class:: PureProxy(localaddr, remoteaddr) + + Create a new pure proxy server. Arguments are as per :class:`SMTPServer`. + Everything will be relayed to *remoteaddr*. 
Note that running this is likely
+   to turn your host into an open relay, so please be careful.
+
+
+MailmanProxy Objects
+--------------------
+
+
+.. class:: MailmanProxy(localaddr, remoteaddr)
+
+   Create a new pure proxy server. Arguments are as per :class:`SMTPServer`.
+   Everything will be relayed to *remoteaddr*, unless the local Mailman
+   configuration knows about an address, in which case it will be handled via
+   Mailman. Note that running this is likely to turn your host into an open
+   relay, so please be careful.
+
diff --git a/Doc/library/smtplib.rst b/Doc/library/smtplib.rst
new file mode 100644
index 0000000..fd898ca
--- /dev/null
+++ b/Doc/library/smtplib.rst
@@ -0,0 +1,347 @@
+
+:mod:`smtplib` --- SMTP protocol client
+=======================================
+
+.. module:: smtplib
+   :synopsis: SMTP protocol client (requires sockets).
+.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com>
+
+
+.. index::
+   pair: SMTP; protocol
+   single: Simple Mail Transfer Protocol
+
+The :mod:`smtplib` module defines an SMTP client session object that can be used
+to send mail to any Internet machine with an SMTP or ESMTP listener daemon. For
+details of SMTP and ESMTP operation, consult :rfc:`821` (Simple Mail Transfer
+Protocol) and :rfc:`1869` (SMTP Service Extensions).
+
+
+.. class:: SMTP([host[, port[, local_hostname[, timeout]]]])
+
+   An :class:`SMTP` instance encapsulates an SMTP connection. It has methods that
+   support a full repertoire of SMTP and ESMTP operations. If the optional *host* and
+   *port* parameters are given, the SMTP :meth:`connect` method is called with those
+   parameters during initialization. An :exc:`SMTPConnectError` is raised if the
+   specified host doesn't respond correctly. The optional *timeout* parameter
+   specifies a timeout in seconds for the connection attempt (if not specified, or
+   passed as ``None``, the global default timeout setting will be used).
+
+   For normal use, you should only require the initialization/connect,
+   :meth:`sendmail`, and :meth:`quit` methods. An example is included below.
+
+   .. versionchanged:: 2.6
+      *timeout* was added.
+
+
+.. class:: SMTP_SSL([host[, port[, local_hostname[, keyfile[, certfile[, timeout]]]]]])
+
+   An :class:`SMTP_SSL` instance behaves exactly the same as instances of
+   :class:`SMTP`. :class:`SMTP_SSL` should be used for situations where SSL is
+   required from the beginning of the connection and using :meth:`starttls` is not
+   appropriate. If *host* is not specified, the local host is used. If *port* is
+   omitted, the standard SMTP-over-SSL port (465) is used. *keyfile* and *certfile*
+   are also optional, and can contain a PEM formatted private key and certificate
+   chain file for the SSL connection. The optional *timeout* parameter specifies a
+   timeout in seconds for the connection attempt (if not specified, or passed as
+   ``None``, the global default timeout setting will be used).
+
+   .. versionchanged:: 2.6
+      *timeout* was added.
+
+
+.. class:: LMTP([host[, port[, local_hostname]]])
+
+   The LMTP protocol, which is very similar to ESMTP, is heavily based on the
+   standard SMTP client. It's common to use Unix sockets for LMTP, so the
+   :meth:`connect` method must support that as well as a regular host:port server.
+   To specify a Unix socket, use an absolute path for *host*, starting with ``'/'``.
+
+   Authentication is supported, using the regular SMTP mechanism. When using a Unix
+   socket, LMTP servers generally don't support or require any authentication, but
+   your mileage might vary.
+
+   .. versionadded:: 2.6
+
+The module also defines the following exceptions:
+
+
+.. exception:: SMTPException
+
+   Base exception class for all exceptions raised by this module.
+
+
+.. exception:: SMTPServerDisconnected
+
+   This exception is raised when the server unexpectedly disconnects, or when an
+   attempt is made to use the :class:`SMTP` instance before connecting it to a
+   server.
+
+
+.. exception:: SMTPResponseException
+
+   Base class for all exceptions that include an SMTP error code. These exceptions
+   are generated in some instances when the SMTP server returns an error code. The
+   error code is stored in the :attr:`smtp_code` attribute of the error, and the
+   :attr:`smtp_error` attribute is set to the error message.
+
+
+.. exception:: SMTPSenderRefused
+
+   Sender address refused. In addition to the attributes set on all
+   :exc:`SMTPResponseException` exceptions, this sets :attr:`sender` to the string
+   that the SMTP server refused.
+
+
+.. exception:: SMTPRecipientsRefused
+
+   All recipient addresses refused. The errors for each recipient are accessible
+   through the attribute :attr:`recipients`, which is a dictionary of exactly the
+   same sort as :meth:`SMTP.sendmail` returns.
+
+
+.. exception:: SMTPDataError
+
+   The SMTP server refused to accept the message data.
+
+
+.. exception:: SMTPConnectError
+
+   Error occurred during establishment of a connection with the server.
+
+
+.. exception:: SMTPHeloError
+
+   The server refused our ``HELO`` message.
+
+
+.. exception:: SMTPAuthenticationError
+
+   SMTP authentication went wrong. Most probably the server didn't accept the
+   username/password combination provided.
+
+
+.. seealso::
+
+   :rfc:`821` - Simple Mail Transfer Protocol
+      Protocol definition for SMTP. This document covers the model, operating
+      procedure, and protocol details for SMTP.
+
+   :rfc:`1869` - SMTP Service Extensions
+      Definition of the ESMTP extensions for SMTP. This describes a framework for
+      extending SMTP with new commands, supporting dynamic discovery of the commands
+      provided by the server, and defines a few additional commands.
+
+
+.. _smtp-objects:
+
+SMTP Objects
+------------
+
+An :class:`SMTP` instance has the following methods:
+
+
+.. method:: SMTP.set_debuglevel(level)
+
+   Set the debug output level.
A true value for *level* results in debug messages
+   for connection and for all messages sent to and received from the server.
+
+
+.. method:: SMTP.connect([host[, port]])
+
+   Connect to a host on a given port. The defaults are to connect to the local
+   host at the standard SMTP port (25). If the hostname ends with a colon (``':'``)
+   followed by a number, that suffix will be stripped off and the number
+   interpreted as the port number to use. This method is automatically invoked by
+   the constructor if a host is specified during instantiation.
+
+
+.. method:: SMTP.docmd(cmd[, argstring])
+
+   Send a command *cmd* to the server. The optional argument *argstring* is simply
+   concatenated to the command, separated by a space.
+
+   This returns a 2-tuple composed of a numeric response code and the actual
+   response line (multiline responses are joined into one long line).
+
+   In normal operation it should not be necessary to call this method explicitly.
+   It is used to implement other methods and may be useful for testing private
+   extensions.
+
+   If the connection to the server is lost while waiting for the reply,
+   :exc:`SMTPServerDisconnected` will be raised.
+
+
+.. method:: SMTP.helo([hostname])
+
+   Identify yourself to the SMTP server using ``HELO``. The *hostname* argument
+   defaults to the fully qualified domain name of the local host.
+
+   In normal operation it should not be necessary to call this method explicitly.
+   It will be implicitly called by :meth:`sendmail` when necessary.
+
+
+.. method:: SMTP.ehlo([hostname])
+
+   Identify yourself to an ESMTP server using ``EHLO``. The *hostname* argument
+   defaults to the fully qualified domain name of the local host. Examine the
+   response for ESMTP options and store them for use by :meth:`has_extn`.
+
+   Unless you wish to use :meth:`has_extn` before sending mail, it should not be
+   necessary to call this method explicitly. It will be implicitly called by
+   :meth:`sendmail` when necessary.
+
+
+.. 
method:: SMTP.has_extn(name) + + Return :const:`True` if *name* is in the set of SMTP service extensions returned + by the server, :const:`False` otherwise. Case is ignored. + + +.. method:: SMTP.verify(address) + + Check the validity of an address on this server using SMTP ``VRFY``. Returns a + tuple consisting of code 250 and a full :rfc:`822` address (including human + name) if the user address is valid. Otherwise returns an SMTP error code of 400 + or greater and an error string. + + .. note:: + + Many sites disable SMTP ``VRFY`` in order to foil spammers. + + +.. method:: SMTP.login(user, password) + + Log in on an SMTP server that requires authentication. The arguments are the + username and the password to authenticate with. If there has been no previous + ``EHLO`` or ``HELO`` command this session, this method tries ESMTP ``EHLO`` + first. This method will return normally if the authentication was successful, or + may raise the following exceptions: + + :exc:`SMTPHeloError` + The server didn't reply properly to the ``HELO`` greeting. + + :exc:`SMTPAuthenticationError` + The server didn't accept the username/password combination. + + :exc:`SMTPException` + No suitable authentication method was found. + + +.. method:: SMTP.starttls([keyfile[, certfile]]) + + Put the SMTP connection in TLS (Transport Layer Security) mode. All SMTP + commands that follow will be encrypted. You should then call :meth:`ehlo` + again. + + If *keyfile* and *certfile* are provided, these are passed to the :mod:`socket` + module's :func:`ssl` function. + + +.. method:: SMTP.sendmail(from_addr, to_addrs, msg[, mail_options, rcpt_options]) + + Send mail. The required arguments are an :rfc:`822` from-address string, a list + of :rfc:`822` to-address strings (a bare string will be treated as a list with 1 + address), and a message string. The caller may pass a list of ESMTP options + (such as ``8bitmime``) to be used in ``MAIL FROM`` commands as *mail_options*. 
+ ESMTP options (such as ``DSN`` commands) that should be used with all ``RCPT`` + commands can be passed as *rcpt_options*. (If you need to use different ESMTP + options to different recipients you have to use the low-level methods such as + :meth:`mail`, :meth:`rcpt` and :meth:`data` to send the message.) + + .. note:: + + The *from_addr* and *to_addrs* parameters are used to construct the message + envelope used by the transport agents. The :class:`SMTP` does not modify the + message headers in any way. + + If there has been no previous ``EHLO`` or ``HELO`` command this session, this + method tries ESMTP ``EHLO`` first. If the server does ESMTP, message size and + each of the specified options will be passed to it (if the option is in the + feature set the server advertises). If ``EHLO`` fails, ``HELO`` will be tried + and ESMTP options suppressed. + + This method will return normally if the mail is accepted for at least one + recipient. Otherwise it will throw an exception. That is, if this method does + not throw an exception, then someone should get your mail. If this method does + not throw an exception, it returns a dictionary, with one entry for each + recipient that was refused. Each entry contains a tuple of the SMTP error code + and the accompanying error message sent by the server. + + This method may raise the following exceptions: + + :exc:`SMTPRecipientsRefused` + All recipients were refused. Nobody got the mail. The :attr:`recipients` + attribute of the exception object is a dictionary with information about the + refused recipients (like the one returned when at least one recipient was + accepted). + + :exc:`SMTPHeloError` + The server didn't reply properly to the ``HELO`` greeting. + + :exc:`SMTPSenderRefused` + The server didn't accept the *from_addr*. + + :exc:`SMTPDataError` + The server replied with an unexpected error code (other than a refusal of a + recipient). 
+
+   Unless otherwise noted, the connection will be open even after an exception is
+   raised.
+
+
+.. method:: SMTP.quit()
+
+   Terminate the SMTP session and close the connection.
+
+Low-level methods corresponding to the standard SMTP/ESMTP commands ``HELP``,
+``RSET``, ``NOOP``, ``MAIL``, ``RCPT``, and ``DATA`` are also supported.
+Normally these do not need to be called directly, so they are not documented
+here. For details, consult the module code.
+
+
+.. _smtp-example:
+
+SMTP Example
+------------
+
+This example prompts the user for addresses needed in the message envelope ('To'
+and 'From' addresses), and the message to be delivered. Note that the headers
+to be included with the message must be included in the message as entered; this
+example doesn't do any processing of the :rfc:`822` headers. In particular, the
+'To' and 'From' addresses must be included in the message headers explicitly. ::
+
+   import smtplib
+
+   def raw_input(prompt=''):    # the prompt is optional; see the loop below
+       import sys
+       sys.stdout.write(prompt)
+       sys.stdout.flush()
+       return sys.stdin.readline()
+
+   def prompt(prompt):
+       return raw_input(prompt).strip()
+
+   fromaddr = prompt("From: ")
+   toaddrs  = prompt("To: ").split()
+   print("Enter message, end with ^D (Unix) or ^Z (Windows):")
+
+   # Add the From: and To: headers at the start!
+   msg = ("From: %s\r\nTo: %s\r\n\r\n"
+          % (fromaddr, ", ".join(toaddrs)))
+   while 1:
+       try:
+           line = raw_input()
+       except EOFError:
+           break
+       if not line:             # readline() returns '' at end of input
+           break
+       msg = msg + line
+
+   print("Message length is " + repr(len(msg)))
+
+   server = smtplib.SMTP('localhost')
+   server.set_debuglevel(1)
+   server.sendmail(fromaddr, toaddrs, msg)
+   server.quit()
+
diff --git a/Doc/library/sndhdr.rst b/Doc/library/sndhdr.rst
new file mode 100644
index 0000000..90d71a9
--- /dev/null
+++ b/Doc/library/sndhdr.rst
@@ -0,0 +1,42 @@
+
+:mod:`sndhdr` --- Determine type of sound file
+==============================================
+
+.. module:: sndhdr
+   :synopsis: Determine type of a sound file.
+.. 
sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
+
+
+.. % Based on comments in the module source file.
+
+.. index::
+   single: A-LAW
+   single: u-LAW
+
+The :mod:`sndhdr` module provides utility functions which attempt to determine
+the type of sound data in a file. When these functions are able to determine
+what type of sound data is stored in a file, they return a tuple ``(type,
+sampling_rate, channels, frames, bits_per_sample)``. The value for *type*
+indicates the data type and will be one of the strings ``'aifc'``, ``'aiff'``,
+``'au'``, ``'hcom'``, ``'sndr'``, ``'sndt'``, ``'voc'``, ``'wav'``, ``'8svx'``,
+``'sb'``, ``'ub'``, or ``'ul'``. The *sampling_rate* will be either the actual
+value or ``0`` if unknown or difficult to decode. Similarly, *channels* will be
+either the number of channels or ``0`` if it cannot be determined or if the
+value is difficult to decode. The value for *frames* will be either the number
+of frames or ``-1``. The last item in the tuple, *bits_per_sample*, will either
+be the sample size in bits or ``'A'`` for A-LAW or ``'U'`` for u-LAW.
+
+
+.. function:: what(filename)
+
+   Determine the type of sound data stored in the file *filename* using
+   :func:`whathdr`. If it succeeds, return a tuple as described above; otherwise
+   ``None`` is returned.
+
+
+.. function:: whathdr(filename)
+
+   Determine the type of sound data stored in a file based on the file header.
+   The name of the file is given by *filename*. This function returns a tuple as
+   described above on success, or ``None`` otherwise.
+
diff --git a/Doc/library/socket.rst b/Doc/library/socket.rst
new file mode 100644
index 0000000..0ec4461
--- /dev/null
+++ b/Doc/library/socket.rst
@@ -0,0 +1,941 @@
+
+:mod:`socket` --- Low-level networking interface
+================================================
+
+.. module:: socket
+   :synopsis: Low-level networking interface.
+
+
+This module provides access to the BSD *socket* interface.
It is available on +all modern Unix systems, Windows, MacOS, BeOS, OS/2, and probably additional +platforms. + +.. note:: + + Some behavior may be platform dependent, since calls are made to the operating + system socket APIs. + +For an introduction to socket programming (in C), see the following papers: An +Introductory 4.3BSD Interprocess Communication Tutorial, by Stuart Sechrest and +An Advanced 4.3BSD Interprocess Communication Tutorial, by Samuel J. Leffler et +al, both in the UNIX Programmer's Manual, Supplementary Documents 1 (sections +PS1:7 and PS1:8). The platform-specific reference material for the various +socket-related system calls is also a valuable source of information on the +details of socket semantics. For Unix, refer to the manual pages; for Windows, +see the WinSock (or Winsock 2) specification. For IPv6-ready APIs, readers may +want to refer to :rfc:`2553` titled Basic Socket Interface Extensions for IPv6. + +.. index:: object: socket + +The Python interface is a straightforward transliteration of the Unix system +call and library interface for sockets to Python's object-oriented style: the +:func:`socket` function returns a :dfn:`socket object` whose methods implement +the various socket system calls. Parameter types are somewhat higher-level than +in the C interface: as with :meth:`read` and :meth:`write` operations on Python +files, buffer allocation on receive operations is automatic, and buffer length +is implicit on send operations. + +Socket addresses are represented as follows: A single string is used for the +:const:`AF_UNIX` address family. A pair ``(host, port)`` is used for the +:const:`AF_INET` address family, where *host* is a string representing either a +hostname in Internet domain notation like ``'daring.cwi.nl'`` or an IPv4 address +like ``'100.50.200.5'``, and *port* is an integral port number. 
For the +:const:`AF_INET6` address family, a four-tuple ``(host, port, flowinfo, +scopeid)`` is used, where *flowinfo* and *scopeid* represent the ``sin6_flowinfo`` +and ``sin6_scope_id`` members of :const:`struct sockaddr_in6` in C. For +:mod:`socket` module methods, *flowinfo* and *scopeid* can be omitted just for +backward compatibility. Note, however, that omitting *scopeid* can cause problems +in manipulating scoped IPv6 addresses. Other address families are currently not +supported. The address format required by a particular socket object is +automatically selected based on the address family specified when the socket +object was created. + +For IPv4 addresses, two special forms are accepted instead of a host address: +the empty string represents :const:`INADDR_ANY`, and the string +``'<broadcast>'`` represents :const:`INADDR_BROADCAST`. This behavior is not +available for IPv6 for backward compatibility; therefore, you may want to avoid +these forms if you intend to support IPv6 with your Python programs. + +If you use a hostname in the *host* portion of an IPv4/v6 socket address, the +program may show nondeterministic behavior, as Python uses the first address +returned from the DNS resolution. The socket address will be resolved +differently into an actual IPv4/v6 address, depending on the results from DNS +resolution and/or the host configuration. For deterministic behavior use a +numeric address in the *host* portion. + +.. versionadded:: 2.5 + AF_NETLINK sockets are represented as pairs ``pid, groups``. + +All errors raise exceptions. The normal exceptions for invalid argument types +and out-of-memory conditions can be raised; errors related to socket or address +semantics raise the error :exc:`socket.error`. + +Non-blocking mode is supported through :meth:`setblocking`. A generalization of +this based on timeouts is supported through :meth:`settimeout`. + +The module :mod:`socket` exports the following constants and functions: + + +.. exception:: error + + .. 
index:: module: errno + + This exception is raised for socket-related errors. The accompanying value is + either a string telling what went wrong or a pair ``(errno, string)`` + representing an error returned by a system call, similar to the value + accompanying :exc:`os.error`. See the module :mod:`errno`, which contains names + for the error codes defined by the underlying operating system. + + +.. exception:: herror + + This exception is raised for address-related errors, i.e. for functions that use + *h_errno* in the C API, including :func:`gethostbyname_ex` and + :func:`gethostbyaddr`. + + The accompanying value is a pair ``(h_errno, string)`` representing an error + returned by a library call. *string* represents the description of *h_errno*, as + returned by the :cfunc:`hstrerror` C function. + + +.. exception:: gaierror + + This exception is raised for address-related errors, for :func:`getaddrinfo` and + :func:`getnameinfo`. The accompanying value is a pair ``(error, string)`` + representing an error returned by a library call. *string* represents the + description of *error*, as returned by the :cfunc:`gai_strerror` C function. The + *error* value will match one of the :const:`EAI_\*` constants defined in this + module. + + +.. exception:: timeout + + This exception is raised when a timeout occurs on a socket which has had + timeouts enabled via a prior call to :meth:`settimeout`. The accompanying value + is a string whose value is currently always "timed out". + + .. versionadded:: 2.3 + + +.. data:: AF_UNIX + AF_INET + AF_INET6 + + These constants represent the address (and protocol) families, used for the + first argument to :func:`socket`. If the :const:`AF_UNIX` constant is not + defined then this protocol is unsupported. + + +.. data:: SOCK_STREAM + SOCK_DGRAM + SOCK_RAW + SOCK_RDM + SOCK_SEQPACKET + + These constants represent the socket types, used for the second argument to + :func:`socket`. 
(Only :const:`SOCK_STREAM` and :const:`SOCK_DGRAM` appear to be + generally useful.) + + +.. data:: SO_* + SOMAXCONN + MSG_* + SOL_* + IPPROTO_* + IPPORT_* + INADDR_* + IP_* + IPV6_* + EAI_* + AI_* + NI_* + TCP_* + + Many constants of these forms, documented in the Unix documentation on sockets + and/or the IP protocol, are also defined in the socket module. They are + generally used in arguments to the :meth:`setsockopt` and :meth:`getsockopt` + methods of socket objects. In most cases, only those symbols that are defined + in the Unix header files are defined; for a few symbols, default values are + provided. + + +.. data:: has_ipv6 + + This constant contains a boolean value which indicates if IPv6 is supported on + this platform. + + .. versionadded:: 2.3 + + +.. function:: create_connection(address[, timeout]) + + Connects to the *address* received (as usual, a ``(host, port)`` pair), with an + optional timeout for the connection. Especially useful for higher-level + protocols, it is not normally used directly from application-level code. + Passing the optional *timeout* parameter will set the timeout on the socket + instance (if it is not given or ``None``, the global default timeout setting is + used). + + .. versionadded:: 2.6 + + +.. function:: getaddrinfo(host, port[, family[, socktype[, proto[, flags]]]]) + + Resolves the *host*/*port* argument into a sequence of 5-tuples that contain + all the necessary arguments for socket manipulation. *host* is a domain + name, a string representation of an IPv4/v6 address or ``None``. *port* is a string + service name (like ``'http'``), a numeric port number or ``None``. + + The rest of the arguments are optional and must be numeric if specified. For + *host* and *port*, by passing either an empty string or ``None``, you can pass + ``NULL`` to the C API. 
The :func:`getaddrinfo` function returns a list of + 5-tuples with the following structure: + + ``(family, socktype, proto, canonname, sockaddr)`` + + *family*, *socktype*, and *proto* are all integers and are meant to be passed to the + :func:`socket` function. *canonname* is a string representing the canonical name + of the *host*. It can be a numeric IPv4/v6 address when :const:`AI_CANONNAME` is + specified for a numeric *host*. *sockaddr* is a tuple describing a socket + address, as described above. See the source for :mod:`httplib` and other + library modules for typical usage of the function. + + .. versionadded:: 2.2 + + +.. function:: getfqdn([name]) + + Return a fully qualified domain name for *name*. If *name* is omitted or empty, + it is interpreted as the local host. To find the fully qualified name, the + hostname returned by :func:`gethostbyaddr` is checked, then aliases for the + host, if available. The first name which includes a period is selected. In + case no fully qualified domain name is available, the hostname as returned by + :func:`gethostname` is returned. + + .. versionadded:: 2.0 + + +.. function:: gethostbyname(hostname) + + Translate a host name to IPv4 address format. The IPv4 address is returned as a + string, such as ``'100.50.200.5'``. If the host name is an IPv4 address itself + it is returned unchanged. See :func:`gethostbyname_ex` for a more complete + interface. :func:`gethostbyname` does not support IPv6 name resolution, and + :func:`getaddrinfo` should be used instead for IPv4/v6 dual stack support. + + +.. function:: gethostbyname_ex(hostname) + + Translate a host name to IPv4 address format, extended interface. 
Return a + triple ``(hostname, aliaslist, ipaddrlist)`` where *hostname* is the primary + host name of the given host, *aliaslist* is a (possibly + empty) list of alternative host names for the same address, and *ipaddrlist* is + a list of IPv4 addresses for the same interface on the same host (often but not + always a single address). :func:`gethostbyname_ex` does not support IPv6 name + resolution, and :func:`getaddrinfo` should be used instead for IPv4/v6 dual + stack support. + + +.. function:: gethostname() + + Return a string containing the hostname of the machine where the Python + interpreter is currently executing. If you want to know the current machine's IP + address, you may want to use ``gethostbyname(gethostname())``. This operation + assumes that there is a valid address-to-host mapping for the host, and the + assumption does not always hold. Note: :func:`gethostname` doesn't always return + the fully qualified domain name; use ``getfqdn()`` (see above). + + +.. function:: gethostbyaddr(ip_address) + + Return a triple ``(hostname, aliaslist, ipaddrlist)`` where *hostname* is the + primary host name responding to the given *ip_address*, *aliaslist* is a + (possibly empty) list of alternative host names for the same address, and + *ipaddrlist* is a list of IPv4/v6 addresses for the same interface on the same + host (most likely containing only a single address). To find the fully qualified + domain name, use the function :func:`getfqdn`. :func:`gethostbyaddr` supports + both IPv4 and IPv6. + + +.. function:: getnameinfo(sockaddr, flags) + + Translate a socket address *sockaddr* into a 2-tuple ``(host, port)``. Depending + on the settings of *flags*, the result can contain a fully-qualified domain name + or numeric address representation in *host*. Similarly, *port* can contain a + string port name or a numeric port number. + + .. versionadded:: 2.2 + + +.. 
function:: getprotobyname(protocolname) + + Translate an Internet protocol name (for example, ``'icmp'``) to a constant + suitable for passing as the (optional) third argument to the :func:`socket` + function. This is usually only needed for sockets opened in "raw" mode + (:const:`SOCK_RAW`); for the normal socket modes, the correct protocol is chosen + automatically if the protocol is omitted or zero. + + +.. function:: getservbyname(servicename[, protocolname]) + + Translate an Internet service name and protocol name to a port number for that + service. The optional protocol name, if given, should be ``'tcp'`` or + ``'udp'``, otherwise any protocol will match. + + +.. function:: getservbyport(port[, protocolname]) + + Translate an Internet port number and protocol name to a service name for that + service. The optional protocol name, if given, should be ``'tcp'`` or + ``'udp'``, otherwise any protocol will match. + + +.. function:: socket([family[, type[, proto]]]) + + Create a new socket using the given address family, socket type and protocol + number. The address family should be :const:`AF_INET` (the default), + :const:`AF_INET6` or :const:`AF_UNIX`. The socket type should be + :const:`SOCK_STREAM` (the default), :const:`SOCK_DGRAM` or perhaps one of the + other ``SOCK_`` constants. The protocol number is usually zero and may be + omitted in that case. + + +.. function:: ssl(sock[, keyfile, certfile]) + + Initiate a SSL connection over the socket *sock*. *keyfile* is the name of a PEM + formatted file that contains your private key. *certfile* is a PEM formatted + certificate chain file. On success, a new :class:`SSLObject` is returned. + + .. warning:: + + This does not do any certificate verification! + + +.. function:: socketpair([family[, type[, proto]]]) + + Build a pair of connected socket objects using the given address family, socket + type, and protocol number. 
Address family, socket type, and protocol number are + as for the :func:`socket` function above. The default family is :const:`AF_UNIX` + if defined on the platform; otherwise, the default is :const:`AF_INET`. + Availability: Unix. + + .. versionadded:: 2.4 + + +.. function:: fromfd(fd, family, type[, proto]) + + Duplicate the file descriptor *fd* (an integer as returned by a file object's + :meth:`fileno` method) and build a socket object from the result. Address + family, socket type and protocol number are as for the :func:`socket` function + above. The file descriptor should refer to a socket, but this is not checked --- + subsequent operations on the object may fail if the file descriptor is invalid. + This function is rarely needed, but can be used to get or set socket options on + a socket passed to a program as standard input or output (such as a server + started by the Unix inet daemon). The socket is assumed to be in blocking mode. + Availability: Unix. + + +.. function:: ntohl(x) + + Convert 32-bit positive integers from network to host byte order. On machines + where the host byte order is the same as network byte order, this is a no-op; + otherwise, it performs a 4-byte swap operation. + + +.. function:: ntohs(x) + + Convert 16-bit positive integers from network to host byte order. On machines + where the host byte order is the same as network byte order, this is a no-op; + otherwise, it performs a 2-byte swap operation. + + +.. function:: htonl(x) + + Convert 32-bit positive integers from host to network byte order. On machines + where the host byte order is the same as network byte order, this is a no-op; + otherwise, it performs a 4-byte swap operation. + + +.. function:: htons(x) + + Convert 16-bit positive integers from host to network byte order. On machines + where the host byte order is the same as network byte order, this is a no-op; + otherwise, it performs a 2-byte swap operation. + + +.. 
function:: inet_aton(ip_string) + + Convert an IPv4 address from dotted-quad string format (for example, + ``'123.45.67.89'``) to 32-bit packed binary format, as a string four characters in + length. This is useful when conversing with a program that uses the standard C + library and needs objects of type :ctype:`struct in_addr`, which is the C type + for the 32-bit packed binary data this function returns. + + If the IPv4 address string passed to this function is invalid, + :exc:`socket.error` will be raised. Note that exactly what is valid depends on + the underlying C implementation of :cfunc:`inet_aton`. + + :func:`inet_aton` does not support IPv6, and :func:`inet_pton` should be used + instead for IPv4/v6 dual stack support. + + +.. function:: inet_ntoa(packed_ip) + + Convert a 32-bit packed IPv4 address (a string four characters in length) to its + standard dotted-quad string representation (for example, ``'123.45.67.89'``). This + is useful when conversing with a program that uses the standard C library and + needs objects of type :ctype:`struct in_addr`, which is the C type for the + 32-bit packed binary data this function takes as an argument. + + If the string passed to this function is not exactly 4 bytes in length, + :exc:`socket.error` will be raised. :func:`inet_ntoa` does not support IPv6, and + :func:`inet_ntop` should be used instead for IPv4/v6 dual stack support. + + +.. function:: inet_pton(address_family, ip_string) + + Convert an IP address from its family-specific string format to a packed, binary + format. :func:`inet_pton` is useful when a library or network protocol calls for + an object of type :ctype:`struct in_addr` (similar to :func:`inet_aton`) or + :ctype:`struct in6_addr`. + + Supported values for *address_family* are currently :const:`AF_INET` and + :const:`AF_INET6`. If the IP address string *ip_string* is invalid, + :exc:`socket.error` will be raised. 
Note that exactly what is valid depends on + both the value of *address_family* and the underlying implementation of + :cfunc:`inet_pton`. + + Availability: Unix (maybe not all platforms). + + .. versionadded:: 2.3 + + +.. function:: inet_ntop(address_family, packed_ip) + + Convert a packed IP address (a string of some number of characters) to its + standard, family-specific string representation (for example, ``'7.10.0.5'`` or + ``'5aef:2b::8'``). :func:`inet_ntop` is useful when a library or network protocol + returns an object of type :ctype:`struct in_addr` (similar to :func:`inet_ntoa`) + or :ctype:`struct in6_addr`. + + Supported values for *address_family* are currently :const:`AF_INET` and + :const:`AF_INET6`. If the string *packed_ip* is not the correct length for the + specified address family, :exc:`ValueError` will be raised. A + :exc:`socket.error` is raised for errors from the call to :func:`inet_ntop`. + + Availability: Unix (maybe not all platforms). + + .. versionadded:: 2.3 + + +.. function:: getdefaulttimeout() + + Return the default timeout in floating seconds for new socket objects. A value + of ``None`` indicates that new socket objects have no timeout. When the socket + module is first imported, the default is ``None``. + + .. versionadded:: 2.3 + + +.. function:: setdefaulttimeout(timeout) + + Set the default timeout in floating seconds for new socket objects. A value of + ``None`` indicates that new socket objects have no timeout. When the socket + module is first imported, the default is ``None``. + + .. versionadded:: 2.3 + + +.. data:: SocketType + + This is a Python type object that represents the socket object type. It is the + same as ``type(socket(...))``. + + +.. seealso:: + + Module :mod:`SocketServer` + Classes that simplify writing network servers. + + +.. _socket-objects: + +Socket Objects +-------------- + +Socket objects have the following methods. 
Except for :meth:`makefile` these +correspond to Unix system calls applicable to sockets. + + +.. method:: socket.accept() + + Accept a connection. The socket must be bound to an address and listening for + connections. The return value is a pair ``(conn, address)`` where *conn* is a + *new* socket object usable to send and receive data on the connection, and + *address* is the address bound to the socket on the other end of the connection. + + +.. method:: socket.bind(address) + + Bind the socket to *address*. The socket must not already be bound. (The format + of *address* depends on the address family --- see above.) + + .. note:: + + This method has historically accepted a pair of parameters for :const:`AF_INET` + addresses instead of only a tuple. This was never intentional and is no longer + available in Python 2.0 and later. + + +.. method:: socket.close() + + Close the socket. All future operations on the socket object will fail. The + remote end will receive no more data (after queued data is flushed). Sockets are + automatically closed when they are garbage-collected. + + +.. method:: socket.connect(address) + + Connect to a remote socket at *address*. (The format of *address* depends on the + address family --- see above.) + + .. note:: + + This method has historically accepted a pair of parameters for :const:`AF_INET` + addresses instead of only a tuple. This was never intentional and is no longer + available in Python 2.0 and later. + + +.. method:: socket.connect_ex(address) + + Like ``connect(address)``, but return an error indicator instead of raising an + exception for errors returned by the C-level :cfunc:`connect` call (other + problems, such as "host not found," can still raise exceptions). The error + indicator is ``0`` if the operation succeeded, otherwise the value of the + :cdata:`errno` variable. This is useful to support, for example, asynchronous + connects. + + .. 
note:: + + This method has historically accepted a pair of parameters for :const:`AF_INET` + addresses instead of only a tuple. This was never intentional and is no longer + available in Python 2.0 and later. + + +.. method:: socket.fileno() + + Return the socket's file descriptor (a small integer). This is useful with + :func:`select.select`. + + Under Windows the small integer returned by this method cannot be used where a + file descriptor can be used (such as :func:`os.fdopen`). Unix does not have + this limitation. + + +.. method:: socket.getpeername() + + Return the remote address to which the socket is connected. This is useful to + find out the port number of a remote IPv4/v6 socket, for instance. (The format + of the address returned depends on the address family --- see above.) On some + systems this function is not supported. + + +.. method:: socket.getsockname() + + Return the socket's own address. This is useful to find out the port number of + an IPv4/v6 socket, for instance. (The format of the address returned depends on + the address family --- see above.) + + +.. method:: socket.getsockopt(level, optname[, buflen]) + + Return the value of the given socket option (see the Unix man page + :manpage:`getsockopt(2)`). The needed symbolic constants (:const:`SO_\*` etc.) + are defined in this module. If *buflen* is absent, an integer option is assumed + and its integer value is returned by the function. If *buflen* is present, it + specifies the maximum length of the buffer used to receive the option in, and + this buffer is returned as a string. It is up to the caller to decode the + contents of the buffer (see the optional built-in module :mod:`struct` for a way + to decode C structures encoded as strings). + + +.. method:: socket.listen(backlog) + + Listen for connections made to the socket. The *backlog* argument specifies the + maximum number of queued connections and should be at least 1; the maximum value + is system-dependent (usually 5). + + +.. 
method:: socket.makefile([mode[, bufsize]]) + + .. index:: single: I/O control; buffering + + Return a :dfn:`file object` associated with the socket. (File objects are + described in :ref:`bltin-file-objects`.) The file object + references a :cfunc:`dup`\ ped version of the socket file descriptor, so the + file object and socket object may be closed or garbage-collected independently. + The socket must be in blocking mode (it can not have a timeout). The optional + *mode* and *bufsize* arguments are interpreted the same way as by the built-in + :func:`file` function; see :ref:`built-in-funcs` for more information. + + +.. method:: socket.recv(bufsize[, flags]) + + Receive data from the socket. The return value is a string representing the + data received. The maximum amount of data to be received at once is specified + by *bufsize*. See the Unix manual page :manpage:`recv(2)` for the meaning of + the optional argument *flags*; it defaults to zero. + + .. note:: + + For best match with hardware and network realities, the value of *bufsize* + should be a relatively small power of 2, for example, 4096. + + +.. method:: socket.recvfrom(bufsize[, flags]) + + Receive data from the socket. The return value is a pair ``(string, address)`` + where *string* is a string representing the data received and *address* is the + address of the socket sending the data. See the Unix manual page + :manpage:`recv(2)` for the meaning of the optional argument *flags*; it defaults + to zero. (The format of *address* depends on the address family --- see above.) + + +.. method:: socket.recvfrom_into(buffer[, nbytes[, flags]]) + + Receive data from the socket, writing it into *buffer* instead of creating a + new string. The return value is a pair ``(nbytes, address)`` where *nbytes* is + the number of bytes received and *address* is the address of the socket sending + the data. 
See the Unix manual page :manpage:`recv(2)` for the meaning of the + optional argument *flags*; it defaults to zero. (The format of *address* + depends on the address family --- see above.) + + .. versionadded:: 2.5 + + +.. method:: socket.recv_into(buffer[, nbytes[, flags]]) + + Receive up to *nbytes* bytes from the socket, storing the data into a buffer + rather than creating a new string. If *nbytes* is not specified (or 0), + receive up to the size available in the given buffer. See the Unix manual page + :manpage:`recv(2)` for the meaning of the optional argument *flags*; it defaults + to zero. + + .. versionadded:: 2.5 + + +.. method:: socket.send(string[, flags]) + + Send data to the socket. The socket must be connected to a remote socket. The + optional *flags* argument has the same meaning as for :meth:`recv` above. + Returns the number of bytes sent. Applications are responsible for checking that + all data has been sent; if only some of the data was transmitted, the + application needs to attempt delivery of the remaining data. + + +.. method:: socket.sendall(string[, flags]) + + Send data to the socket. The socket must be connected to a remote socket. The + optional *flags* argument has the same meaning as for :meth:`recv` above. + Unlike :meth:`send`, this method continues to send data from *string* until + either all data has been sent or an error occurs. ``None`` is returned on + success. On error, an exception is raised, and there is no way to determine how + much data, if any, was successfully sent. + + +.. method:: socket.sendto(string[, flags], address) + + Send data to the socket. The socket should not be connected to a remote socket, + since the destination socket is specified by *address*. The optional *flags* + argument has the same meaning as for :meth:`recv` above. Return the number of + bytes sent. (The format of *address* depends on the address family --- see + above.) + + +.. 
method:: socket.setblocking(flag) + + Set blocking or non-blocking mode of the socket: if *flag* is 0, the socket is + set to non-blocking, else to blocking mode. Initially all sockets are in + blocking mode. In non-blocking mode, if a :meth:`recv` call doesn't find any + data, or if a :meth:`send` call can't immediately dispose of the data, an + :exc:`error` exception is raised; in blocking mode, the calls block until they + can proceed. ``s.setblocking(0)`` is equivalent to ``s.settimeout(0)``; + ``s.setblocking(1)`` is equivalent to ``s.settimeout(None)``. + + +.. method:: socket.settimeout(value) + + Set a timeout on blocking socket operations. The *value* argument can be a + nonnegative float expressing seconds, or ``None``. If a float is given, + subsequent socket operations will raise a :exc:`timeout` exception if the + timeout period *value* has elapsed before the operation has completed. Setting + a timeout of ``None`` disables timeouts on socket operations. + ``s.settimeout(0.0)`` is equivalent to ``s.setblocking(0)``; + ``s.settimeout(None)`` is equivalent to ``s.setblocking(1)``. + + .. versionadded:: 2.3 + + +.. method:: socket.gettimeout() + + Return the timeout in floating seconds associated with socket operations, or + ``None`` if no timeout is set. This reflects the last call to + :meth:`setblocking` or :meth:`settimeout`. + + .. versionadded:: 2.3 + +Some notes on socket blocking and timeouts: A socket object can be in one of +three modes: blocking, non-blocking, or timeout. Sockets are always created in +blocking mode. In blocking mode, operations block until complete. In +non-blocking mode, operations fail (with an error that is unfortunately +system-dependent) if they cannot be completed immediately. In timeout mode, +operations fail if they cannot be completed within the timeout specified for the +socket. The :meth:`setblocking` method is simply a shorthand for certain +:meth:`settimeout` calls. 
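
The relationship between the three modes described above can be checked with a short sketch. It is only an illustration under stated assumptions: it targets a current Python 3 interpreter rather than the 2.x-era API documented here, it uses ``socket.socketpair()`` so that no network access is needed, and the helper name ``describe_mode`` and the 0.05-second timeout are hypothetical choices that are not part of the module.

```python
import socket


def describe_mode(sock):
    """Classify a socket as 'blocking', 'non-blocking' or 'timeout'.

    Hypothetical helper, not part of the socket module: it only
    interprets the value returned by gettimeout().
    """
    timeout = sock.gettimeout()
    if timeout is None:
        return 'blocking'
    if timeout == 0.0:
        return 'non-blocking'
    return 'timeout'


def demo():
    # A connected pair of local sockets; nothing is ever sent on them.
    a, b = socket.socketpair()
    results = []
    try:
        results.append(describe_mode(a))   # mode right after creation
        a.setblocking(0)                   # shorthand for settimeout(0.0)
        results.append(describe_mode(a))
        a.settimeout(0.05)                 # switch to timeout mode
        results.append(describe_mode(a))
        try:
            a.recv(1)                      # no data arrives, so this times out
        except socket.timeout:
            results.append('timed out')
        a.setblocking(1)                   # shorthand for settimeout(None)
        results.append(describe_mode(a))
    finally:
        a.close()
        b.close()
    return results


if __name__ == '__main__':
    print(demo())
```

The sketch assumes that no default timeout has been installed with ``setdefaulttimeout()``; otherwise the first reported mode would differ.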
+ +Timeout mode internally sets the socket in non-blocking mode. The blocking and +timeout modes are shared between file descriptors and socket objects that refer +to the same network endpoint. A consequence of this is that file objects +returned by the :meth:`makefile` method must only be used when the socket is in +blocking mode; in timeout or non-blocking mode file operations that cannot be +completed immediately will fail. + +Note that the :meth:`connect` operation is subject to the timeout setting, and +in general it is recommended to call :meth:`settimeout` before calling +:meth:`connect`. + + +.. method:: socket.setsockopt(level, optname, value) + + .. index:: module: struct + + Set the value of the given socket option (see the Unix manual page + :manpage:`setsockopt(2)`). The needed symbolic constants are defined in the + :mod:`socket` module (:const:`SO_\*` etc.). The value can be an integer or a + string representing a buffer. In the latter case it is up to the caller to + ensure that the string contains the proper bits (see the optional built-in + module :mod:`struct` for a way to encode C structures as strings). + + +.. method:: socket.shutdown(how) + + Shut down one or both halves of the connection. If *how* is :const:`SHUT_RD`, + further receives are disallowed. If *how* is :const:`SHUT_WR`, further sends + are disallowed. If *how* is :const:`SHUT_RDWR`, further sends and receives are + disallowed. + +Note that there are no methods :meth:`read` or :meth:`write`; use :meth:`recv` +and :meth:`send` without *flags* argument instead. + +Socket objects also have these (read-only) attributes that correspond to the +values given to the :class:`socket` constructor. + + +.. attribute:: socket.family + + The socket family. + + .. versionadded:: 2.5 + + +.. attribute:: socket.type + + The socket type. + + .. versionadded:: 2.5 + + +.. attribute:: socket.proto + + The socket protocol. + + .. versionadded:: 2.5 + + +.. 
_ssl-objects: + +SSL Objects +----------- + +SSL objects have the following methods. + + +.. method:: SSL.write(s) + + Writes the string *s* to the object's SSL connection. The return value is + the number of bytes written. + + +.. method:: SSL.read([n]) + + If *n* is provided, read *n* bytes from the SSL connection, otherwise read until + EOF. The return value is a string of the bytes read. + + +.. method:: SSL.server() + + Returns a string describing the server's certificate. Useful for debugging + purposes; do not parse the content of this string because its format can't be + parsed unambiguously. + + +.. method:: SSL.issuer() + + Returns a string describing the issuer of the server's certificate. Useful for + debugging purposes; do not parse the content of this string because its format + can't be parsed unambiguously. + + +.. _socket-example: + +Example +------- + +Here are four minimal example programs using the TCP/IP protocol: a server that +echoes all data that it receives back (servicing only one client), and a client +using it. Note that a server must perform the sequence :func:`socket`, +:meth:`bind`, :meth:`listen`, :meth:`accept` (possibly repeating the +:meth:`accept` to service more than one client), while a client only needs the +sequence :func:`socket`, :meth:`connect`. Also note that the server does not +:meth:`send`/:meth:`recv` on the socket it is listening on but on the new +socket returned by :meth:`accept`. + +The first two examples support IPv4 only. 
:: + + # Echo server program + import socket + + HOST = '' # Symbolic name meaning the local host + PORT = 50007 # Arbitrary non-privileged port + s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + s.bind((HOST, PORT)) + s.listen(1) + conn, addr = s.accept() + print 'Connected by', addr + while 1: + data = conn.recv(1024) + if not data: break + conn.send(data) + conn.close() + +:: + + # Echo client program + import socket + + HOST = 'daring.cwi.nl' # The remote host + PORT = 50007 # The same port as used by the server + s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + s.connect((HOST, PORT)) + s.send('Hello, world') + data = s.recv(1024) + s.close() + print 'Received', repr(data) + +The next two examples are identical to the above two, but support both IPv4 and +IPv6. The server side will listen to the first address family available (it +should listen to both instead). On most IPv6-ready systems, IPv6 will take +precedence and the server may not accept IPv4 traffic. The client side will try +to connect to all the addresses returned as a result of the name resolution, and +send traffic to the first one connected successfully.
:: + + # Echo server program + import socket + import sys + + HOST = '' # Symbolic name meaning the local host + PORT = 50007 # Arbitrary non-privileged port + s = None + for res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, socket.AI_PASSIVE): + af, socktype, proto, canonname, sa = res + try: + s = socket.socket(af, socktype, proto) + except socket.error as msg: + s = None + continue + try: + s.bind(sa) + s.listen(1) + except socket.error as msg: + s.close() + s = None + continue + break + if s is None: + print 'could not open socket' + sys.exit(1) + conn, addr = s.accept() + print 'Connected by', addr + while 1: + data = conn.recv(1024) + if not data: break + conn.send(data) + conn.close() + +:: + + # Echo client program + import socket + import sys + + HOST = 'daring.cwi.nl' # The remote host + PORT = 50007 # The same port as used by the server + s = None + for res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC, socket.SOCK_STREAM): + af, socktype, proto, canonname, sa = res + try: + s = socket.socket(af, socktype, proto) + except socket.error as msg: + s = None + continue + try: + s.connect(sa) + except socket.error as msg: + s.close() + s = None + continue + break + if s is None: + print 'could not open socket' + sys.exit(1) + s.send('Hello, world') + data = s.recv(1024) + s.close() + print 'Received', repr(data) + +This example connects to an SSL server, prints the server's and issuer's +distinguished names, sends some bytes, and reads part of the response:: + + import socket + + s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + s.connect(('www.verisign.com', 443)) + + ssl_sock = socket.ssl(s) + + print repr(ssl_sock.server()) + print repr(ssl_sock.issuer()) + + # Send a simple HTTP request -- use httplib in actual code. + ssl_sock.write("""GET / HTTP/1.0\r + Host: www.verisign.com\r\n\r\n""") + + # Read a chunk of data. Will not necessarily + # read all the data returned by the server.
+ data = ssl_sock.read() + + # Note that you need to close the underlying socket, not the SSL object. + del ssl_sock + s.close() + +At this writing, this SSL example prints the following output (line breaks +inserted for readability):: + + '/C=US/ST=California/L=Mountain View/ + O=VeriSign, Inc./OU=Production Services/ + OU=Terms of use at www.verisign.com/rpa (c)00/ + CN=www.verisign.com' + '/O=VeriSign Trust Network/OU=VeriSign, Inc./ + OU=VeriSign International Server CA - Class 3/ + OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign' + diff --git a/Doc/library/socketserver.rst b/Doc/library/socketserver.rst new file mode 100644 index 0000000..96fae6b --- /dev/null +++ b/Doc/library/socketserver.rst @@ -0,0 +1,295 @@ + +:mod:`SocketServer` --- A framework for network servers +======================================================= + +.. module:: SocketServer + :synopsis: A framework for network servers. + + +The :mod:`SocketServer` module simplifies the task of writing network servers. + +There are four basic server classes: :class:`TCPServer` uses the Internet TCP +protocol, which provides for continuous streams of data between the client and +server. :class:`UDPServer` uses datagrams, which are discrete packets of +information that may arrive out of order or be lost while in transit. The less +frequently used :class:`UnixStreamServer` and :class:`UnixDatagramServer` +classes are similar, but use Unix domain sockets; they're not available on +non-Unix platforms. For more details on network programming, consult a book +such as +W. Richard Stevens's UNIX Network Programming or Ralph Davis's Win32 Network +Programming. + +These four classes process requests :dfn:`synchronously`; each request must be +completed before the next request can be started. This isn't suitable if each +request takes a long time to complete, because it requires a lot of computation, +or because it returns a lot of data which the client is slow to process.
The +solution is to create a separate process or thread to handle each request; the +:class:`ForkingMixIn` and :class:`ThreadingMixIn` mix-in classes can be used to +support asynchronous behaviour. + +Creating a server requires several steps. First, you must create a request +handler class by subclassing the :class:`BaseRequestHandler` class and +overriding its :meth:`handle` method; this method will process incoming +requests. Second, you must instantiate one of the server classes, passing it +the server's address and the request handler class. Finally, call the +:meth:`handle_request` or :meth:`serve_forever` method of the server object to +process one or many requests. + +When inheriting from :class:`ThreadingMixIn` for threaded connection behavior, +you should explicitly declare how you want your threads to behave on an abrupt +shutdown. The :class:`ThreadingMixIn` class defines an attribute +*daemon_threads*, which indicates whether or not the server should wait for +thread termination. You should set the flag explicitly if you would like threads +to behave autonomously; the default is :const:`False`, meaning that Python will +not exit until all threads created by :class:`ThreadingMixIn` have exited. 
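As a rough illustration of the steps above, the following sketch defines a request handler, combines :class:`ThreadingMixIn` with :class:`TCPServer`, sets *daemon_threads* explicitly, and serves a single request. The names ``EchoHandler``, ``ThreadedEchoServer`` and ``echo_once`` are hypothetical, and the fallback import assumes the module is spelled ``socketserver`` on newer Python versions.

```python
import socket
import threading

try:
    import SocketServer as socketserver   # module name used in this documentation
except ImportError:
    import socketserver                   # assumption: lower-case name on newer Pythons


class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: echo every chunk back to the client."""

    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:
                break
            self.request.sendall(data)


class ThreadedEchoServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    # Declare the shutdown behaviour explicitly: handler threads are daemons,
    # so they do not keep the interpreter alive on exit.
    daemon_threads = True
    allow_reuse_address = True


def echo_once(payload):
    """Serve one request on a free local port and return the echoed bytes."""
    server = ThreadedEchoServer(('127.0.0.1', 0), EchoHandler)
    port = server.socket.getsockname()[1]
    # handle_request() accepts one connection and hands it to a new thread.
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    chunks = []
    try:
        client.connect(('127.0.0.1', port))
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)   # tell the handler no more data follows
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        client.close()
        worker.join()
        server.server_close()
    return b''.join(chunks)
```

A long-running service would normally call :meth:`serve_forever` instead of :meth:`handle_request`; the single-request form is used here only to keep the sketch short.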
+ +Server classes have the same external methods and attributes, no matter what +network protocol they use: + + +Server Creation Notes +--------------------- + +There are five classes in an inheritance diagram, four of which represent +synchronous servers of four types:: + + +------------+ + | BaseServer | + +------------+ + | + v + +-----------+ +------------------+ + | TCPServer |------->| UnixStreamServer | + +-----------+ +------------------+ + | + v + +-----------+ +--------------------+ + | UDPServer |------->| UnixDatagramServer | + +-----------+ +--------------------+ + +Note that :class:`UnixDatagramServer` derives from :class:`UDPServer`, not from +:class:`UnixStreamServer` --- the only difference between an IP and a Unix +stream server is the address family, which is simply repeated in both Unix +server classes. + +Forking and threading versions of each type of server can be created using the +:class:`ForkingMixIn` and :class:`ThreadingMixIn` mix-in classes. For instance, +a threading UDP server class is created as follows:: + + class ThreadingUDPServer(ThreadingMixIn, UDPServer): pass + +The mix-in class must come first, since it overrides a method defined in +:class:`UDPServer`. Setting the various member variables also changes the +behavior of the underlying server mechanism. + +To implement a service, you must derive a class from :class:`BaseRequestHandler` +and redefine its :meth:`handle` method. You can then run various versions of +the service by combining one of the server classes with your request handler +class. The request handler class must be different for datagram or stream +services. This can be hidden by using the handler subclasses +:class:`StreamRequestHandler` or :class:`DatagramRequestHandler`. + +Of course, you still have to use your head! 
For instance, it makes no sense to +use a forking server if the service contains state in memory that can be +modified by different requests, since the modifications in the child process +would never reach the initial state kept in the parent process and passed to +each child. In this case, you can use a threading server, but you will probably +have to use locks to protect the integrity of the shared data. + +On the other hand, if you are building an HTTP server where all data is stored +externally (for instance, in the file system), a synchronous class will +essentially render the service "deaf" while one request is being handled -- +which may be for a very long time if a client is slow to receive all the data it +has requested. Here a threading or forking server is appropriate. + +In some cases, it may be appropriate to process part of a request synchronously, +but to finish processing in a forked child depending on the request data. This +can be implemented by using a synchronous server and doing an explicit fork in +the request handler class :meth:`handle` method. + +Another approach to handling multiple simultaneous requests in an environment +that supports neither threads nor :func:`fork` (or where these are too expensive +or inappropriate for the service) is to maintain an explicit table of partially +finished requests and to use :func:`select` to decide which request to work on +next (or whether to handle a new incoming request). This is particularly +important for stream services where each client can potentially be connected for +a long time (if threads or subprocesses cannot be used). + +.. % XXX should data and methods be intermingled, or separate? +.. % how should the distinction between class and instance variables be +.. % drawn? + + +Server Objects +-------------- + + +.. function:: fileno() + + Return an integer file descriptor for the socket on which the server is + listening. 
This function is most commonly passed to :func:`select.select`, to + allow monitoring multiple servers in the same process. + + +.. function:: handle_request() + + Process a single request. This function calls the following methods in order: + :meth:`get_request`, :meth:`verify_request`, and :meth:`process_request`. If + the user-provided :meth:`handle` method of the handler class raises an + exception, the server's :meth:`handle_error` method will be called. + + +.. function:: serve_forever() + + Handle an infinite number of requests. This simply calls :meth:`handle_request` + inside an infinite loop. + + +.. data:: address_family + + The family of protocols to which the server's socket belongs. + :const:`socket.AF_INET` and :const:`socket.AF_UNIX` are two possible values. + + +.. data:: RequestHandlerClass + + The user-provided request handler class; an instance of this class is created + for each request. + + +.. data:: server_address + + The address on which the server is listening. The format of addresses varies + depending on the protocol family; see the documentation for the socket module + for details. For Internet protocols, this is a tuple containing a string giving + the address, and an integer port number: ``('127.0.0.1', 80)``, for example. + + +.. data:: socket + + The socket object on which the server will listen for incoming requests. + +The server classes support the following class variables: + +.. % XXX should class variables be covered before instance variables, or +.. % vice versa? + + +.. data:: allow_reuse_address + + Whether the server will allow the reuse of an address. This defaults to + :const:`False`, and can be set in subclasses to change the policy. + + +.. data:: request_queue_size + + The size of the request queue. If it takes a long time to process a single + request, any requests that arrive while the server is busy are placed into a + queue, up to :attr:`request_queue_size` requests. 
Once the queue is full, + further requests from clients will get a "Connection denied" error. The default + value is usually 5, but this can be overridden by subclasses. + + +.. data:: socket_type + + The type of socket used by the server; :const:`socket.SOCK_STREAM` and + :const:`socket.SOCK_DGRAM` are two possible values. + +There are various server methods that can be overridden by subclasses of base +server classes like :class:`TCPServer`; these methods aren't useful to external +users of the server object. + +.. % should the default implementations of these be documented, or should +.. % it be assumed that the user will look at SocketServer.py? + + +.. function:: finish_request() + + Actually processes the request by instantiating :attr:`RequestHandlerClass` and + calling its :meth:`handle` method. + + +.. function:: get_request() + + Must accept a request from the socket, and return a 2-tuple containing the *new* + socket object to be used to communicate with the client, and the client's + address. + + +.. function:: handle_error(request, client_address) + + This function is called if the :attr:`RequestHandlerClass`'s :meth:`handle` + method raises an exception. The default action is to print the traceback to + standard output and continue handling further requests. + + +.. function:: process_request(request, client_address) + + Calls :meth:`finish_request` to create an instance of the + :attr:`RequestHandlerClass`. If desired, this function can create a new process + or thread to handle the request; the :class:`ForkingMixIn` and + :class:`ThreadingMixIn` classes do this. + +.. % Is there any point in documenting the following two functions? +.. % What would the purpose of overriding them be: initializing server +.. % instance variables, adding new network families? + + +.. function:: server_activate() + + Called by the server's constructor to activate the server. The default behavior + just :meth:`listen`\ s to the server's socket. May be overridden. + + +.. 
function:: server_bind() + + Called by the server's constructor to bind the socket to the desired address. + May be overridden. + + +.. function:: verify_request(request, client_address) + + Must return a Boolean value; if the value is :const:`True`, the request will be + processed, and if it's :const:`False`, the request will be denied. This function + can be overridden to implement access controls for a server. The default + implementation always returns :const:`True`. + + +RequestHandler Objects +---------------------- + +The request handler class must define a new :meth:`handle` method, and can +override any of the following methods. A new instance is created for each +request. + + +.. function:: finish() + + Called after the :meth:`handle` method to perform any clean-up actions required. + The default implementation does nothing. If :meth:`setup` or :meth:`handle` + raise an exception, this function will not be called. + + +.. function:: handle() + + This function must do all the work required to service a request. The default + implementation does nothing. Several instance attributes are available to it; + the request is available as :attr:`self.request`; the client address as + :attr:`self.client_address`; and the server instance as :attr:`self.server`, in + case it needs access to per-server information. + + The type of :attr:`self.request` is different for datagram or stream services. + For stream services, :attr:`self.request` is a socket object; for datagram + services, :attr:`self.request` is a string. However, this can be hidden by using + the request handler subclasses :class:`StreamRequestHandler` or + :class:`DatagramRequestHandler`, which override the :meth:`setup` and + :meth:`finish` methods, and provide :attr:`self.rfile` and :attr:`self.wfile` + attributes. :attr:`self.rfile` and :attr:`self.wfile` can be read or written, + respectively, to get the request data or return data to the client. + + +.. 
function:: setup() + + Called before the :meth:`handle` method to perform any initialization actions + required. The default implementation does nothing. + diff --git a/Doc/library/someos.rst b/Doc/library/someos.rst new file mode 100644 index 0000000..5ee96bc --- /dev/null +++ b/Doc/library/someos.rst @@ -0,0 +1,23 @@ + +.. _someos: + +********************************** +Optional Operating System Services +********************************** + +The modules described in this chapter provide interfaces to operating system +features that are available on selected operating systems only. The interfaces +are generally modeled after the Unix or C interfaces but they are available on +some other systems as well (e.g. Windows or NT). Here's an overview: + + +.. toctree:: + + select.rst + thread.rst + threading.rst + dummy_thread.rst + dummy_threading.rst + mmap.rst + readline.rst + rlcompleter.rst diff --git a/Doc/library/spwd.rst b/Doc/library/spwd.rst new file mode 100644 index 0000000..6cbe925 --- /dev/null +++ b/Doc/library/spwd.rst @@ -0,0 +1,74 @@ + +:mod:`spwd` --- The shadow password database +============================================ + +.. module:: spwd + :platform: Unix + :synopsis: The shadow password database (getspnam() and friends). + + +.. versionadded:: 2.5 + +This module provides access to the Unix shadow password database. It is +available on various Unix versions. + +You must have enough privileges to access the shadow password database (this +usually means you have to be root). 
+ +Shadow password database entries are reported as a tuple-like object, whose +attributes correspond to the members of the ``spwd`` structure (Attribute field +below, see ``<shadow.h>``): + ++-------+---------------+---------------------------------+ +| Index | Attribute | Meaning | ++=======+===============+=================================+ +| 0 | ``sp_nam`` | Login name | ++-------+---------------+---------------------------------+ +| 1 | ``sp_pwd`` | Encrypted password | ++-------+---------------+---------------------------------+ +| 2 | ``sp_lstchg`` | Date of last change | ++-------+---------------+---------------------------------+ +| 3 | ``sp_min`` | Minimal number of days between | +| | | changes | ++-------+---------------+---------------------------------+ +| 4 | ``sp_max`` | Maximum number of days between | +| | | changes | ++-------+---------------+---------------------------------+ +| 5 | ``sp_warn`` | Number of days before password | +| | | expires to warn user about it | ++-------+---------------+---------------------------------+ +| 6 | ``sp_inact`` | Number of days after password | +| | | expires until account is | +| | | blocked | ++-------+---------------+---------------------------------+ +| 7 | ``sp_expire`` | Number of days since 1970-01-01 | +| | | until account is disabled | ++-------+---------------+---------------------------------+ +| 8 | ``sp_flag`` | Reserved | ++-------+---------------+---------------------------------+ + +The sp_nam and sp_pwd items are strings, all others are integers. +:exc:`KeyError` is raised if the entry asked for cannot be found. + +It defines the following items: + + +.. function:: getspnam(name) + + Return the shadow password database entry for the given user name. + + +.. function:: getspall() + + Return a list of all available shadow password database entries, in arbitrary + order. + + +.. seealso:: + + Module :mod:`grp` + An interface to the group database, similar to this. 
+ + Module :mod:`pwd` + An interface to the normal password database, similar to this. + diff --git a/Doc/library/sqlite3.rst b/Doc/library/sqlite3.rst new file mode 100644 index 0000000..707092b --- /dev/null +++ b/Doc/library/sqlite3.rst @@ -0,0 +1,689 @@ + +:mod:`sqlite3` --- DB-API 2.0 interface for SQLite databases +============================================================ + +.. module:: sqlite3 + :synopsis: A DB-API 2.0 implementation using SQLite 3.x. +.. sectionauthor:: Gerhard Häring <gh@ghaering.de> + + +.. versionadded:: 2.5 + +SQLite is a C library that provides a lightweight disk-based database that +doesn't require a separate server process and allows accessing the database +using a nonstandard variant of the SQL query language. Some applications can use +SQLite for internal data storage. It's also possible to prototype an +application using SQLite and then port the code to a larger database such as +PostgreSQL or Oracle. + +pysqlite was written by Gerhard Häring and provides a SQL interface compliant +with the DB-API 2.0 specification described by :pep:`249`. + +To use the module, you must first create a :class:`Connection` object that +represents the database. Here the data will be stored in the +:file:`/tmp/example` file:: + + conn = sqlite3.connect('/tmp/example') + +You can also supply the special name ``:memory:`` to create a database in RAM. + +Once you have a :class:`Connection`, you can create a :class:`Cursor` object +and call its :meth:`execute` method to perform SQL commands:: + + c = conn.cursor() + + # Create table + c.execute('''create table stocks + (date text, trans text, symbol text, + qty real, price real)''') + + # Insert a row of data + c.execute("""insert into stocks + values ('2006-01-05','BUY','RHAT',100,35.14)""") + + # Save (commit) the changes + conn.commit() + + # We can also close the cursor if we are done with it + c.close() + +Usually your SQL operations will need to use values from Python variables. 
You +shouldn't assemble your query using Python's string operations because doing so +is insecure; it makes your program vulnerable to an SQL injection attack. + +Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder +wherever you want to use a value, and then provide a tuple of values as the +second argument to the cursor's :meth:`execute` method. (Other database modules +may use a different placeholder, such as ``%s`` or ``:1``.) For example:: + + # Never do this -- insecure! + symbol = 'IBM' + c.execute("... where symbol = '%s'" % symbol) + + # Do this instead + t = (symbol,) + c.execute('select * from stocks where symbol=?', t) + + # Larger example + for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00), + ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00), + ('2006-04-06', 'SELL', 'IBM', 500, 53.00), + ): + c.execute('insert into stocks values (?,?,?,?,?)', t) + +To retrieve data after executing a SELECT statement, you can either treat the +cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a +single matching row, or call :meth:`fetchall` to get a list of the matching +rows. + +This example uses the iterator form:: + + >>> c = conn.cursor() + >>> c.execute('select * from stocks order by price') + >>> for row in c: + ... print row + ... + (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001) + (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0) + (u'2006-04-06', u'SELL', u'IBM', 500, 53.0) + (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0) + >>> + + +.. seealso:: + + http://www.pysqlite.org + The pysqlite web page. + + http://www.sqlite.org + The SQLite web page; the documentation describes the syntax and the available + data types for the supported SQL dialect. + + :pep:`249` - Database API Specification 2.0 + PEP written by Marc-André Lemburg. + + +.. _sqlite3-module-contents: + +Module functions and constants +------------------------------ + + +.. 
data:: PARSE_DECLTYPES + + This constant is meant to be used with the *detect_types* parameter of the + :func:`connect` function. + + Setting it makes the :mod:`sqlite3` module parse the declared type for each + column it returns. It will parse out the first word of the declared type, i. e. + for "integer primary key", it will parse out "integer". Then for that column, it + will look into the converters dictionary and use the converter function + registered for that type there. Converter names are case-sensitive! + + +.. data:: PARSE_COLNAMES + + This constant is meant to be used with the *detect_types* parameter of the + :func:`connect` function. + + Setting this makes the SQLite interface parse the column name for each column it + returns. It will look for a string formed [mytype] in there, and then decide + that 'mytype' is the type of the column. It will try to find an entry of + 'mytype' in the converters dictionary and then use the converter function found + there to return the value. The column name found in :attr:`cursor.description` + is only the first word of the column name, i. e. if you use something like + ``'as "x [datetime]"'`` in your SQL, then we will parse out everything until the + first blank for the column name: the column name would simply be "x". + + +.. function:: connect(database[, timeout, isolation_level, detect_types, factory]) + + Opens a connection to the SQLite database file *database*. You can use + ``":memory:"`` to open a database connection to a database that resides in RAM + instead of on disk. + + When a database is accessed by multiple connections, and one of the processes + modifies the database, the SQLite database is locked until that transaction is + committed. The *timeout* parameter specifies how long the connection should wait + for the lock to go away until raising an exception. The default for the timeout + parameter is 5.0 (five seconds). 
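   The sketch below shows one way *timeout* and the two type-detection constants described above might be passed to :func:`connect`. The ``point`` type, the ``convert_point`` function and the ``shapes`` table are hypothetical names invented for illustration, and the ten-second timeout is an arbitrary choice.

```python
import sqlite3


def convert_point(raw):
    # Hypothetical converter: the database hands over a bytestring such as
    # "4.0;-3.2", which is turned into a tuple of two floats.
    x, y = raw.decode("ascii").split(";")
    return (float(x), float(y))


# Converter names are matched against the declared type or the [type] tag.
sqlite3.register_converter("point", convert_point)

conn = sqlite3.connect(
    ":memory:",
    timeout=10.0,  # wait up to ten seconds for a lock to go away
    detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES,
)
cur = conn.cursor()

# PARSE_DECLTYPES: the first word of the declared column type is "point".
cur.execute("create table shapes (p point)")
cur.execute("insert into shapes (p) values (?)", ("4.0;-3.2",))
cur.execute("select p from shapes")
by_decltype = cur.fetchone()[0]

# PARSE_COLNAMES: the type is taken from the [point] tag in the column name.
cur.execute("select '1.5;2.5' as \"q [point]\"")
by_colname = cur.fetchone()[0]

conn.close()
```

   Both queries return tuples rather than raw text because a converter registered under the name ``point`` is found in each case.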
+ + For the *isolation_level* parameter, please see the + :attr:`Connection.isolation_level` property of :class:`Connection` objects. + + SQLite natively supports only the types TEXT, INTEGER, REAL, BLOB and NULL. If + you want to use other types you must add support for them yourself. The + *detect_types* parameter and custom **converters** registered with the + module-level :func:`register_converter` function allow you to easily do that. + + *detect_types* defaults to 0 (i.e. off, no type detection); you can set it to + any combination of :const:`PARSE_DECLTYPES` and :const:`PARSE_COLNAMES` to turn + type detection on. + + By default, the :mod:`sqlite3` module uses its :class:`Connection` class for the + connect call. You can, however, subclass the :class:`Connection` class and make + :func:`connect` use your class instead by providing your class for the *factory* + parameter. + + Consult the section :ref:`sqlite3-types` of this manual for details. + + The :mod:`sqlite3` module internally uses a statement cache to avoid SQL parsing + overhead. If you want to explicitly set the number of statements that are cached + for the connection, you can set the *cached_statements* parameter. The currently + implemented default is to cache 100 statements. + + +.. function:: register_converter(typename, callable) + + Registers a callable to convert a bytestring from the database into a custom + Python type. The callable will be invoked for all database values that are of + the type *typename*. See the *detect_types* parameter of the :func:`connect` + function for how the type detection works. Note that the case of *typename* and + the name of the type in your query must match! + + +.. function:: register_adapter(type, callable) + + Registers a callable to convert the custom Python type *type* into one of + SQLite's supported types.
The callable *callable* accepts the Python value as its single parameter, + and must return a value of one of the following types: int, long, + float, str (UTF-8 encoded), unicode or buffer. + + +.. function:: complete_statement(sql) + + Returns :const:`True` if the string *sql* contains one or more complete SQL + statements terminated by semicolons. It does not verify that the SQL is + syntactically correct, only that there are no unclosed string literals and the + statement is terminated by a semicolon. + + This can be used to build a shell for SQLite, as in the following example: + + + .. literalinclude:: ../includes/sqlite3/complete_statement.py + + +.. function:: enable_callback_tracebacks(flag) + + By default you will not get any tracebacks in user-defined functions, + aggregates, converters, authorizer callbacks etc. If you want to debug them, you + can call this function with *flag* set to :const:`True`. Afterwards, you will get tracebacks + from callbacks on ``sys.stderr``. Use :const:`False` to disable the feature + again. + + +.. _sqlite3-connection-objects: + +Connection Objects +------------------ + +A :class:`Connection` instance has the following attributes and methods: + +.. attribute:: Connection.isolation_level + + Get or set the current isolation level. :const:`None` for autocommit mode or one of + "DEFERRED", "IMMEDIATE" or "EXCLUSIVE". See section + :ref:`sqlite3-controlling-transactions` for a more detailed explanation. + + +.. method:: Connection.cursor([cursorClass]) + + The cursor method accepts a single optional parameter *cursorClass*. If + supplied, this must be a custom cursor class that extends + :class:`sqlite3.Cursor`. + + +.. method:: Connection.execute(sql, [parameters]) + + This is a nonstandard shortcut that creates an intermediate cursor object by + calling the cursor method, then calls the cursor's :meth:`execute` method with + the parameters given. + + +..
method:: Connection.executemany(sql, [parameters]) + + This is a nonstandard shortcut that creates an intermediate cursor object by + calling the cursor method, then calls the cursor's :meth:`executemany` method + with the parameters given. + + +.. method:: Connection.executescript(sql_script) + + This is a nonstandard shortcut that creates an intermediate cursor object by + calling the cursor method, then calls the cursor's :meth:`executescript` method + with the parameters given. + + +.. method:: Connection.create_function(name, num_params, func) + + Creates a user-defined function that you can later use from within SQL + statements under the function name *name*. *num_params* is the number of + parameters the function accepts, and *func* is a Python callable that is called + as the SQL function. + + The function can return any of the types supported by SQLite: unicode, str, int, + long, float, buffer and None. + + Example: + + .. literalinclude:: ../includes/sqlite3/md5func.py + + +.. method:: Connection.create_aggregate(name, num_params, aggregate_class) + + Creates a user-defined aggregate function. + + The aggregate class must implement a ``step`` method, which accepts the number + of parameters *num_params*, and a ``finalize`` method which will return the + final result of the aggregate. + + The ``finalize`` method can return any of the types supported by SQLite: + unicode, str, int, long, float, buffer and None. + + Example: + + .. literalinclude:: ../includes/sqlite3/mysumaggr.py + + +.. method:: Connection.create_collation(name, callable) + + Creates a collation with the specified *name* and *callable*. The callable will + be passed two string arguments. It should return -1 if the first is ordered + lower than the second, 0 if they are ordered equal and 1 if the first is ordered + higher than the second. Note that this controls sorting (ORDER BY in SQL) so + your comparisons don't affect other SQL operations. 
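   To make the return-value convention above concrete, here is a minimal sketch of a case-insensitive collation. The collation name ``nocase_py``, the function ``collate_nocase`` and the ``words`` table are hypothetical; the comparison uses only ``lower()`` so that it should behave the same whether the callable receives bytestrings or Unicode strings.

```python
import sqlite3


def collate_nocase(a, b):
    # Compare the lower-cased values and return -1, 0 or 1.
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)


con = sqlite3.connect(":memory:")
con.create_collation("nocase_py", collate_nocase)
con.execute("create table words (w text)")
con.executemany("insert into words (w) values (?)",
                [("cherry",), ("Banana",), ("apple",)])

# The collation is applied only where the query asks for it.
cur = con.execute("select w from words order by w collate nocase_py")
ordered = [row[0] for row in cur]
con.close()
```

   With the default binary ordering, ``Banana`` would sort before ``apple`` because upper-case letters compare lower; with the custom collation the rows come back as ``apple``, ``Banana``, ``cherry``.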
+ + Note that the callable will get its parameters as Python bytestrings, which will + normally be encoded in UTF-8. + + The following example shows a custom collation that sorts "the wrong way": + + .. literalinclude:: ../includes/sqlite3/collation_reverse.py + + To remove a collation, call ``create_collation`` with None as callable:: + + con.create_collation("reverse", None) + + +.. method:: Connection.interrupt() + + You can call this method from a different thread to abort any queries that might + be executing on the connection. The query will then abort and the caller will + get an exception. + + +.. method:: Connection.set_authorizer(authorizer_callback) + + This routine registers a callback. The callback is invoked for each attempt to + access a column of a table in the database. The callback should return + :const:`SQLITE_OK` if access is allowed, :const:`SQLITE_DENY` if the entire SQL + statement should be aborted with an error and :const:`SQLITE_IGNORE` if the + column should be treated as a NULL value. These constants are available in the + :mod:`sqlite3` module. + + The first argument to the callback signifies what kind of operation is to be + authorized. The second and third argument will be arguments or :const:`None` + depending on the first argument. The 4th argument is the name of the database + ("main", "temp", etc.) if applicable. The 5th argument is the name of the + inner-most trigger or view that is responsible for the access attempt or + :const:`None` if this access attempt is directly from input SQL code. + + Please consult the SQLite documentation about the possible values for the first + argument and the meaning of the second and third argument depending on the first + one. All necessary constants are available in the :mod:`sqlite3` module. + + +.. attribute:: Connection.row_factory + + You can change this attribute to a callable that accepts the cursor and the + original row as a tuple and will return the real result row. 
This way, you can + implement more advanced ways of returning results, such as returning an object + that can also access columns by name. + + Example: + + .. literalinclude:: ../includes/sqlite3/row_factory.py + + If returning a tuple doesn't suffice and you want name-based access to + columns, you should consider setting :attr:`row_factory` to the + highly-optimized :class:`sqlite3.Row` type. :class:`Row` provides both + index-based and case-insensitive name-based access to columns with almost no + memory overhead. It will probably be better than your own custom + dictionary-based approach or even a db_row based solution. + + .. % XXX what's a db_row-based solution? + + +.. attribute:: Connection.text_factory + + Using this attribute you can control what objects are returned for the TEXT data + type. By default, this attribute is set to :class:`unicode` and the + :mod:`sqlite3` module will return Unicode objects for TEXT. If you want to + return bytestrings instead, you can set it to :class:`str`. + + For efficiency reasons, there's also a way to return Unicode objects only for + non-ASCII data, and bytestrings otherwise. To activate it, set this attribute to + :const:`sqlite3.OptimizedUnicode`. + + You can also set it to any other callable that accepts a single bytestring + parameter and returns the resulting object. + + See the following example code for illustration: + + .. literalinclude:: ../includes/sqlite3/text_factory.py + + +.. attribute:: Connection.total_changes + + Returns the total number of database rows that have been modified, inserted, or + deleted since the database connection was opened. + + +.. _sqlite3-cursor-objects: + +Cursor Objects +-------------- + +A :class:`Cursor` instance has the following attributes and methods: + + +.. method:: Cursor.execute(sql, [parameters]) + + Executes a SQL statement. The SQL statement may be parametrized (i. e. + placeholders instead of SQL literals). 
The :mod:`sqlite3` module supports two + kinds of placeholders: question marks (qmark style) and named placeholders + (named style). + + This example shows how to use parameters with qmark style: + + .. literalinclude:: ../includes/sqlite3/execute_1.py + + This example shows how to use the named style: + + .. literalinclude:: ../includes/sqlite3/execute_2.py + + :meth:`execute` will only execute a single SQL statement. If you try to execute + more than one statement with it, it will raise a Warning. Use + :meth:`executescript` if you want to execute multiple SQL statements with one + call. + + +.. method:: Cursor.executemany(sql, seq_of_parameters) + + Executes a SQL command against all parameter sequences or mappings found in the + sequence *seq_of_parameters*. The :mod:`sqlite3` module also allows using an iterator yielding + parameters instead of a sequence. + + .. literalinclude:: ../includes/sqlite3/executemany_1.py + + Here's a shorter example using a generator: + + .. literalinclude:: ../includes/sqlite3/executemany_2.py + + +.. method:: Cursor.executescript(sql_script) + + This is a nonstandard convenience method for executing multiple SQL statements + at once. It issues a COMMIT statement first, then executes the SQL script it + gets as a parameter. + + *sql_script* can be a bytestring or a Unicode string. + + Example: + + .. literalinclude:: ../includes/sqlite3/executescript.py + + +.. attribute:: Cursor.rowcount + + Although the :class:`Cursor` class of the :mod:`sqlite3` module implements this + attribute, the database engine's own support for the determination of "rows + affected"/"rows selected" is quirky. + + For ``SELECT`` statements, :attr:`rowcount` is always None because we cannot + determine the number of rows a query produced until all rows were fetched. + + For ``DELETE`` statements, SQLite reports :attr:`rowcount` as 0 if you make a + ``DELETE FROM table`` without any condition.
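As a rough illustration of :meth:`executemany` fed by a generator, and of how :attr:`rowcount` reports data-modifying statements, the following sketch may help. It assumes a current Python 3 interpreter and an in-memory database; the table name ``characters`` and the helper ``letters`` are made up for this example and are not part of the module.

```python
import sqlite3

def letters():
    # Hypothetical helper: yields one parameter tuple per row ("a" .. "d").
    for code in range(ord("a"), ord("e")):
        yield (chr(code),)

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table characters(c)")

# executemany() accepts any iterable of parameter sequences, including a generator.
cur.executemany("insert into characters(c) values (?)", letters())
inserted = cur.rowcount  # modifications of all executed inserts, summed up

# A single UPDATE reports the number of rows it changed.
cur.execute("update characters set c = upper(c) where c in (?, ?)", ("a", "b"))
updated = cur.rowcount

con.close()
```

The values for ``SELECT`` and for an unconditional ``DELETE`` depend on the Python and SQLite versions in use, so the sketch deliberately does not rely on them.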
+ + For :meth:`executemany` statements, the number of modifications are summed up + into :attr:`rowcount`. + + As required by the Python DB API Spec, the :attr:`rowcount` attribute "is -1 in + case no executeXX() has been performed on the cursor or the rowcount of the last + operation is not determinable by the interface". + + +.. _sqlite3-types: + +SQLite and Python types +----------------------- + + +Introduction +^^^^^^^^^^^^ + +SQLite natively supports the following types: NULL, INTEGER, REAL, TEXT, BLOB. + +The following Python types can thus be sent to SQLite without any problem: + ++------------------------+-------------+ +| Python type | SQLite type | ++========================+=============+ +| ``None`` | NULL | ++------------------------+-------------+ +| ``int`` | INTEGER | ++------------------------+-------------+ +| ``long`` | INTEGER | ++------------------------+-------------+ +| ``float`` | REAL | ++------------------------+-------------+ +| ``str (UTF8-encoded)`` | TEXT | ++------------------------+-------------+ +| ``unicode`` | TEXT | ++------------------------+-------------+ +| ``buffer`` | BLOB | ++------------------------+-------------+ + +This is how SQLite types are converted to Python types by default: + ++-------------+---------------------------------------------+ +| SQLite type | Python type | ++=============+=============================================+ +| ``NULL`` | None | ++-------------+---------------------------------------------+ +| ``INTEGER`` | int or long, depending on size | ++-------------+---------------------------------------------+ +| ``REAL`` | float | ++-------------+---------------------------------------------+ +| ``TEXT`` | depends on text_factory, unicode by default | ++-------------+---------------------------------------------+ +| ``BLOB`` | buffer | ++-------------+---------------------------------------------+ + +The type system of the :mod:`sqlite3` module is extensible in two ways: you can +store additional 
Python types in a SQLite database via object adaptation, and +you can let the :mod:`sqlite3` module convert SQLite types to different Python +types via converters. + + +Using adapters to store additional Python types in SQLite databases +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +As described before, SQLite supports only a limited set of types natively. To +use other Python types with SQLite, you must **adapt** them to one of the +sqlite3 module's supported types for SQLite: one of NoneType, int, long, float, +str, unicode, buffer. + +The :mod:`sqlite3` module uses Python object adaptation, as described in +:pep:`246` for this. The protocol to use is :class:`PrepareProtocol`. + +There are two ways to enable the :mod:`sqlite3` module to adapt a custom Python +type to one of the supported ones. + + +Letting your object adapt itself +"""""""""""""""""""""""""""""""" + +This is a good approach if you write the class yourself. Let's suppose you have +a class like this:: + + class Point(object): + def __init__(self, x, y): + self.x, self.y = x, y + +Now you want to store the point in a single SQLite column. First you'll have to +choose one of the supported types first to be used for representing the point. +Let's just use str and separate the coordinates using a semicolon. Then you need +to give your class a method ``__conform__(self, protocol)`` which must return +the converted value. The parameter *protocol* will be :class:`PrepareProtocol`. + +.. literalinclude:: ../includes/sqlite3/adapter_point_1.py + + +Registering an adapter callable +""""""""""""""""""""""""""""""" + +The other possibility is to create a function that converts the type to the +string representation and register the function with :meth:`register_adapter`. + +.. note:: + + The type/class to adapt must be a new-style class, i. e. it must have + :class:`object` as one of its bases. + +.. 
literalinclude:: ../includes/sqlite3/adapter_point_2.py + +The :mod:`sqlite3` module has two default adapters for Python's built-in +:class:`datetime.date` and :class:`datetime.datetime` types. Now let's suppose +we want to store :class:`datetime.datetime` objects not in ISO representation, +but as a Unix timestamp. + +.. literalinclude:: ../includes/sqlite3/adapter_datetime.py + + +Converting SQLite values to custom Python types +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Writing an adapter lets you send custom Python types to SQLite. But to make it +really useful we need to make the Python to SQLite to Python roundtrip work. + +Enter converters. + +Let's go back to the :class:`Point` class. We stored the x and y coordinates +separated via semicolons as strings in SQLite. + +First, we'll define a converter function that accepts the string as a parameter +and constructs a :class:`Point` object from it. + +.. note:: + + Converter functions **always** get called with a string, no matter under which + data type you sent the value to SQLite. + +.. note:: + + Converter names are looked up in a case-sensitive manner. + +:: + + def convert_point(s): + x, y = map(float, s.split(";")) + return Point(x, y) + +Now you need to make the :mod:`sqlite3` module know that what you select from +the database is actually a point. There are two ways of doing this: + +* Implicitly via the declared type + +* Explicitly via the column name + +Both ways are described in section :ref:`sqlite3-module-contents`, in the entries +for the constants :const:`PARSE_DECLTYPES` and :const:`PARSE_COLNAMES`. + +The following example illustrates both approaches. + +.. literalinclude:: ../includes/sqlite3/converter_point.py + + +Default adapters and converters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There are default adapters for the date and datetime types in the datetime +module. They will be sent as ISO dates/ISO timestamps to SQLite. 
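To make the "ISO dates" wording concrete, here is a minimal sketch that registers an explicit adapter and converter for :class:`datetime.date` instead of relying on the defaults. It assumes a current Python 3 interpreter, where converters receive a ``bytes`` object; the function names ``adapt_date_iso`` and ``convert_date`` and the table ``events`` are illustrative only. Registering a converter under the name "date" replaces whatever converter was previously registered under that name.

```python
import datetime
import sqlite3

def adapt_date_iso(value):
    # Hypothetical adapter: store a date as ISO 8601 text, e.g. "2007-08-15".
    return value.isoformat()

def convert_date(raw):
    # Hypothetical converter: rebuild the date from the stored bytes.
    return datetime.date.fromisoformat(raw.decode())

sqlite3.register_adapter(datetime.date, adapt_date_iso)
sqlite3.register_converter("date", convert_date)

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("create table events(d date)")
con.execute("insert into events(d) values (?)", (datetime.date(2007, 8, 15),))

# The CAST expression has no declared type, so no converter runs
# and the raw stored text comes back.
stored_text = con.execute("select cast(d as text) from events").fetchone()[0]

# The plain column has the declared type "date", so the converter runs.
restored = con.execute("select d from events").fetchone()[0]
con.close()
```

Writing the pair explicitly also shows what the default adapter and converter are expected to do: turn the Python object into text on the way in, and parse the text again on the way out.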
+ +The default converters are registered under the name "date" for +:class:`datetime.date` and under the name "timestamp" for +:class:`datetime.datetime`. + +This way, you can use date/timestamps from Python without any additional +fiddling in most cases. The format of the adapters is also compatible with the +experimental SQLite date/time functions. + +The following example demonstrates this. + +.. literalinclude:: ../includes/sqlite3/pysqlite_datetime.py + + +.. _sqlite3-controlling-transactions: + +Controlling Transactions +------------------------ + +By default, the :mod:`sqlite3` module opens transactions implicitly before a +Data Modification Language (DML) statement (i.e. INSERT/UPDATE/DELETE/REPLACE), +and commits transactions implicitly before a non-DML, non-query statement (i. e. +anything other than SELECT/INSERT/UPDATE/DELETE/REPLACE). + +So if you are within a transaction and issue a command like ``CREATE TABLE +...``, ``VACUUM``, ``PRAGMA``, the :mod:`sqlite3` module will commit implicitly +before executing that command. There are two reasons for doing that. The first +is that some of these commands don't work within transactions. The other reason +is that pysqlite needs to keep track of the transaction state (if a transaction +is active or not). + +You can control which kind of "BEGIN" statements pysqlite implicitly executes +(or none at all) via the *isolation_level* parameter to the :func:`connect` +call, or via the :attr:`isolation_level` property of connections. + +If you want **autocommit mode**, then set :attr:`isolation_level` to None. + +Otherwise leave it at its default, which will result in a plain "BEGIN" +statement, or set it to one of SQLite's supported isolation levels: DEFERRED, +IMMEDIATE or EXCLUSIVE. + +As the :mod:`sqlite3` module needs to keep track of the transaction state, you +should not use ``OR ROLLBACK`` or ``ON CONFLICT ROLLBACK`` in your SQL. 
Instead, +catch the :exc:`IntegrityError` and call the :meth:`rollback` method of the +connection yourself. + + +Using pysqlite efficiently +-------------------------- + + +Using shortcut methods +^^^^^^^^^^^^^^^^^^^^^^ + +Using the nonstandard :meth:`execute`, :meth:`executemany` and +:meth:`executescript` methods of the :class:`Connection` object, your code can +be written more concisely because you don't have to create the (often +superfluous) :class:`Cursor` objects explicitly. Instead, the :class:`Cursor` +objects are created implicitly and these shortcut methods return the cursor +objects. This way, you can execute a SELECT statement and iterate over it +directly using only a single call on the :class:`Connection` object. + +.. literalinclude:: ../includes/sqlite3/shortcut_methods.py + + +Accessing columns by name instead of by index +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +One useful feature of the :mod:`sqlite3` module is the builtin +:class:`sqlite3.Row` class designed to be used as a row factory. + +Rows wrapped with this class can be accessed both by index (like tuples) and +case-insensitively by name: + +.. literalinclude:: ../includes/sqlite3/rowclass.py + diff --git a/Doc/library/stat.rst b/Doc/library/stat.rst new file mode 100644 index 0000000..430bb23 --- /dev/null +++ b/Doc/library/stat.rst @@ -0,0 +1,167 @@ + +:mod:`stat` --- Interpreting :func:`stat` results +================================================= + +.. module:: stat + :synopsis: Utilities for interpreting the results of os.stat(), os.lstat() and os.fstat(). +.. sectionauthor:: Skip Montanaro <skip@automatrix.com> + + +The :mod:`stat` module defines constants and functions for interpreting the +results of :func:`os.stat`, :func:`os.fstat` and :func:`os.lstat` (if they +exist). For complete details about the :cfunc:`stat`, :cfunc:`fstat` and +:cfunc:`lstat` calls, consult the documentation for your system. 
+ +The :mod:`stat` module defines the following functions to test for specific file +types: + + +.. function:: S_ISDIR(mode) + + Return non-zero if the mode is from a directory. + + +.. function:: S_ISCHR(mode) + + Return non-zero if the mode is from a character special device file. + + +.. function:: S_ISBLK(mode) + + Return non-zero if the mode is from a block special device file. + + +.. function:: S_ISREG(mode) + + Return non-zero if the mode is from a regular file. + + +.. function:: S_ISFIFO(mode) + + Return non-zero if the mode is from a FIFO (named pipe). + + +.. function:: S_ISLNK(mode) + + Return non-zero if the mode is from a symbolic link. + + +.. function:: S_ISSOCK(mode) + + Return non-zero if the mode is from a socket. + +Two additional functions are defined for more general manipulation of the file's +mode: + + +.. function:: S_IMODE(mode) + + Return the portion of the file's mode that can be set by :func:`os.chmod`\ + ---that is, the file's permission bits, plus the sticky bit, set-group-id, and + set-user-id bits (on systems that support them). + + +.. function:: S_IFMT(mode) + + Return the portion of the file's mode that describes the file type (used by the + :func:`S_IS\*` functions above). + +Normally, you would use the :func:`os.path.is\*` functions for testing the type +of a file; the functions here are useful when you are doing multiple tests of +the same file and wish to avoid the overhead of the :cfunc:`stat` system call +for each test. These are also useful when checking for information about a file +that isn't handled by :mod:`os.path`, like the tests for block and character +devices. + +All the variables below are simply symbolic indexes into the 10-tuple returned +by :func:`os.stat`, :func:`os.fstat` or :func:`os.lstat`. + + +.. data:: ST_MODE + + Inode protection mode. + + +.. data:: ST_INO + + Inode number. + + +.. data:: ST_DEV + + Device inode resides on. + + +.. data:: ST_NLINK + + Number of links to the inode. + + +.. 
data:: ST_UID + + User id of the owner. + + +.. data:: ST_GID + + Group id of the owner. + + +.. data:: ST_SIZE + + Size in bytes of a plain file; amount of data waiting on some special files. + + +.. data:: ST_ATIME + + Time of last access. + + +.. data:: ST_MTIME + + Time of last modification. + + +.. data:: ST_CTIME + + The "ctime" as reported by the operating system. On some systems (like Unix) it is + the time of the last metadata change, and on others (like Windows) it is the + creation time (see platform documentation for details). + +The interpretation of "file size" changes according to the file type. For plain +files this is the size of the file in bytes. For FIFOs and sockets under most +flavors of Unix (including Linux in particular), the "size" is the number of +bytes waiting to be read at the time of the call to :func:`os.stat`, +:func:`os.fstat`, or :func:`os.lstat`; this can sometimes be useful, especially +for polling one of these special files after a non-blocking open. The meaning +of the size field for other character and block devices varies more, depending +on the implementation of the underlying system call.
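The following is a small sketch, written for a current Python 3 interpreter, of how the index constants and the mode helpers described above fit together. It calls :func:`os.stat` once per path and then runs several tests on the cached result. The temporary directory and the file name ``data.bin`` are assumptions made only for this example.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as top:
    path = os.path.join(top, "data.bin")
    with open(path, "wb") as f:
        f.write(b"12345")

    # One stat call per path; every test below reuses the result.
    file_result = os.stat(path)
    dir_result = os.stat(top)

    # ST_MODE and ST_SIZE are plain indexes into the result tuple.
    file_mode = file_result[stat.ST_MODE]
    dir_mode = dir_result[stat.ST_MODE]
    file_size = file_result[stat.ST_SIZE]

    file_is_regular = stat.S_ISREG(file_mode)
    dir_is_directory = stat.S_ISDIR(dir_mode)

    # Split the mode into the chmod-settable bits and the file-type bits.
    permission_bits = stat.S_IMODE(file_mode)
    type_bits = stat.S_IFMT(file_mode)
```

Reusing ``file_mode`` for several tests is the situation the text has in mind when it mentions avoiding repeated :cfunc:`stat` system calls.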
+ +Example:: + + import os, sys + from stat import * + + def walktree(top, callback): + '''recursively descend the directory tree rooted at top, + calling the callback function for each regular file''' + + for f in os.listdir(top): + pathname = os.path.join(top, f) + mode = os.stat(pathname)[ST_MODE] + if S_ISDIR(mode): + # It's a directory, recurse into it + walktree(pathname, callback) + elif S_ISREG(mode): + # It's a file, call the callback function + callback(pathname) + else: + # Unknown file type, print a message + print('Skipping %s' % pathname) + + def visitfile(file): + print('visiting', file) + + if __name__ == '__main__': + walktree(sys.argv[1], visitfile) + diff --git a/Doc/library/statvfs.rst b/Doc/library/statvfs.rst new file mode 100644 index 0000000..6ec7c38 --- /dev/null +++ b/Doc/library/statvfs.rst @@ -0,0 +1,67 @@ + +:mod:`statvfs` --- Constants used with :func:`os.statvfs` +========================================================= + +.. module:: statvfs + :synopsis: Constants for interpreting the result of os.statvfs(). +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +.. % LaTeX'ed from comments in module + +The :mod:`statvfs` module defines constants so that the result of +:func:`os.statvfs`, which returns a tuple, can be interpreted without remembering +"magic numbers." Each of the constants defined in this module is the *index* of +the entry in the tuple returned by :func:`os.statvfs` that contains the +specified information. + + +.. data:: F_BSIZE + + Preferred file system block size. + + +.. data:: F_FRSIZE + + Fundamental file system block size. + + +.. data:: F_BLOCKS + + Total number of blocks in the filesystem. + + +.. data:: F_BFREE + + Total number of free blocks. + + +.. data:: F_BAVAIL + + Free blocks available to non-super user. + + +.. data:: F_FILES + + Total number of file nodes. + + +.. data:: F_FFREE + + Total number of free file nodes. + + +.. data:: F_FAVAIL + + Free nodes available to non-super user. + + +..
data:: F_FLAG + + Flags. System dependent: see :cfunc:`statvfs` man page. + + +.. data:: F_NAMEMAX + + Maximum file name length. + diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst new file mode 100644 index 0000000..34c943c --- /dev/null +++ b/Doc/library/stdtypes.rst @@ -0,0 +1,2409 @@ +.. XXX: reference/datamodel and this have quite a few overlaps! + + +.. _bltin-types: + +************** +Built-in Types +************** + +The following sections describe the standard types that are built into the +interpreter. + +.. note:: + + Historically (until release 2.2), Python's built-in types have differed from + user-defined types because it was not possible to use the built-in types as the + basis for object-oriented inheritance. This limitation no longer + exists. + +.. index:: pair: built-in; types + +The principal built-in types are numerics, sequences, mappings, files, classes, +instances and exceptions. + +.. index:: statement: print + +Some operations are supported by several object types; in particular, +practically all objects can be compared, tested for truth value, and converted +to a string (with the :func:`repr` function or the slightly different +:func:`str` function). The latter function is implicitly used when an object is +written by the :func:`print` function. + + +.. _truth: + +Truth Value Testing +=================== + +.. index:: + statement: if + statement: while + pair: truth; value + pair: Boolean; operations + single: false + +Any object can be tested for truth value, for use in an :keyword:`if` or +:keyword:`while` condition or as operand of the Boolean operations below. The +following values are considered false: + + .. index:: single: None (Built-in object) + +* ``None`` + + .. index:: single: False (Built-in object) + +* ``False`` + +* zero of any numeric type, for example, ``0``, ``0L``, ``0.0``, ``0j``. + +* any empty sequence, for example, ``''``, ``()``, ``[]``. + +* any empty mapping, for example, ``{}``. 
+ +* instances of user-defined classes, if the class defines a :meth:`__bool__` or + :meth:`__len__` method, when that method returns the integer zero or + :class:`bool` value ``False``. [#]_ + +.. index:: single: true + +All other values are considered true --- so objects of many types are always +true. + +.. index:: + operator: or + operator: and + single: False + single: True + +Operations and built-in functions that have a Boolean result always return ``0`` +or ``False`` for false and ``1`` or ``True`` for true, unless otherwise stated. +(Important exception: the Boolean operations ``or`` and ``and`` always return +one of their operands.) + + +.. _boolean: + +Boolean Operations --- :keyword:`and`, :keyword:`or`, :keyword:`not` +==================================================================== + +.. index:: pair: Boolean; operations + +These are the Boolean operations, ordered by ascending priority: + ++-------------+---------------------------------+-------+ +| Operation | Result | Notes | ++=============+=================================+=======+ +| ``x or y`` | if *x* is false, then *y*, else | \(1) | +| | *x* | | ++-------------+---------------------------------+-------+ +| ``x and y`` | if *x* is false, then *x*, else | \(2) | +| | *y* | | ++-------------+---------------------------------+-------+ +| ``not x`` | if *x* is false, then ``True``, | \(3) | +| | else ``False`` | | ++-------------+---------------------------------+-------+ + +.. index:: + operator: and + operator: or + operator: not + +Notes: + +(1) + This is a short-circuit operator, so it only evaluates the second + argument if the first one is :const:`False`. + +(2) + This is a short-circuit operator, so it only evaluates the second + argument if the first one is :const:`True`. + +(3) + ``not`` has a lower priority than non-Boolean operators, so ``not a == b`` is + interpreted as ``not (a == b)``, and ``a == not b`` is a syntax error. + + +.. 
_stdcomparisons: + +Comparisons +=========== + +.. index:: pair: chaining; comparisons + +Comparison operations are supported by all objects. They all have the same +priority (which is higher than that of the Boolean operations). Comparisons can +be chained arbitrarily; for example, ``x < y <= z`` is equivalent to ``x < y and +y <= z``, except that *y* is evaluated only once (but in both cases *z* is not +evaluated at all when ``x < y`` is found to be false). + +This table summarizes the comparison operations: + ++------------+-------------------------+-------+ +| Operation | Meaning | Notes | ++============+=========================+=======+ +| ``<`` | strictly less than | | ++------------+-------------------------+-------+ +| ``<=`` | less than or equal | | ++------------+-------------------------+-------+ +| ``>`` | strictly greater than | | ++------------+-------------------------+-------+ +| ``>=`` | greater than or equal | | ++------------+-------------------------+-------+ +| ``==`` | equal | | ++------------+-------------------------+-------+ +| ``!=`` | not equal | | ++------------+-------------------------+-------+ +| ``is`` | object identity | | ++------------+-------------------------+-------+ +| ``is not`` | negated object identity | | ++------------+-------------------------+-------+ + +.. index:: + pair: operator; comparison + operator: == + operator: is + operator: is not + +.. % XXX *All* others have funny characters < ! > + +.. index:: + pair: object; numeric + pair: objects; comparing + +Objects of different types, except different numeric types and different string +types, never compare equal; such objects are ordered consistently but +arbitrarily (so that sorting a heterogeneous array yields a consistent result). +Furthermore, some types (for example, file objects) support only a degenerate +notion of comparison where any two objects of that type are unequal. Again, +such objects are ordered arbitrarily but consistently. 
The ``<``, ``<=``, ``>`` +and ``>=`` operators will raise a :exc:`TypeError` exception when any operand is +a complex number. + +.. index:: single: __cmp__() (instance method) + +Instances of a class normally compare as non-equal unless the class defines the +:meth:`__cmp__` method. Refer to :ref:`customization` for information on the +use of this method to effect object comparisons. + +**Implementation note:** Objects of different types except numbers are ordered +by their type names; objects of the same types that don't support proper +comparison are ordered by their address. + +.. index:: + operator: in + operator: not in + +Two more operations with the same syntactic priority, ``in`` and ``not in``, are +supported only by sequence types (below). + + +.. _typesnumeric: + +Numeric Types --- :class:`int`, :class:`float`, :class:`long`, :class:`complex` +=============================================================================== + +.. index:: + object: numeric + object: Boolean + object: integer + object: long integer + object: floating point + object: complex number + pair: C; language + +There are four distinct numeric types: :dfn:`plain integers`, :dfn:`long +integers`, :dfn:`floating point numbers`, and :dfn:`complex numbers`. In +addition, Booleans are a subtype of plain integers. Plain integers (also just +called :dfn:`integers`) are implemented using :ctype:`long` in C, which gives +them at least 32 bits of precision (``sys.maxint`` is always set to the maximum +plain integer value for the current platform, the minimum value is +``-sys.maxint - 1``). Long integers have unlimited precision. Floating point +numbers are implemented using :ctype:`double` in C. All bets on their precision +are off unless you happen to know the machine you are working with. + +Complex numbers have a real and imaginary part, which are each implemented using +:ctype:`double` in C. To extract these parts from a complex number *z*, use +``z.real`` and ``z.imag``. + +..
index:: + pair: numeric; literals + pair: integer; literals + triple: long; integer; literals + pair: floating point; literals + pair: complex number; literals + pair: hexadecimal; literals + pair: octal; literals + +Numbers are created by numeric literals or as the result of built-in functions +and operators. Unadorned integer literals (including hex and octal numbers) +yield plain integers unless the value they denote is too large to be represented +as a plain integer, in which case they yield a long integer. Integer literals +with an ``'L'`` or ``'l'`` suffix yield long integers (``'L'`` is preferred +because ``1l`` looks too much like eleven!). Numeric literals containing a +decimal point or an exponent sign yield floating point numbers. Appending +``'j'`` or ``'J'`` to a numeric literal yields a complex number with a zero real +part. A complex numeric literal is the sum of a real and an imaginary part. + +.. index:: + single: arithmetic + builtin: int + builtin: long + builtin: float + builtin: complex + +Python fully supports mixed arithmetic: when a binary arithmetic operator has +operands of different numeric types, the operand with the "narrower" type is +widened to that of the other, where plain integer is narrower than long integer +is narrower than floating point is narrower than complex. Comparisons between +numbers of mixed type use the same rule. [#]_ The constructors :func:`int`, +:func:`long`, :func:`float`, and :func:`complex` can be used to produce numbers +of a specific type. 
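The widening rule for mixed arithmetic can be sketched as follows. This is only an illustration and assumes a current Python 3 interpreter, where plain and long integers have been merged into a single :class:`int` type, so the int-to-long step described above is not visible.

```python
# int op float: the int operand is widened, the result is a float.
int_plus_float = 1 + 2.5

# float op complex: the float operand is widened, the result is complex.
float_plus_complex = 2.5 + 1j

# Comparisons between numbers of mixed type follow the same rule.
mixed_equal = (1 == 1.0 == 1 + 0j)

# The constructors produce a number of a specific type on request.
as_float = float(3)
as_complex = complex(3)
```

Here ``int_plus_float`` is a :class:`float` and ``float_plus_complex`` is a :class:`complex`, while ``mixed_equal`` is true because the three values denote the same number.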
+ +All numeric types (except complex) support the following operations, sorted by +ascending priority (operations in the same box have the same priority; all +numeric operations have a higher priority than comparison operations): + ++--------------------+---------------------------------+--------+ +| Operation | Result | Notes | ++====================+=================================+========+ +| ``x + y`` | sum of *x* and *y* | | ++--------------------+---------------------------------+--------+ +| ``x - y`` | difference of *x* and *y* | | ++--------------------+---------------------------------+--------+ +| ``x * y`` | product of *x* and *y* | | ++--------------------+---------------------------------+--------+ +| ``x / y`` | quotient of *x* and *y* | \(1) | ++--------------------+---------------------------------+--------+ +| ``x // y`` | (floored) quotient of *x* and | \(5) | +| | *y* | | ++--------------------+---------------------------------+--------+ +| ``x % y`` | remainder of ``x / y`` | \(4) | ++--------------------+---------------------------------+--------+ +| ``-x`` | *x* negated | | ++--------------------+---------------------------------+--------+ +| ``+x`` | *x* unchanged | | ++--------------------+---------------------------------+--------+ +| ``abs(x)`` | absolute value or magnitude of | | +| | *x* | | ++--------------------+---------------------------------+--------+ +| ``int(x)`` | *x* converted to integer | \(2) | ++--------------------+---------------------------------+--------+ +| ``long(x)`` | *x* converted to long integer | \(2) | ++--------------------+---------------------------------+--------+ +| ``float(x)`` | *x* converted to floating point | | ++--------------------+---------------------------------+--------+ +| ``complex(re,im)`` | a complex number with real part | | +| | *re*, imaginary part *im*. | | +| | *im* defaults to zero. 
| | ++--------------------+---------------------------------+--------+ +| ``c.conjugate()`` | conjugate of the complex number | | +| | *c* | | ++--------------------+---------------------------------+--------+ +| ``divmod(x, y)`` | the pair ``(x // y, x % y)`` | (3)(4) | ++--------------------+---------------------------------+--------+ +| ``pow(x, y)`` | *x* to the power *y* | | ++--------------------+---------------------------------+--------+ +| ``x ** y`` | *x* to the power *y* | | ++--------------------+---------------------------------+--------+ + +.. index:: + triple: operations on; numeric; types + single: conjugate() (complex number method) + +Notes: + +(1) + .. index:: + pair: integer; division + triple: long; integer; division + + For (plain or long) integer division, the result is an integer. The result is + always rounded towards minus infinity: 1/2 is 0, (-1)/2 is -1, 1/(-2) is -1, and + (-1)/(-2) is 0. Note that the result is a long integer if either operand is a + long integer, regardless of the numeric value. + +(2) + .. index:: + module: math + single: floor() (in module math) + single: ceil() (in module math) + pair: numeric; conversions + pair: C; language + + Conversion from floating point to (long or plain) integer may round or truncate + as in C; see functions :func:`floor` and :func:`ceil` in the :mod:`math` module + for well-defined conversions. + +(3) + See :ref:`built-in-funcs` for a full description. + +(4) + Complex floor division operator, modulo operator, and :func:`divmod`. + + .. deprecated:: 2.3 + Instead convert to float using :func:`abs` if appropriate. + +(5) + Also referred to as integer division. The resultant value is a whole integer, + though the result's type is not necessarily int. + +.. % XXXJH exceptions: overflow (when? what operations?) zerodivision + + +.. _bitstring-ops: + +Bit-string Operations on Integer Types +-------------------------------------- + +.. 
_bit-string-operations: + +Plain and long integer types support additional operations that make sense only +for bit-strings. Negative numbers are treated as their 2's complement value +(for long integers, this assumes a sufficiently large number of bits that no +overflow occurs during the operation). + +The priorities of the binary bit-wise operations are all lower than the numeric +operations and higher than the comparisons; the unary operation ``~`` has the +same priority as the other unary numeric operations (``+`` and ``-``). + +This table lists the bit-string operations sorted in ascending priority +(operations in the same box have the same priority): + ++------------+--------------------------------+----------+ +| Operation | Result | Notes | ++============+================================+==========+ +| ``x | y`` | bitwise :dfn:`or` of *x* and | | +| | *y* | | ++------------+--------------------------------+----------+ +| ``x ^ y`` | bitwise :dfn:`exclusive or` of | | +| | *x* and *y* | | ++------------+--------------------------------+----------+ +| ``x & y`` | bitwise :dfn:`and` of *x* and | | +| | *y* | | ++------------+--------------------------------+----------+ +| ``x << n`` | *x* shifted left by *n* bits | (1), (2) | ++------------+--------------------------------+----------+ +| ``x >> n`` | *x* shifted right by *n* bits | (1), (3) | ++------------+--------------------------------+----------+ +| ``~x`` | the bits of *x* inverted | | ++------------+--------------------------------+----------+ + +.. index:: + triple: operations on; integer; types + pair: bit-string; operations + pair: shifting; operations + pair: masking; operations + +Notes: + +(1) + Negative shift counts are illegal and cause a :exc:`ValueError` to be raised. + +(2) + A left shift by *n* bits is equivalent to multiplication by ``pow(2, n)`` + without overflow check. + +(3) + A right shift by *n* bits is equivalent to division by ``pow(2, n)`` without + overflow check. + + +.. 
_typeiter:

Iterator Types
==============

.. versionadded:: 2.2

.. index::
   single: iterator protocol
   single: protocol; iterator
   single: sequence; iteration
   single: container; iteration over

Python supports a concept of iteration over containers. This is implemented
using two distinct methods; these are used to allow user-defined classes to
support iteration. Sequences, described below in more detail, always support
the iteration methods.

One method needs to be defined for container objects to provide iteration
support:


.. method:: container.__iter__()

   Return an iterator object. The object is required to support the iterator
   protocol described below. If a container supports different types of
   iteration, additional methods can be provided to specifically request
   iterators for those iteration types. (An example of an object supporting
   multiple forms of iteration would be a tree structure which supports both
   breadth-first and depth-first traversal.) This method corresponds to the
   :attr:`tp_iter` slot of the type structure for Python objects in the Python/C
   API.

The iterator objects themselves are required to support the following two
methods, which together form the :dfn:`iterator protocol`:


.. method:: iterator.__iter__()

   Return the iterator object itself. This is required to allow both containers
   and iterators to be used with the :keyword:`for` and :keyword:`in` statements.
   This method corresponds to the :attr:`tp_iter` slot of the type structure for
   Python objects in the Python/C API.


.. method:: iterator.__next__()

   Return the next item from the container. If there are no further items, raise
   the :exc:`StopIteration` exception. This method corresponds to the
   :attr:`tp_iternext` slot of the type structure for Python objects in the
   Python/C API.
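
The protocol can be sketched with a small user-defined container. The class
names ``Countdown`` and ``CountdownIterator`` are hypothetical and exist only
for this illustration; the sketch also assumes an interpreter in which the
method that advances an iterator is spelled ``__next__``.

```python
class CountdownIterator:
    """Iterator over n, n-1, ..., 1 (hypothetical example class)."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        # An iterator returns itself, so it can be used wherever a
        # container is accepted (for example in a for statement).
        return self

    def __next__(self):
        if self._current <= 0:
            # Once exhausted, every further call raises StopIteration.
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


class Countdown:
    """Container whose __iter__ hands out a fresh iterator each time."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


items = list(Countdown(3))   # the container is iterated via its iterator
it = iter(Countdown(1))
first = next(it)             # consumes the only item
```

Keeping the container and the iterator as separate classes lets several
independent traversals of the same container run at once; an object that is
its own iterator can only be traversed once.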
+ +Python defines several iterator objects to support iteration over general and +specific sequence types, dictionaries, and other more specialized forms. The +specific types are not important beyond their implementation of the iterator +protocol. + +The intention of the protocol is that once an iterator's :meth:`__next__` method +raises :exc:`StopIteration`, it will continue to do so on subsequent calls. +Implementations that do not obey this property are deemed broken. (This +constraint was added in Python 2.3; in Python 2.2, various iterators are broken +according to this rule.) + +Python's generators provide a convenient way to implement the iterator protocol. +If a container object's :meth:`__iter__` method is implemented as a generator, +it will automatically return an iterator object (technically, a generator +object) supplying the :meth:`__iter__` and :meth:`__next__` methods. + + +.. _typesseq: + +Sequence Types --- :class:`str`, :class:`unicode`, :class:`list`, :class:`tuple`, :class:`buffer`, :class:`range` +================================================================================================================= + +There are six sequence types: strings, Unicode strings, lists, tuples, buffers, +and range objects. +(For other containers see the built in :class:`dict`, :class:`list`, +:class:`set`, and :class:`tuple` classes, and the :mod:`collections` +module.) + + +.. index:: + object: sequence + object: string + object: tuple + object: list + object: buffer + object: range + +String literals are written in single or double quotes: ``'xyzzy'``, +``"frobozz"``. See :ref:`strings` for more about string literals. In addition +to the functionality described here, there are also string-specific methods +described in the :ref:`string-methods` section. Lists are constructed with +square brackets, separating items with commas: ``[a, b, c]``. 
Tuples are constructed by the comma operator (not within square brackets),
with or without enclosing parentheses, but an empty tuple must have the
enclosing parentheses, such as ``a, b, c`` or ``()``. A single item tuple must
have a trailing comma, such as ``(d,)``.

Buffer objects are not directly supported by Python syntax, but can be created
by calling the built-in function :func:`buffer`. They don't support
concatenation or repetition.

Objects of type range are similar to buffers in that there is no specific
syntax to create them; instead, they are created using the :func:`range`
function. They don't support slicing, concatenation or repetition, and using
``in``, ``not in``, :func:`min` or :func:`max` on them is inefficient.

Most sequence types support the following operations. The ``in`` and ``not in``
operations have the same priorities as the comparison operations. The ``+`` and
``*`` operations have the same priority as the corresponding numeric operations.
[#]_

This table lists the sequence operations sorted in ascending priority
(operations in the same box have the same priority).
In the table, *s* and *t* +are sequences of the same type; *n*, *i* and *j* are integers: + ++------------------+--------------------------------+----------+ +| Operation | Result | Notes | ++==================+================================+==========+ +| ``x in s`` | ``True`` if an item of *s* is | \(1) | +| | equal to *x*, else ``False`` | | ++------------------+--------------------------------+----------+ +| ``x not in s`` | ``False`` if an item of *s* is | \(1) | +| | equal to *x*, else ``True`` | | ++------------------+--------------------------------+----------+ +| ``s + t`` | the concatenation of *s* and | \(6) | +| | *t* | | ++------------------+--------------------------------+----------+ +| ``s * n, n * s`` | *n* shallow copies of *s* | \(2) | +| | concatenated | | ++------------------+--------------------------------+----------+ +| ``s[i]`` | *i*'th item of *s*, origin 0 | \(3) | ++------------------+--------------------------------+----------+ +| ``s[i:j]`` | slice of *s* from *i* to *j* | (3), (4) | ++------------------+--------------------------------+----------+ +| ``s[i:j:k]`` | slice of *s* from *i* to *j* | (3), (5) | +| | with step *k* | | ++------------------+--------------------------------+----------+ +| ``len(s)`` | length of *s* | | ++------------------+--------------------------------+----------+ +| ``min(s)`` | smallest item of *s* | | ++------------------+--------------------------------+----------+ +| ``max(s)`` | largest item of *s* | | ++------------------+--------------------------------+----------+ + +Sequence types also support comparisons. In particular, tuples and lists +are compared lexicographically by comparing corresponding +elements. This means that to compare equal, every element must compare +equal and the two sequences must be of the same type and have the same +length. (For full details see :ref:`comparisons` in the language +reference.) + +.. 
index::
   triple: operations on; sequence; types
   builtin: len
   builtin: min
   builtin: max
   pair: concatenation; operation
   pair: repetition; operation
   pair: subscript; operation
   pair: slice; operation
   pair: extended slice; operation
   operator: in
   operator: not in

Notes:

(1)
   When *s* is a string or Unicode string object the ``in`` and ``not in``
   operations act like a substring test. In Python versions before 2.3, *x* had to
   be a string of length 1. In Python 2.3 and beyond, *x* may be a string of any
   length.

(2)
   Values of *n* less than ``0`` are treated as ``0`` (which yields an empty
   sequence of the same type as *s*). Note also that the copies are shallow;
   nested structures are not copied. This often haunts new Python programmers;
   consider::

      >>> lists = [[]] * 3
      >>> lists
      [[], [], []]
      >>> lists[0].append(3)
      >>> lists
      [[3], [3], [3]]

   What has happened is that ``[[]]`` is a one-element list containing an empty
   list, so all three elements of ``[[]] * 3`` are (pointers to) this single empty
   list. Modifying any of the elements of ``lists`` modifies this single list.
   You can create a list of different lists this way::

      >>> lists = [[] for i in range(3)]
      >>> lists[0].append(3)
      >>> lists[1].append(5)
      >>> lists[2].append(7)
      >>> lists
      [[3], [5], [7]]

(3)
   If *i* or *j* is negative, the index is relative to the end of the sequence:
   ``len(s) + i`` or ``len(s) + j`` is substituted. But note that ``-0`` is still
   ``0``.

(4)
   The slice of *s* from *i* to *j* is defined as the sequence of items with index
   *k* such that ``i <= k < j``. If *i* or *j* is greater than ``len(s)``, use
   ``len(s)``. If *i* is omitted or ``None``, use ``0``. If *j* is omitted or
   ``None``, use ``len(s)``. If *i* is greater than or equal to *j*, the slice is
   empty.

(5)
   The slice of *s* from *i* to *j* with step *k* is defined as the sequence of
   items with index ``x = i + n*k`` such that ``0 <= n < (j-i)/k``.
In other words, + the indices are ``i``, ``i+k``, ``i+2*k``, ``i+3*k`` and so on, stopping when + *j* is reached (but never including *j*). If *i* or *j* is greater than + ``len(s)``, use ``len(s)``. If *i* or *j* are omitted or ``None``, they become + "end" values (which end depends on the sign of *k*). Note, *k* cannot be zero. + If *k* is ``None``, it is treated like ``1``. + +(6) + If *s* and *t* are both strings, some Python implementations such as CPython can + usually perform an in-place optimization for assignments of the form ``s=s+t`` + or ``s+=t``. When applicable, this optimization makes quadratic run-time much + less likely. This optimization is both version and implementation dependent. + For performance sensitive code, it is preferable to use the :meth:`str.join` + method which assures consistent linear concatenation performance across versions + and implementations. + + .. versionchanged:: 2.4 + Formerly, string concatenation never occurred in-place. + + +.. _string-methods: + +String Methods +-------------- + +.. index:: pair: string; methods + +Below are listed the string methods which both 8-bit strings and Unicode objects +support. In addition, Python's strings support the sequence type methods +described in the :ref:`typesseq` section. To output formatted strings +use template strings or the ``%`` operator described in the +:ref:`string-formatting` section. Also, see the :mod:`re` module for +string functions based on regular expressions. + +.. method:: str.capitalize() + + Return a copy of the string with only its first character capitalized. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.center(width[, fillchar]) + + Return centered in a string of length *width*. Padding is done using the + specified *fillchar* (default is a space). + + .. versionchanged:: 2.4 + Support for the *fillchar* argument. + + +.. 
method:: str.count(sub[, start[, end]])

   Return the number of occurrences of substring *sub* in ``s[start:end]``.
   Optional arguments *start* and *end* are interpreted as in slice notation.


.. method:: str.decode([encoding[, errors]])

   Decode the string using the codec registered for *encoding*. *encoding*
   defaults to the default string encoding. *errors* may be given to set a
   different error handling scheme. The default is ``'strict'``, meaning that
   encoding errors raise :exc:`UnicodeError`. Other possible values are
   ``'ignore'``, ``'replace'`` and any other name registered via
   :func:`codecs.register_error`; see section :ref:`codec-base-classes`.

   .. versionadded:: 2.2

   .. versionchanged:: 2.3
      Support for other error handling schemes added.


.. method:: str.encode([encoding[, errors]])

   Return an encoded version of the string. Default encoding is the current
   default string encoding. *errors* may be given to set a different error
   handling scheme. The default for *errors* is ``'strict'``, meaning that
   encoding errors raise a :exc:`UnicodeError`. Other possible values are
   ``'ignore'``, ``'replace'``, ``'xmlcharrefreplace'``, ``'backslashreplace'`` and
   any other name registered via :func:`codecs.register_error`; see section
   :ref:`codec-base-classes`. For a list of possible encodings, see section
   :ref:`standard-encodings`.

   .. versionadded:: 2.0

   .. versionchanged:: 2.3
      Support for ``'xmlcharrefreplace'`` and ``'backslashreplace'`` and other error
      handling schemes added.


.. method:: str.endswith(suffix[, start[, end]])

   Return ``True`` if the string ends with the specified *suffix*, otherwise return
   ``False``. *suffix* can also be a tuple of suffixes to look for. With optional
   *start*, test beginning at that position. With optional *end*, stop comparing
   at that position.

   .. versionchanged:: 2.5
      Accept tuples as *suffix*.


..
method:: str.expandtabs([tabsize]) + + Return a copy of the string where all tab characters are expanded using spaces. + If *tabsize* is not given, a tab size of ``8`` characters is assumed. + + +.. method:: str.find(sub[, start[, end]]) + + Return the lowest index in the string where substring *sub* is found, such that + *sub* is contained in the range [*start*, *end*]. Optional arguments *start* + and *end* are interpreted as in slice notation. Return ``-1`` if *sub* is not + found. + + +.. method:: str.index(sub[, start[, end]]) + + Like :meth:`find`, but raise :exc:`ValueError` when the substring is not found. + + +.. method:: str.isalnum() + + Return true if all characters in the string are alphanumeric and there is at + least one character, false otherwise. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.isalpha() + + Return true if all characters in the string are alphabetic and there is at least + one character, false otherwise. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.isdigit() + + Return true if all characters in the string are digits and there is at least one + character, false otherwise. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.isidentifier() + + Return true if the string is a valid identifier according to the language + definition. + + .. XXX link to the definition? + + +.. method:: str.islower() + + Return true if all cased characters in the string are lowercase and there is at + least one cased character, false otherwise. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.isspace() + + Return true if there are only whitespace characters in the string and there is + at least one character, false otherwise. + + For 8-bit strings, this method is locale-dependent. + + +.. 
method:: str.istitle() + + Return true if the string is a titlecased string and there is at least one + character, for example uppercase characters may only follow uncased characters + and lowercase characters only cased ones. Return false otherwise. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.isupper() + + Return true if all cased characters in the string are uppercase and there is at + least one cased character, false otherwise. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.join(seq) + + Return a string which is the concatenation of the strings in the sequence *seq*. + The separator between elements is the string providing this method. + + +.. method:: str.ljust(width[, fillchar]) + + Return the string left justified in a string of length *width*. Padding is done + using the specified *fillchar* (default is a space). The original string is + returned if *width* is less than ``len(s)``. + + .. versionchanged:: 2.4 + Support for the *fillchar* argument. + + +.. method:: str.lower() + + Return a copy of the string converted to lowercase. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.lstrip([chars]) + + Return a copy of the string with leading characters removed. The *chars* + argument is a string specifying the set of characters to be removed. If omitted + or ``None``, the *chars* argument defaults to removing whitespace. The *chars* + argument is not a prefix; rather, all combinations of its values are stripped:: + + >>> ' spacious '.lstrip() + 'spacious ' + >>> 'www.example.com'.lstrip('cmowz.') + 'example.com' + + .. versionchanged:: 2.2.2 + Support for the *chars* argument. + + +.. method:: str.partition(sep) + + Split the string at the first occurrence of *sep*, and return a 3-tuple + containing the part before the separator, the separator itself, and the part + after the separator. 
   If the separator is not found, return a 3-tuple containing
   the string itself, followed by two empty strings.

   .. versionadded:: 2.5


.. method:: str.replace(old, new[, count])

   Return a copy of the string with all occurrences of substring *old* replaced by
   *new*. If the optional argument *count* is given, only the first *count*
   occurrences are replaced.


.. method:: str.rfind(sub[, start[, end]])

   Return the highest index in the string where substring *sub* is found, such that
   *sub* is contained within ``s[start:end]``. Optional arguments *start* and *end*
   are interpreted as in slice notation. Return ``-1`` on failure.


.. method:: str.rindex(sub[, start[, end]])

   Like :meth:`rfind` but raises :exc:`ValueError` when the substring *sub* is not
   found.


.. method:: str.rjust(width[, fillchar])

   Return the string right justified in a string of length *width*. Padding is done
   using the specified *fillchar* (default is a space). The original string is
   returned if *width* is less than ``len(s)``.

   .. versionchanged:: 2.4
      Support for the *fillchar* argument.


.. method:: str.rpartition(sep)

   Split the string at the last occurrence of *sep*, and return a 3-tuple
   containing the part before the separator, the separator itself, and the part
   after the separator. If the separator is not found, return a 3-tuple containing
   two empty strings, followed by the string itself.

   .. versionadded:: 2.5


.. method:: str.rsplit([sep[, maxsplit]])

   Return a list of the words in the string, using *sep* as the delimiter string.
   If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost*
   ones. If *sep* is not specified or ``None``, any whitespace string is a
   separator. Except for splitting from the right, :meth:`rsplit` behaves like
   :meth:`split` which is described in detail below.

   .. versionadded:: 2.4


..
method:: str.rstrip([chars])

   Return a copy of the string with trailing characters removed. The *chars*
   argument is a string specifying the set of characters to be removed. If omitted
   or ``None``, the *chars* argument defaults to removing whitespace. The *chars*
   argument is not a suffix; rather, all combinations of its values are stripped::

      >>> ' spacious '.rstrip()
      ' spacious'
      >>> 'mississippi'.rstrip('ipz')
      'mississ'

   .. versionchanged:: 2.2.2
      Support for the *chars* argument.


.. method:: str.split([sep[, maxsplit]])

   Return a list of the words in the string, using *sep* as the delimiter string.
   If *maxsplit* is given, at most *maxsplit* splits are done (thus, the list will
   have at most ``maxsplit+1`` elements). If *maxsplit* is not specified, then
   there is no limit on the number of splits (all possible splits are made).
   Consecutive delimiters are not grouped together and are deemed to delimit empty
   strings (for example, ``'1,,2'.split(',')`` returns ``['1', '', '2']``). The
   *sep* argument may consist of multiple characters (for example,
   ``'1, 2, 3'.split(', ')`` returns ``['1', '2', '3']``). Splitting an empty
   string with a specified separator returns ``['']``.

   If *sep* is not specified or is ``None``, a different splitting algorithm is
   applied. First, whitespace characters (spaces, tabs, newlines, returns, and
   formfeeds) are stripped from both ends. Then, words are separated by arbitrary
   length strings of whitespace characters. Consecutive whitespace delimiters are
   treated as a single delimiter (``'1  2   3'.split()`` returns
   ``['1', '2', '3']``). Splitting an empty string or a string consisting of just
   whitespace returns an empty list.


.. method:: str.splitlines([keepends])

   Return a list of the lines in the string, breaking at line boundaries. Line
   breaks are not included in the resulting list unless *keepends* is given and
   true.


..
method:: str.startswith(prefix[, start[, end]]) + + Return ``True`` if string starts with the *prefix*, otherwise return ``False``. + *prefix* can also be a tuple of prefixes to look for. With optional *start*, + test string beginning at that position. With optional *end*, stop comparing + string at that position. + + .. versionchanged:: 2.5 + Accept tuples as *prefix*. + + +.. method:: str.strip([chars]) + + Return a copy of the string with the leading and trailing characters removed. + The *chars* argument is a string specifying the set of characters to be removed. + If omitted or ``None``, the *chars* argument defaults to removing whitespace. + The *chars* argument is not a prefix or suffix; rather, all combinations of its + values are stripped:: + + >>> ' spacious '.strip() + 'spacious' + >>> 'www.example.com'.strip('cmowz.') + 'example' + + .. versionchanged:: 2.2.2 + Support for the *chars* argument. + + +.. method:: str.swapcase() + + Return a copy of the string with uppercase characters converted to lowercase and + vice versa. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.title() + + Return a titlecased version of the string: words start with uppercase + characters, all remaining cased characters are lowercase. + + For 8-bit strings, this method is locale-dependent. + + +.. method:: str.translate(table[, deletechars]) + + Return a copy of the string where all characters occurring in the optional + argument *deletechars* are removed, and the remaining characters have been + mapped through the given translation table, which must be a string of length + 256. + + You can use the :func:`maketrans` helper function in the :mod:`string` module to + create a translation table. For string objects, set the *table* argument to + ``None`` for translations that only delete characters:: + + >>> 'read this short text'.translate(None, 'aeiou') + 'rd ths shrt txt' + + .. versionadded:: 2.6 + Support for a ``None`` *table* argument. 
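
Both uses of the 256-character table can be sketched as follows. The sketch
assumes a Python 3 interpreter, where the 8-bit string type is spelled
``bytes`` and the table helper is available as ``bytes.maketrans``; that is an
assumption about the environment, not something stated above. Under Python 2
the same calls would be made on ``str`` objects with ``string.maketrans``.

```python
# Delete characters only: pass None as the table.
deleted = b'read this short text'.translate(None, b'aeiou')

# Build a 256-byte table mapping a->x, b->y, c->z and leaving the rest alone.
table = bytes.maketrans(b'abc', b'xyz')

# Characters listed for deletion are dropped first; the remaining
# characters are then mapped through the table.
mapped = b'aabbcc--'.translate(table, b'-')
```

Here ``deleted`` is ``b'rd ths shrt txt'`` and ``mapped`` is ``b'xxyyzz'``.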

   For Unicode objects, the :meth:`translate` method does not accept the optional
   *deletechars* argument. Instead, it returns a copy of *s* where all
   characters have been mapped through the given translation table which must be a
   mapping of Unicode ordinals to Unicode ordinals, Unicode strings or ``None``.
   Unmapped characters are left untouched. Characters mapped to ``None`` are
   deleted. Note, a more flexible approach is to create a custom character mapping
   codec using the :mod:`codecs` module (see :mod:`encodings.cp1251` for an
   example).


.. method:: str.upper()

   Return a copy of the string converted to uppercase.

   For 8-bit strings, this method is locale-dependent.


.. method:: str.zfill(width)

   Return the numeric string left filled with zeros in a string of length *width*.
   The original string is returned if *width* is less than ``len(s)``.

   .. versionadded:: 2.2.2


.. _string-formatting:

String Formatting Operations
----------------------------

.. index::
   single: formatting, string (%)
   single: interpolation, string (%)
   single: string; formatting
   single: string; interpolation
   single: printf-style formatting
   single: sprintf-style formatting
   single: % formatting
   single: % interpolation

String and Unicode objects have one unique built-in operation: the ``%``
operator (modulo). This is also known as the string *formatting* or
*interpolation* operator. Given ``format % values`` (where *format* is a string
or Unicode object), ``%`` conversion specifications in *format* are replaced
with zero or more elements of *values*. The effect is similar to using
:cfunc:`sprintf` in the C language. If *format* is a Unicode object, or if any
of the objects being converted using the ``%s`` conversion are Unicode objects,
the result will also be a Unicode object.

If *format* requires a single argument, *values* may be a single non-tuple
object.
[#]_ Otherwise, *values* must be a tuple with exactly the number of
items specified by the format string, or a single mapping object (for example, a
dictionary).

A conversion specifier contains two or more characters and has the following
components, which must occur in this order:

#. The ``'%'`` character, which marks the start of the specifier.

#. Mapping key (optional), consisting of a parenthesised sequence of characters
   (for example, ``(somename)``).

#. Conversion flags (optional), which affect the result of some conversion
   types.

#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the
   actual width is read from the next element of the tuple in *values*, and the
   object to convert comes after the minimum field width and optional precision.

#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If
   specified as ``'*'`` (an asterisk), the actual precision is read from the next
   element of the tuple in *values*, and the value to convert comes after the
   precision.

#. Length modifier (optional).

#. Conversion type.

When the right argument is a dictionary (or other mapping type), then the
formats in the string *must* include a parenthesised mapping key into that
dictionary inserted immediately after the ``'%'`` character. The mapping key
selects the value to be formatted from the mapping. For example::

   >>> print '%(language)s has %(#)03d quote types.' % \
             {'language': "Python", "#": 2}
   Python has 002 quote types.

In this case no ``*`` specifiers may occur in a format (since they require a
sequential parameter list).

The conversion flag characters are:

+---------+---------------------------------------------------------------------+
| Flag    | Meaning                                                             |
+=========+=====================================================================+
| ``'#'`` | The value conversion will use the "alternate form" (where defined   |
|         | below).                                                             |
+---------+---------------------------------------------------------------------+
| ``'0'`` | The conversion will be zero padded for numeric values.              |
+---------+---------------------------------------------------------------------+
| ``'-'`` | The converted value is left adjusted (overrides the ``'0'``         |
|         | conversion if both are given).                                      |
+---------+---------------------------------------------------------------------+
| ``' '`` | (a space) A blank should be left before a positive number (or empty |
|         | string) produced by a signed conversion.                            |
+---------+---------------------------------------------------------------------+
| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion   |
|         | (overrides a "space" flag).                                         |
+---------+---------------------------------------------------------------------+

A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it
is not necessary for Python.

The conversion types are:

+------------+-----------------------------------------------------+-------+
| Conversion | Meaning                                             | Notes |
+============+=====================================================+=======+
| ``'d'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'i'``    | Signed integer decimal.                             |       |
+------------+-----------------------------------------------------+-------+
| ``'o'``    | Unsigned octal.                                     | \(1)  |
+------------+-----------------------------------------------------+-------+
| ``'u'``    | Unsigned decimal.                                   |       |
+------------+-----------------------------------------------------+-------+
| ``'x'``    | Unsigned hexadecimal (lowercase).                   | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'X'``    | Unsigned hexadecimal (uppercase).                   | \(2)  |
+------------+-----------------------------------------------------+-------+
| ``'e'``    | Floating point exponential format (lowercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'E'``    | Floating point exponential format (uppercase).      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'f'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'F'``    | Floating point decimal format.                      | \(3)  |
+------------+-----------------------------------------------------+-------+
| ``'g'``    | Floating point format. Uses exponential format if   | \(4)  |
|            | exponent is less than -4 or not less than           |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'G'``    | Floating point format. Uses exponential format if   | \(4)  |
|            | exponent is less than -4 or not less than           |       |
|            | precision, decimal format otherwise.                |       |
+------------+-----------------------------------------------------+-------+
| ``'c'``    | Single character (accepts integer or single         |       |
|            | character string).                                  |       |
+------------+-----------------------------------------------------+-------+
| ``'r'``    | String (converts any Python object using            | \(5)  |
|            | :func:`repr`).                                      |       |
+------------+-----------------------------------------------------+-------+
| ``'s'``    | String (converts any Python object using            | \(6)  |
|            | :func:`str`).                                       |       |
+------------+-----------------------------------------------------+-------+
| ``'%'``    | No argument is converted, results in a ``'%'``      |       |
|            | character in the result.                            |       |
+------------+-----------------------------------------------------+-------+

Notes:

(1)
   The alternate form causes a leading zero (``'0'``) to be inserted between
   left-hand padding and the formatting of the number if the leading character
   of the result is not already a zero.

(2)
   The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether
   the ``'x'`` or ``'X'`` format was used) to be inserted between left-hand padding
   and the formatting of the number if the leading character of the result is not
   already a zero.

(3)
   The alternate form causes the result to always contain a decimal point, even if
   no digits follow it.

   The precision determines the number of digits after the decimal point and
   defaults to 6.

(4)
   The alternate form causes the result to always contain a decimal point, and
   trailing zeroes are not removed as they would otherwise be.

   The precision determines the number of significant digits before and after the
   decimal point and defaults to 6.

(5)
   The ``%r`` conversion was added in Python 2.0.

   The precision determines the maximal number of characters used.

(6)
   If the object or format provided is a :class:`unicode` string, the resulting
   string will also be :class:`unicode`.

   The precision determines the maximal number of characters used.

Since Python strings have an explicit length, ``%s`` conversions do not assume
that ``'\0'`` is the end of the string.

.. % XXX Examples?

For safety reasons, floating point precisions are clipped to 50; ``%f``
conversions for numbers whose absolute value is over 1e25 are replaced by ``%g``
conversions. [#]_ All other errors raise exceptions.

.. index::
   module: string
   module: re

Additional string operations are defined in standard modules :mod:`string` and
:mod:`re`.


.. _typesseq-range:

Range Type
----------

.. index:: object: range

The :class:`range` type is an immutable sequence which is commonly used for
looping. The advantage of the :class:`range` type is that a :class:`range`
object will always take the same amount of memory, no matter the size of the
range it represents. There are no consistent performance advantages.
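
The constant-memory claim can be checked with a quick sketch. It relies on
``sys.getsizeof``, which is not mentioned above and reports only the size of
the range object itself; the exact byte count is implementation dependent, so
only equality is compared.

```python
import sys

small = range(10)
large = range(10 ** 9)

# A range keeps its bounds and step rather than its items, so the
# two objects have the same size regardless of how many values they span.
small_size = sys.getsizeof(small)
large_size = sys.getsizeof(large)

# Length and indexing are computed from the bounds; no items are created.
large_len = len(large)
last_item = large[-1]
```

A list holding the same billion integers would need memory proportional to
its length, which is why a range is the usual choice for driving a loop.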

Range objects have very little behavior: they only support indexing, iteration,
and the :func:`len` function.


.. _typesseq-mutable:

Mutable Sequence Types
----------------------

.. index::
   triple: mutable; sequence; types
   object: list

List objects support additional operations that allow in-place modification of
the object. Other mutable sequence types (when added to the language) should
also support these operations. Strings and tuples are immutable sequence types:
such objects cannot be modified once created. The following operations are
defined on mutable sequence types (where *x* is an arbitrary object):

+------------------------------+--------------------------------+---------------------+
| Operation                    | Result                         | Notes               |
+==============================+================================+=====================+
| ``s[i] = x``                 | item *i* of *s* is replaced by |                     |
|                              | *x*                            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j] = t``               | slice of *s* from *i* to *j*   |                     |
|                              | is replaced by the contents of |                     |
|                              | the iterable *t*               |                     |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j]``               | same as ``s[i:j] = []``        |                     |
+------------------------------+--------------------------------+---------------------+
| ``s[i:j:k] = t``             | the elements of ``s[i:j:k]``   | \(1)                |
|                              | are replaced by those of *t*   |                     |
+------------------------------+--------------------------------+---------------------+
| ``del s[i:j:k]``             | removes the elements of        |                     |
|                              | ``s[i:j:k]`` from the list     |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.append(x)``              | same as ``s[len(s):len(s)] =   | \(2)                |
|                              | [x]``                          |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.extend(x)``              | same as ``s[len(s):len(s)] =   | \(3)                |
|                              | x``                            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.count(x)``               | return number of *i*'s for     |                     |
|                              | which ``s[i] == x``            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.index(x[, i[, j]])``     | return smallest *k* such that  | \(4)                |
|                              | ``s[k] == x`` and ``i <= k <   |                     |
|                              | j``                            |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.insert(i, x)``           | same as ``s[i:i] = [x]``       | \(5)                |
+------------------------------+--------------------------------+---------------------+
| ``s.pop([i])``               | same as ``x = s[i]; del s[i];  | \(6)                |
|                              | return x``                     |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.remove(x)``              | same as ``del s[s.index(x)]``  | \(4)                |
+------------------------------+--------------------------------+---------------------+
| ``s.reverse()``              | reverses the items of *s* in   | \(7)                |
|                              | place                          |                     |
+------------------------------+--------------------------------+---------------------+
| ``s.sort([cmp[, key[,        | sort the items of *s* in place | (7), (8), (9), (10) |
| reverse]]])``                |                                |                     |
+------------------------------+--------------------------------+---------------------+

.. index::
   triple: operations on; sequence; types
   triple: operations on; list; type
   pair: subscript; assignment
   pair: slice; assignment
   pair: extended slice; assignment
   statement: del
   single: append() (list method)
   single: extend() (list method)
   single: count() (list method)
   single: index() (list method)
   single: insert() (list method)
   single: pop() (list method)
   single: remove() (list method)
   single: reverse() (list method)
   single: sort() (list method)

Notes:

(1)
   *t* must have the same length as the slice it is replacing.
+ +(2) + The C implementation of Python has historically accepted multiple parameters and + implicitly joined them into a tuple; this no longer works in Python 2.0. Use of + this misfeature has been deprecated since Python 1.4. + +(3) + *x* can be any iterable object. + +(4) + Raises :exc:`ValueError` when *x* is not found in *s*. When a negative index is + passed as the second or third parameter to the :meth:`index` method, the list + length is added, as for slice indices. If it is still negative, it is truncated + to zero, as for slice indices. + + .. versionchanged:: 2.3 + Previously, :meth:`index` didn't have arguments for specifying start and stop + positions. + +(5) + When a negative index is passed as the first parameter to the :meth:`insert` + method, the list length is added, as for slice indices. If it is still + negative, it is truncated to zero, as for slice indices. + + .. versionchanged:: 2.3 + Previously, all negative indices were truncated to zero. + +(6) + The :meth:`pop` method is only supported by the list and array types. The + optional argument *i* defaults to ``-1``, so that by default the last item is + removed and returned. + +(7) + The :meth:`sort` and :meth:`reverse` methods modify the list in place for + economy of space when sorting or reversing a large list. To remind you that + they operate by side effect, they don't return the sorted or reversed list. + +(8) + The :meth:`sort` method takes optional arguments for controlling the + comparisons. + + *cmp* specifies a custom comparison function of two arguments (list items) which + should return a negative, zero or positive number depending on whether the first + argument is considered smaller than, equal to, or larger than the second + argument: ``cmp=lambda x,y: cmp(x.lower(), y.lower())`` + + *key* specifies a function of one argument that is used to extract a comparison + key from each list element: ``key=str.lower`` + + *reverse* is a boolean value. 
If set to ``True``, then the list elements are + sorted as if each comparison were reversed. + + In general, the *key* and *reverse* conversion processes are much faster than + specifying an equivalent *cmp* function. This is because *cmp* is called + multiple times for each list element while *key* and *reverse* touch each + element only once. + + .. versionchanged:: 2.3 + Support for ``None`` as an equivalent to omitting *cmp* was added. + + .. versionchanged:: 2.4 + Support for *key* and *reverse* was added. + +(9) + Starting with Python 2.3, the :meth:`sort` method is guaranteed to be stable. A + sort is stable if it guarantees not to change the relative order of elements + that compare equal --- this is helpful for sorting in multiple passes (for + example, sort by department, then by salary grade). + +(10) + While a list is being sorted, the effect of attempting to mutate, or even + inspect, the list is undefined. The C implementation of Python 2.3 and newer + makes the list appear empty for the duration, and raises :exc:`ValueError` if it + can detect that the list has been mutated during a sort. + + +.. _types-set: + +Set Types --- :class:`set`, :class:`frozenset` +============================================== + +.. index:: object: set + +A :dfn:`set` object is an unordered collection of distinct hashable objects. +Common uses include membership testing, removing duplicates from a sequence, and +computing mathematical operations such as intersection, union, difference, and +symmetric difference. +(For other containers see the built in :class:`dict`, :class:`list`, +and :class:`tuple` classes, and the :mod:`collections` module.) + + +.. versionadded:: 2.4 + +Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in +set``. Being an unordered collection, sets do not record element position or +order of insertion. Accordingly, sets do not support indexing, slicing, or +other sequence-like behavior. 
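A minimal sketch of the behaviour just described, assuming a current Python 3 interpreter (the ``colors`` data is made up for illustration):

```python
colors = set(["red", "green", "blue", "red"])  # the duplicate "red" is dropped

has_red = "red" in colors          # membership test
size = len(colors)                 # number of distinct elements
seen = sorted(c for c in colors)   # iteration works; sorted() fixes the order

# Indexing is not supported, because a set records no element positions.
try:
    colors[0]
    indexing_failed = False
except TypeError:
    indexing_failed = True

print(has_red, size, seen, indexing_failed)
```

Sorting the iterated elements is only done here to get a repeatable result; the iteration order of the set itself should not be relied upon.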
+ +There are currently two builtin set types, :class:`set` and :class:`frozenset`. +The :class:`set` type is mutable --- the contents can be changed using methods +like :meth:`add` and :meth:`remove`. Since it is mutable, it has no hash value +and cannot be used as either a dictionary key or as an element of another set. +The :class:`frozenset` type is immutable and hashable --- its contents cannot be +altered after it is created; it can therefore be used as a dictionary key or as +an element of another set. + +The constructors for both classes work the same: + +.. class:: set([iterable]) + frozenset([iterable]) + + Return a new set or frozenset object whose elements are taken from + *iterable*. The elements of a set must be hashable. To represent sets of + sets, the inner sets must be :class:`frozenset` objects. If *iterable* is + not specified, a new empty set is returned. + +Instances of :class:`set` and :class:`frozenset` provide the following +operations: + +.. describe:: len(s) + + Return the cardinality of set *s*. + +.. describe:: x in s + + Test *x* for membership in *s*. + +.. describe:: x not in s + + Test *x* for non-membership in *s*. + +.. method:: set.issubset(other) + set <= other + + Test whether every element in the set is in *other*. + +.. method:: set.issuperset(other) + set >= other + + Test whether every element in *other* is in the set. + +.. method:: set.union(other) + set | other + + Return a new set with elements from both sets. + +.. method:: set.intersection(other) + set & other + + Return a new set with elements common to both sets. + +.. method:: set.difference(other) + set - other + + Return a new set with elements in the set that are not in *other*. + +.. method:: set.symmetric_difference(other) + set ^ other + + Return a new set with elements in either the set or *other* but not both. + +.. method:: set.copy() + + Return a new set with a shallow copy of *s*. 
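The operations listed above can be tried out as follows. This is a sketch under the assumption of a current Python 3 interpreter; the sets ``a`` and ``b`` are arbitrary examples.

```python
a = set("abc")
b = frozenset("bcd")

union = a | b        # same elements as a.union(b)
common = a & b       # same elements as a.intersection(b)
only_a = a - b       # same elements as a.difference(b)
either = a ^ b       # same elements as a.symmetric_difference(b)

is_subset = set("ab") <= a      # every element of the left set is in a
is_superset = a >= set("ab")    # every element of the right set is in a

# Sets of sets need hashable inner sets, so the inner sets are frozensets.
nested = set([frozenset("ab"), frozenset("cd")])

# copy() returns a new set object holding the same elements.
duplicate = a.copy()

print(sorted(union), sorted(common), sorted(only_a), sorted(either))
```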
+ + +Note, the non-operator versions of :meth:`union`, :meth:`intersection`, +:meth:`difference`, and :meth:`symmetric_difference`, :meth:`issubset`, and +:meth:`issuperset` methods will accept any iterable as an argument. In +contrast, their operator based counterparts require their arguments to be sets. +This precludes error-prone constructions like ``set('abc') & 'cbs'`` in favor of +the more readable ``set('abc').intersection('cbs')``. + +Both :class:`set` and :class:`frozenset` support set to set comparisons. Two +sets are equal if and only if every element of each set is contained in the +other (each is a subset of the other). A set is less than another set if and +only if the first set is a proper subset of the second set (is a subset, but is +not equal). A set is greater than another set if and only if the first set is a +proper superset of the second set (is a superset, but is not equal). + +Instances of :class:`set` are compared to instances of :class:`frozenset` based +on their members. For example, ``set('abc') == frozenset('abc')`` returns +``True``. + +The subset and equality comparisons do not generalize to a complete ordering +function. For example, any two disjoint sets are not equal and are not subsets +of each other, so *all* of the following return ``False``: ``a<b``, ``a==b``, +or ``a>b``. Accordingly, sets do not implement the :meth:`__cmp__` method. + +Since sets only define partial ordering (subset relationships), the output of +the :meth:`list.sort` method is undefined for lists of sets. + +Set elements are like dictionary keys; they need to define both :meth:`__hash__` +and :meth:`__eq__` methods. + +Binary operations that mix :class:`set` instances with :class:`frozenset` return +the type of the first operand. For example: ``frozenset('ab') | set('bc')`` +returns an instance of :class:`frozenset`. + +The following table lists operations available for :class:`set` that do not +apply to immutable instances of :class:`frozenset`: + +.. 
method:: set.update(other) + set |= other + + Update the set, adding elements from *other*. + +.. method:: set.intersection_update(other) + set &= other + + Update the set, keeping only elements found in it and *other*. + +.. method:: set.difference_update(other) + set -= other + + Update the set, removing elements found in *other*. + +.. method:: set.symmetric_difference_update(other) + set ^= other + + Update the set, keeping only elements found in either set, but not in both. + +.. method:: set.add(el) + + Add element *el* to the set. + +.. method:: set.remove(el) + + Remove element *el* from the set. Raises :exc:`KeyError` if *el* is not + contained in the set. + +.. method:: set.discard(el) + + Remove element *el* from the set if it is present. + +.. method:: set.pop() + + Remove and return an arbitrary element from the set. Raises :exc:`KeyError` + if the set is empty. + +.. method:: set.clear() + + Remove all elements from the set. + + +Note, the non-operator versions of the :meth:`update`, +:meth:`intersection_update`, :meth:`difference_update`, and +:meth:`symmetric_difference_update` methods will accept any iterable as an +argument. + + +.. _typesmapping: + +Mapping Types --- :class:`dict` +=============================== + +.. index:: + object: mapping + object: dictionary + triple: operations on; mapping; types + triple: operations on; dictionary; type + statement: del + builtin: len + +A :dfn:`mapping` object maps immutable values to arbitrary objects. Mappings +are mutable objects. There is currently only one standard mapping type, the +:dfn:`dictionary`. +(For other containers see the built in :class:`list`, +:class:`set`, and :class:`tuple` classes, and the :mod:`collections` +module.) + +A dictionary's keys are *almost* arbitrary values. Only +values containing lists, dictionaries or other mutable types (that are compared +by value rather than by object identity) may not be used as keys. 
Numeric types +used for keys obey the normal rules for numeric comparison: if two numbers +compare equal (such as ``1`` and ``1.0``) then they can be used interchangeably +to index the same dictionary entry. (Note however, that since computers +store floating-point numbers as approximations it is usually unwise to +use them as dictionary keys.) + +Dictionaries can be created by placing a comma-separated list of ``key: value`` +pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098: +'jack', 4127: 'sjoerd'}``, or by the :class:`dict` constructor. + +.. class:: dict([arg]) + + Return a new dictionary initialized from an optional positional argument or from + a set of keyword arguments. If no arguments are given, return a new empty + dictionary. If the positional argument *arg* is a mapping object, return a + dictionary mapping the same keys to the same values as does the mapping object. + Otherwise the positional argument must be a sequence, a container that supports + iteration, or an iterator object. The elements of the argument must each also + be of one of those kinds, and each must in turn contain exactly two objects. + The first is used as a key in the new dictionary, and the second as the key's + value. If a given key is seen more than once, the last value associated with it + is retained in the new dictionary. + + If keyword arguments are given, the keywords themselves with their associated + values are added as items to the dictionary. If a key is specified both in the + positional argument and as a keyword argument, the value associated with the + keyword is retained in the dictionary. For example, these all return a + dictionary equal to ``{"one": 2, "two": 3}``: + + * ``dict(one=2, two=3)`` + + * ``dict({'one': 2, 'two': 3})`` + + * ``dict(zip(('one', 'two'), (2, 3)))`` + + * ``dict([['two', 3], ['one', 2]])`` + + The first example only works for keys that are valid Python + identifiers; the others work with any valid keys. + + .. 
versionadded:: 2.2 + + .. versionchanged:: 2.3 + Support for building a dictionary from keyword arguments added. + + +These are the operations that dictionaries support (and therefore, custom mapping +types should support too): + +.. describe:: len(d) + + Return the number of items in the dictionary *d*. + +.. describe:: d[key] + + Return the item of *d* with key *key*. Raises a :exc:`KeyError` if *key* is + not in the map. + + .. versionadded:: 2.5 + If a subclass of dict defines a method :meth:`__missing__`, if the key + *key* is not present, the ``d[key]`` operation calls that method with the + key *key* as argument. The ``d[key]`` operation then returns or raises + whatever is returned or raised by the ``__missing__(key)`` call if the key + is not present. No other operations or methods invoke + :meth:`__missing__`. If :meth:`__missing__` is not defined, + :exc:`KeyError` is raised. :meth:`__missing__` must be a method; it + cannot be an instance variable. For an example, see + :class:`collections.defaultdict`. + +.. describe:: d[key] = value + + Set ``d[key]`` to *value*. + +.. describe:: del d[key] + + Remove ``d[key]`` from *d*. Raises a :exc:`KeyError` if *key* is not in the + map. + +.. describe:: key in d + + Return ``True`` if *d* has a key *key*, else ``False``. + + .. versionadded:: 2.2 + +.. describe:: key not in d + + Equivalent to ``not key in d``. + + .. versionadded:: 2.2 + +.. method:: dict.clear() + + Remove all items from the dictionary. + +.. method:: dict.copy() + + Return a shallow copy of the dictionary. + +.. method:: dict.fromkeys(seq[, value]) + + Create a new dictionary with keys from *seq* and values set to *value*. + + :func:`fromkeys` is a class method that returns a new dictionary. *value* + defaults to ``None``. + + .. versionadded:: 2.3 + +.. method:: dict.get(key[, default]) + + Return the value for *key* if *key* is in the dictionary, else *default*. 
If + *default* is not given, it defaults to ``None``, so that this method never + raises a :exc:`KeyError`. + +.. method:: dict.has_key(key) + + ``d.has_key(key)`` is equivalent to ``key in d``, but deprecated. + +.. method:: dict.items() + + Return a copy of the dictionary's list of ``(key, value)`` pairs. + + .. note:: + + Keys and values are listed in an arbitrary order which is non-random, varies + across Python implementations, and depends on the dictionary's history of + insertions and deletions. If :meth:`items`, :meth:`keys`, :meth:`values`, + :meth:`iteritems`, :meth:`iterkeys`, and :meth:`itervalues` are called with no + intervening modifications to the dictionary, the lists will directly correspond. + This allows the creation of ``(value, key)`` pairs using :func:`zip`: ``pairs = + zip(d.values(), d.keys())``. The same relationship holds for the + :meth:`iterkeys` and :meth:`itervalues` methods: ``pairs = zip(d.itervalues(), + d.iterkeys())`` provides the same value for ``pairs``. Another way to create the + same list is ``pairs = [(v, k) for (k, v) in d.iteritems()]``. + +.. method:: dict.iteritems() + + Return an iterator over the dictionary's ``(key, value)`` pairs. + See the note for :meth:`dict.items`. + + .. versionadded:: 2.2 + +.. method:: dict.iterkeys() + + Return an iterator over the dictionary's keys. See the note for + :meth:`dict.items`. + + .. versionadded:: 2.2 + +.. method:: dict.itervalues() + + Return an iterator over the dictionary's values. See the note for + :meth:`dict.items`. + + .. versionadded:: 2.2 + +.. method:: dict.keys() + + Return a copy of the dictionary's list of keys. See the note for + :meth:`dict.items`. + +.. method:: dict.pop(key[, default]) + + If *key* is in the dictionary, remove it and return its value, else return + *default*. If *default* is not given and *key* is not in the dictionary, a + :exc:`KeyError` is raised. + + .. versionadded:: 2.3 + +.. 
method:: dict.popitem() + + Remove and return an arbitrary ``(key, value)`` pair from the dictionary. + + :func:`popitem` is useful to destructively iterate over a dictionary, as + often used in set algorithms. If the dictionary is empty, calling + :func:`popitem` raises a :exc:`KeyError`. + +.. method:: dict.setdefault(key[, default]) + + If *key* is in the dictionary, return its value. If not, insert *key* with a + value of *default* and return *default*. *default* defaults to ``None``. + +.. method:: dict.update([other]) + + Update the dictionary with the key/value pairs from *other*, overwriting existing + keys. Return ``None``. + + :func:`update` accepts either another dictionary object or an iterable of + key/value pairs (as a tuple or other iterable of length two). If keyword + arguments are specified, the dictionary is then updated with those + key/value pairs: ``d.update(red=1, blue=2)``. + + .. versionchanged:: 2.4 + Allowed the argument to be an iterable of key/value pairs and allowed + keyword arguments. + +.. method:: dict.values() + + Return a copy of the dictionary's list of values. See the note for + :meth:`dict.items`. + + +.. _bltin-file-objects: + +File Objects +============ + +.. index:: + object: file + builtin: file + module: os + module: socket + +File objects are implemented using C's ``stdio`` package and can be +created with the built-in :func:`file` and (more usually) :func:`open` +constructors described in the :ref:`built-in-funcs` section. [#]_ File +objects are also returned by some other built-in functions and methods, +such as :func:`os.popen` and :func:`os.fdopen` and the :meth:`makefile` +method of socket objects. + +When a file operation fails for an I/O-related reason, the exception +:exc:`IOError` is raised. This includes situations where the operation is not +defined for some reason, like :meth:`seek` on a tty device or writing a file +opened for reading. + +Files have the following methods: + + +..
method:: file.close() + + Close the file. A closed file cannot be read or written any more. Any operation + which requires that the file be open will raise a :exc:`ValueError` after the + file has been closed. Calling :meth:`close` more than once is allowed. + + As of Python 2.5, you can avoid having to call this method explicitly if you use + the :keyword:`with` statement. For example, the following code will + automatically close ``f`` when the :keyword:`with` block is exited:: + + from __future__ import with_statement + + with open("hello.txt") as f: + for line in f: + print line + + In older versions of Python, you would have needed to do this to get the same + effect:: + + f = open("hello.txt") + try: + for line in f: + print line + finally: + f.close() + + .. note:: + + Not all "file-like" types in Python support use as a context manager for the + :keyword:`with` statement. If your code is intended to work with any file-like + object, you can use the function :func:`contextlib.closing` instead of using + the object directly. + + +.. method:: file.flush() + + Flush the internal buffer, like ``stdio``'s :cfunc:`fflush`. This may be a + no-op on some file-like objects. + + +.. method:: file.fileno() + + .. index:: + single: file descriptor + single: descriptor, file + module: fcntl + + Return the integer "file descriptor" that is used by the underlying + implementation to request I/O operations from the operating system. This can be + useful for other, lower level interfaces that use file descriptors, such as the + :mod:`fcntl` module or :func:`os.read` and friends. + + .. note:: + + File-like objects which do not have a real file descriptor should *not* provide + this method! + + +.. method:: file.isatty() + + Return ``True`` if the file is connected to a tty(-like) device, else ``False``. + + .. note:: + + If a file-like object is not associated with a real file, this method should + *not* be implemented. + + +.. 
method:: file.__next__() + + A file object is its own iterator, for example ``iter(f)`` returns *f* (unless + *f* is closed). When a file is used as an iterator, typically in a + :keyword:`for` loop (for example, ``for line in f: print line``), the + :meth:`__next__` method is called repeatedly. This method returns the next + input line, or raises :exc:`StopIteration` when EOF is hit when the file is open + for reading (behavior is undefined when the file is open for writing). In order + to make a :keyword:`for` loop the most efficient way of looping over the lines + of a file (a very common operation), the :meth:`__next__` method uses a hidden + read-ahead buffer. As a consequence of using a read-ahead buffer, combining + :meth:`__next__` with other file methods (like :meth:`readline`) does not work + right. However, using :meth:`seek` to reposition the file to an absolute + position will flush the read-ahead buffer. + + .. versionadded:: 2.3 + + +.. method:: file.read([size]) + + Read at most *size* bytes from the file (less if the read hits EOF before + obtaining *size* bytes). If the *size* argument is negative or omitted, read + all data until EOF is reached. The bytes are returned as a string object. An + empty string is returned when EOF is encountered immediately. (For certain + files, like ttys, it makes sense to continue reading after an EOF is hit.) Note + that this method may call the underlying C function :cfunc:`fread` more than + once in an effort to acquire as close to *size* bytes as possible. Also note + that when in non-blocking mode, less data than what was requested may be + returned, even if no *size* parameter was given. + + +.. method:: file.readline([size]) + + Read one entire line from the file. A trailing newline character is kept in the + string (but may be absent when a file ends with an incomplete line). 
[#]_ If + the *size* argument is present and non-negative, it is a maximum byte count + (including the trailing newline) and an incomplete line may be returned. An + empty string is returned *only* when EOF is encountered immediately. + + .. note:: + + Unlike ``stdio``'s :cfunc:`fgets`, the returned string contains null characters + (``'\0'``) if they occurred in the input. + + +.. method:: file.readlines([sizehint]) + + Read until EOF using :meth:`readline` and return a list containing the lines + thus read. If the optional *sizehint* argument is present, instead of + reading up to EOF, whole lines totalling approximately *sizehint* bytes + (possibly after rounding up to an internal buffer size) are read. Objects + implementing a file-like interface may choose to ignore *sizehint* if it + cannot be implemented, or cannot be implemented efficiently. + + +.. method:: file.seek(offset[, whence]) + + Set the file's current position, like ``stdio``'s :cfunc:`fseek`. The *whence* + argument is optional and defaults to ``os.SEEK_SET`` or ``0`` (absolute file + positioning); other values are ``os.SEEK_CUR`` or ``1`` (seek relative to the + current position) and ``os.SEEK_END`` or ``2`` (seek relative to the file's + end). There is no return value. Note that if the file is opened for appending + (mode ``'a'`` or ``'a+'``), any :meth:`seek` operations will be undone at the + next write. If the file is only opened for writing in append mode (mode + ``'a'``), this method is essentially a no-op, but it remains useful for files + opened in append mode with reading enabled (mode ``'a+'``). If the file is + opened in text mode (without ``'b'``), only offsets returned by :meth:`tell` are + legal. Use of other offsets causes undefined behavior. + + Note that not all file objects are seekable. + + .. versionchanged:: 2.6 + Passing float values as offset has been deprecated + + +.. method:: file.tell() + + Return the file's current position, like ``stdio``'s :cfunc:`ftell`. + + .. 
note:: + + On Windows, :meth:`tell` can return illegal values (after an :cfunc:`fgets`) + when reading files with Unix-style line-endings. Use binary mode (``'rb'``) to + circumvent this problem. + + +.. method:: file.truncate([size]) + + Truncate the file's size. If the optional *size* argument is present, the file + is truncated to (at most) that size. The size defaults to the current position. + The current file position is not changed. Note that if a specified size exceeds + the file's current size, the result is platform-dependent: possibilities + include that the file may remain unchanged, increase to the specified size as if + zero-filled, or increase to the specified size with undefined new content. + Availability: Windows, many Unix variants. + + +.. method:: file.write(str) + + Write a string to the file. There is no return value. Due to buffering, the + string may not actually show up in the file until the :meth:`flush` or + :meth:`close` method is called. + + +.. method:: file.writelines(sequence) + + Write a sequence of strings to the file. The sequence can be any iterable + object producing strings, typically a list of strings. There is no return value. + (The name is intended to match :meth:`readlines`; :meth:`writelines` does not + add line separators.) + +Files support the iterator protocol. Each iteration returns the same result as +``file.readline()``, and iteration ends when the :meth:`readline` method returns +an empty string. + +File objects also offer a number of other interesting attributes. These are not +required for file-like objects, but should be implemented if they make sense for +the particular object. + + +.. attribute:: file.closed + + bool indicating the current state of the file object. This is a read-only + attribute; the :meth:`close` method changes the value. It may not be available + on all file-like objects. + + +.. attribute:: file.encoding + + The encoding that this file uses. 
When Unicode strings are written to a file, + they will be converted to byte strings using this encoding. In addition, when + the file is connected to a terminal, the attribute gives the encoding that the + terminal is likely to use (that information might be incorrect if the user has + misconfigured the terminal). The attribute is read-only and may not be present + on all file-like objects. It may also be ``None``, in which case the file uses + the system default encoding for converting Unicode strings. + + .. versionadded:: 2.3 + + +.. attribute:: file.mode + + The I/O mode for the file. If the file was created using the :func:`open` + built-in function, this will be the value of the *mode* parameter. This is a + read-only attribute and may not be present on all file-like objects. + + +.. attribute:: file.name + + If the file object was created using :func:`open`, the name of the file. + Otherwise, some string that indicates the source of the file object, of the + form ``<...>``. This is a read-only attribute and may not be present on all + file-like objects. + + +.. attribute:: file.newlines + + If Python was built with the :option:`--with-universal-newlines` option to + :program:`configure` (the default) this read-only attribute exists, and for + files opened in universal newline read mode it keeps track of the types of + newlines encountered while reading the file. The values it can take are + ``'\r'``, ``'\n'``, ``'\r\n'``, ``None`` (unknown, no newlines read yet) or a + tuple containing all the newline types seen, to indicate that multiple newline + conventions were encountered. For files not opened in universal newline read + mode the value of this attribute will be ``None``. + + +.. attribute:: file.softspace + + Boolean that indicates whether a space character needs to be printed before + another value when using the :keyword:`print` statement. 
Classes that are trying + to simulate a file object should also have a writable :attr:`softspace` + attribute, which should be initialized to zero. This will be automatic for most + classes implemented in Python (care may be needed for objects that override + attribute access); types implemented in C will have to provide a writable + :attr:`softspace` attribute. + + .. note:: + + This attribute is not used to control the :keyword:`print` statement, but to + allow the implementation of :keyword:`print` to keep track of its internal + state. + + +.. _typecontextmanager: + +Context Manager Types +===================== + +.. versionadded:: 2.5 + +.. index:: + single: context manager + single: context management protocol + single: protocol; context management + +Python's :keyword:`with` statement supports the concept of a runtime context +defined by a context manager. This is implemented using two separate methods +that allow user-defined classes to define a runtime context that is entered +before the statement body is executed and exited when the statement ends. + +The :dfn:`context management protocol` consists of a pair of methods that need +to be provided for a context manager object to define a runtime context: + + +.. method:: contextmanager.__enter__() + + Enter the runtime context and return either this object or another object + related to the runtime context. The value returned by this method is bound to + the identifier in the :keyword:`as` clause of :keyword:`with` statements using + this context manager. + + An example of a context manager that returns itself is a file object. File + objects return themselves from __enter__() to allow :func:`open` to be used as + the context expression in a :keyword:`with` statement. + + An example of a context manager that returns a related object is the one + returned by ``decimal.Context.get_manager()``. These managers set the active + decimal context to a copy of the original decimal context and then return the + copy. 
This allows changes to be made to the current decimal context in the body + of the :keyword:`with` statement without affecting code outside the + :keyword:`with` statement. + + +.. method:: contextmanager.__exit__(exc_type, exc_val, exc_tb) + + Exit the runtime context and return a Boolean flag indicating whether any exception + that occurred should be suppressed. If an exception occurred while executing the + body of the :keyword:`with` statement, the arguments contain the exception type, + value and traceback information. Otherwise, all three arguments are ``None``. + + Returning a true value from this method will cause the :keyword:`with` statement + to suppress the exception and continue execution with the statement immediately + following the :keyword:`with` statement. Otherwise the exception continues + propagating after this method has finished executing. Exceptions that occur + during execution of this method will replace any exception that occurred in the + body of the :keyword:`with` statement. + + The exception passed in should never be reraised explicitly; instead, this + method should return a false value to indicate that the method completed + successfully and does not want to suppress the raised exception. This allows + context management code (such as ``contextlib.nested``) to easily detect whether + or not an :meth:`__exit__` method has actually failed. + +Python defines several context managers to support easy thread synchronisation, +prompt closure of files or other objects, and simpler manipulation of the active +decimal arithmetic context. The specific types are not treated specially beyond +their implementation of the context management protocol. See the +:mod:`contextlib` module for some examples. + +Python's generators and the ``contextlib.contextfactory`` decorator provide a +convenient way to implement these protocols.
If a generator function is +decorated with the ``contextlib.contextfactory`` decorator, it will return a +context manager implementing the necessary :meth:`__enter__` and +:meth:`__exit__` methods, rather than the iterator produced by an undecorated +generator function. + +Note that there is no specific slot for any of these methods in the type +structure for Python objects in the Python/C API. Extension types wanting to +define these methods must provide them as a normal Python accessible method. +Compared to the overhead of setting up the runtime context, the overhead of a +single class dictionary lookup is negligible. + + +.. _typesother: + +Other Built-in Types +==================== + +The interpreter supports several other kinds of objects. Most of these support +only one or two operations. + + +.. _typesmodules: + +Modules +------- + +The only special operation on a module is attribute access: ``m.name``, where +*m* is a module and *name* accesses a name defined in *m*'s symbol table. +Module attributes can be assigned to. (Note that the :keyword:`import` +statement is not, strictly speaking, an operation on a module object; ``import +foo`` does not require a module object named *foo* to exist, rather it requires +an (external) *definition* for a module named *foo* somewhere.) + +A special member of every module is :attr:`__dict__`. This is the dictionary +containing the module's symbol table. Modifying this dictionary will actually +change the module's symbol table, but direct assignment to the :attr:`__dict__` +attribute is not possible (you can write ``m.__dict__['a'] = 1``, which defines +``m.a`` to be ``1``, but you can't write ``m.__dict__ = {}``). Modifying +:attr:`__dict__` directly is not recommended. + +Modules built into the interpreter are written like this: ``<module 'sys' +(built-in)>``. If loaded from a file, they are written as ``<module 'os' from +'/usr/local/lib/pythonX.Y/os.pyc'>``. + + +.. 
_typesobjects: + +Classes and Class Instances +--------------------------- + +See :ref:`objects` and :ref:`class` for these. + + +.. _typesfunctions: + +Functions +--------- + +Function objects are created by function definitions. The only operation on a +function object is to call it: ``func(argument-list)``. + +There are really two flavors of function objects: built-in functions and +user-defined functions. Both support the same operation (to call the function), +but the implementation is different, hence the different object types. + +See :ref:`function` for more information. + + +.. _typesmethods: + +Methods +------- + +.. index:: object: method + +Methods are functions that are called using the attribute notation. There are +two flavors: built-in methods (such as :meth:`append` on lists) and class +instance methods. Built-in methods are described with the types that support +them. + +The implementation adds two special read-only attributes to class instance +methods: ``m.im_self`` is the object on which the method operates, and +``m.im_func`` is the function implementing the method. Calling ``m(arg-1, +arg-2, ..., arg-n)`` is completely equivalent to calling ``m.im_func(m.im_self, +arg-1, arg-2, ..., arg-n)``. + +Class instance methods are either *bound* or *unbound*, referring to whether the +method was accessed through an instance or a class, respectively. When a method +is unbound, its ``im_self`` attribute will be ``None`` and if called, an +explicit ``self`` object must be passed as the first argument. In this case, +``self`` must be an instance of the unbound method's class (or a subclass of +that class), otherwise a :exc:`TypeError` is raised. + +Like function objects, method objects support getting arbitrary attributes. +However, since method attributes are actually stored on the underlying function +object (``meth.im_func``), setting method attributes on either bound or unbound +methods is disallowed.
Attempting to set a method attribute results in a +:exc:`TypeError` being raised. In order to set a method attribute, you need to +explicitly set it on the underlying function object:: + + class C: + def method(self): + pass + + c = C() + c.method.im_func.whoami = 'my name is c' + +See :ref:`types` for more information. + + +.. _bltin-code-objects: + +Code Objects +------------ + +.. index:: object: code + +.. index:: + builtin: compile + single: __code__ (function object attribute) + +Code objects are used by the implementation to represent "pseudo-compiled" +executable Python code such as a function body. They differ from function +objects because they don't contain a reference to their global execution +environment. Code objects are returned by the built-in :func:`compile` function +and can be extracted from function objects through their :attr:`__code__` +attribute. See also the :mod:`code` module. + +.. index:: + builtin: exec + builtin: eval + +A code object can be executed or evaluated by passing it (instead of a source +string) to the :func:`exec` or :func:`eval` built-in functions. + +See :ref:`types` for more information. + + +.. _bltin-type-objects: + +Type Objects +------------ + +.. index:: + builtin: type + module: types + +Type objects represent the various object types. An object's type is accessed +by the built-in function :func:`type`. There are no special operations on +types. The standard module :mod:`types` defines names for all standard built-in +types. + +Types are written like this: ``<type 'int'>``. + + +.. _bltin-null-object: + +The Null Object +--------------- + +This object is returned by functions that don't explicitly return a value. It +supports no special operations. There is exactly one null object, named +``None`` (a built-in name). + +It is written as ``None``. + + +.. _bltin-ellipsis-object: + +The Ellipsis Object +------------------- + +This object is mostly used by extended slice notation (see :ref:`slicings`). 
It +supports no special operations. There is exactly one ellipsis object, named +:const:`Ellipsis` (a built-in name). + +It is written as ``Ellipsis`` or ``...``. + + +Boolean Values +-------------- + +Boolean values are the two constant objects ``False`` and ``True``. They are +used to represent truth values (although other values can also be considered +false or true). In numeric contexts (for example when used as the argument to +an arithmetic operator), they behave like the integers 0 and 1, respectively. +The built-in function :func:`bool` can be used to cast any value to a Boolean, +if the value can be interpreted as a truth value (see section Truth Value +Testing above). + +.. index:: + single: False + single: True + pair: Boolean; values + +They are written as ``False`` and ``True``, respectively. + + +.. _typesinternal: + +Internal Objects +---------------- + +See :ref:`types` for this information. It describes stack frame objects, +traceback objects, and slice objects. + + +.. _specialattrs: + +Special Attributes +================== + +The implementation adds a few special read-only attributes to several object +types, where they are relevant. Some of these are not reported by the +:func:`dir` built-in function. + + +.. attribute:: object.__dict__ + + A dictionary or other mapping object used to store an object's (writable) + attributes. + + +.. attribute:: instance.__class__ + + The class to which a class instance belongs. + + +.. attribute:: class.__bases__ + + The tuple of base classes of a class object. If there are no base classes, this + will be an empty tuple. + + +.. attribute:: class.__name__ + + The name of the class or type. + +.. rubric:: Footnotes + +.. [#] Additional information on these special methods may be found in the Python + Reference Manual (:ref:`customization`). + +.. [#] As a consequence, the list ``[1, 2]`` is considered equal to ``[1.0, 2.0]``, and + similarly for tuples. + +.. 
[#] They must have since the parser can't tell the type of the operands. + +.. [#] To format only a tuple you should therefore provide a singleton tuple whose only + element is the tuple to be formatted. + +.. [#] These numbers are fairly arbitrary. They are intended to avoid printing endless + strings of meaningless digits without hampering correct use and without having + to know the exact precision of floating point values on a particular machine. + +.. [#] :func:`file` is new in Python 2.2. The older built-in :func:`open` is an alias + for :func:`file`. + +.. [#] The advantage of leaving the newline on is that returning an empty string is + then an unambiguous EOF indication. It is also possible (in cases where it + might matter, for example, if you want to make an exact copy of a file while + scanning its lines) to tell whether the last line of a file ended in a newline + or not (yes this happens!). diff --git a/Doc/library/string.rst b/Doc/library/string.rst new file mode 100644 index 0000000..aa2494b --- /dev/null +++ b/Doc/library/string.rst @@ -0,0 +1,468 @@ + +:mod:`string` --- Common string operations +========================================== + +.. module:: string + :synopsis: Common string operations. + + +.. index:: module: re + +The :mod:`string` module contains a number of useful constants and +classes, as well as some deprecated legacy functions that are also +available as methods on strings. In addition, Python's built-in string +classes support the sequence type methods described in the +:ref:`typesseq` section, and also the string-specific methods described +in the :ref:`string-methods` section. To output formatted strings use +template strings or the ``%`` operator described in the +:ref:`string-formatting` section. Also, see the :mod:`re` module for +string functions based on regular expressions. + + +String constants +---------------- + +The constants defined in this module are: + + +.. 
data:: ascii_letters + + The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase` + constants described below. This value is not locale-dependent. + + +.. data:: ascii_lowercase + + The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``. This value is not + locale-dependent and will not change. + + +.. data:: ascii_uppercase + + The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``. This value is not + locale-dependent and will not change. + + +.. data:: digits + + The string ``'0123456789'``. + + +.. data:: hexdigits + + The string ``'0123456789abcdefABCDEF'``. + + +.. data:: octdigits + + The string ``'01234567'``. + + +.. data:: punctuation + + String of ASCII characters which are considered punctuation characters + in the ``C`` locale. + + +.. data:: printable + + String of ASCII characters which are considered printable. This is a + combination of :const:`digits`, :const:`ascii_letters`, :const:`punctuation`, + and :const:`whitespace`. + + +.. data:: whitespace + + A string containing all characters that are considered whitespace. + This includes the characters space, tab, linefeed, return, formfeed, and + vertical tab. + + +Template strings +---------------- + +Templates provide simpler string substitutions as described in :pep:`292`. +Instead of the normal ``%``\ -based substitutions, Templates support ``$``\ +-based substitutions, using the following rules: + +* ``$$`` is an escape; it is replaced with a single ``$``. + +* ``$identifier`` names a substitution placeholder matching a mapping key of + ``"identifier"``. By default, ``"identifier"`` must spell a Python + identifier. The first non-identifier character after the ``$`` character + terminates this placeholder specification. + +* ``${identifier}`` is equivalent to ``$identifier``. It is required when valid + identifier characters follow the placeholder but are not part of the + placeholder, such as ``"${noun}ification"``. 
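The placeholder rules above can be exercised with a short script. This is only a sketch, assuming a Python 3 interpreter; the template text and the keys ``noun`` and ``price`` are invented for illustration and are not part of the module.

```python
from string import Template

# "$$" is an escape for a literal dollar sign; "$noun" and "${noun}"
# both name the placeholder "noun" (hypothetical key for this sketch).
t = Template("$noun costs $$$price; avoid ${noun}ification")
result = t.substitute(noun="simpl", price=5)
print(result)  # -> simpl costs $5; avoid simplification

# Without braces, every identifier character after "$" is taken as part
# of the placeholder name, so the lookup key becomes "nounification".
try:
    Template("$nounification").substitute(noun="simpl")
except KeyError as exc:
    missing = exc.args[0]
print(missing)
```

The second template shows why the braced form exists: without it, the whole run of identifier characters is read as the placeholder name.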
+ +Any other appearance of ``$`` in the string will result in a :exc:`ValueError` +being raised. + +.. versionadded:: 2.4 + +The :mod:`string` module provides a :class:`Template` class that implements +these rules. The methods of :class:`Template` are: + + +.. class:: Template(template) + + The constructor takes a single argument which is the template string. + + +.. method:: Template.substitute(mapping[, **kws]) + + Performs the template substitution, returning a new string. *mapping* is any + dictionary-like object with keys that match the placeholders in the template. + Alternatively, you can provide keyword arguments, where the keywords are the + placeholders. When both *mapping* and *kws* are given and there are duplicates, + the placeholders from *kws* take precedence. + + +.. method:: Template.safe_substitute(mapping[, **kws]) + + Like :meth:`substitute`, except that if placeholders are missing from *mapping* + and *kws*, instead of raising a :exc:`KeyError` exception, the original + placeholder will appear in the resulting string intact. Also, unlike with + :meth:`substitute`, any other appearance of ``$`` is simply left in the result + as ``$`` instead of raising :exc:`ValueError`. + + While other exceptions may still occur, this method is called "safe" because + it always tries to return a usable string instead of raising an + exception. In another sense, :meth:`safe_substitute` may be anything other than + safe, since it will silently ignore malformed templates containing dangling + delimiters, unmatched braces, or placeholders that are not valid Python + identifiers. + +:class:`Template` instances also provide one public data attribute: + + +.. attribute:: Template.template + + This is the object passed to the constructor's *template* argument. In general, + you shouldn't change it, but read-only access is not enforced.
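The interplay between *mapping* and *kws*, and the difference between the two substitution methods, can be sketched as follows. The example assumes a Python 3 interpreter, and the keys ``who`` and ``amount`` are made up for illustration.

```python
from string import Template

t = Template("$who owes $amount")

# When a key appears in both the mapping and the keyword arguments,
# the keyword argument wins.
mapping = {"who": "tim", "amount": "10"}
overridden = t.substitute(mapping, who="guido")

# safe_substitute() leaves unknown placeholders and a dangling "$"
# untouched, where substitute() would raise KeyError or ValueError.
partial = Template("$who owes $amount, about $5").safe_substitute(who="tim")

# The string passed to the constructor stays available as an attribute.
source = t.template

print(overridden)
print(partial)
print(source)
```

Because :meth:`safe_substitute` hides malformed input, it fits best where a partially filled template is an acceptable result.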
+ +Here is an example of how to use a Template:: + + >>> from string import Template + >>> s = Template('$who likes $what') + >>> s.substitute(who='tim', what='kung pao') + 'tim likes kung pao' + >>> d = dict(who='tim') + >>> Template('Give $who $100').substitute(d) + Traceback (most recent call last): + [...] + ValueError: Invalid placeholder in string: line 1, col 10 + >>> Template('$who likes $what').substitute(d) + Traceback (most recent call last): + [...] + KeyError: 'what' + >>> Template('$who likes $what').safe_substitute(d) + 'tim likes $what' + +Advanced usage: you can derive subclasses of :class:`Template` to customize the +placeholder syntax, delimiter character, or the entire regular expression used +to parse template strings. To do this, you can override these class attributes: + +* *delimiter* -- This is the literal string describing a placeholder-introducing + delimiter. The default value is ``$``. Note that this should *not* be a regular + expression, as the implementation will call :func:`re.escape` on this string as + needed. + +* *idpattern* -- This is the regular expression describing the pattern for + non-braced placeholders (the braces will be added automatically as + appropriate). The default value is the regular expression + ``[_a-z][_a-z0-9]*``. + +Alternatively, you can provide the entire regular expression pattern by +overriding the class attribute *pattern*. If you do this, the value must be a +regular expression object with four named capturing groups. The capturing +groups correspond to the rules given above, along with the invalid placeholder +rule: + +* *escaped* -- This group matches the escape sequence, e.g. ``$$``, in the + default pattern. + +* *named* -- This group matches the unbraced placeholder name; it should not + include the delimiter in the capturing group. + +* *braced* -- This group matches the brace-enclosed placeholder name; it should + not include either the delimiter or braces in the capturing group.
+ +* *invalid* -- This group matches any other delimiter pattern (usually a single + delimiter), and it should appear last in the regular expression. + + +String functions +---------------- + +The following functions are available to operate on string and Unicode objects. +They are not available as string methods. + + +.. function:: capwords(s) + + Split the argument into words using :func:`split`, capitalize each word using + :func:`capitalize`, and join the capitalized words using :func:`join`. Note + that this replaces runs of whitespace characters by a single space, and removes + leading and trailing whitespace. + + +.. function:: maketrans(from, to) + + Return a translation table suitable for passing to :func:`translate`, that will + map each character in *from* into the character at the same position in *to*; + *from* and *to* must have the same length. + + .. warning:: + + Don't use strings derived from :const:`lowercase` and :const:`uppercase` as + arguments; in some locales, these don't have the same length. For case + conversions, always use :func:`lower` and :func:`upper`. + + +Deprecated string functions +--------------------------- + +The following functions are also defined as methods of string and +Unicode objects; see section :ref:`string-methods` for more information on +those. You should consider these functions as deprecated, although they will +not be removed until Python 3.0. The functions defined in this module are: + + +.. function:: atof(s) + + .. deprecated:: 2.0 + Use the :func:`float` built-in function. + + .. index:: builtin: float + + Convert a string to a floating point number. The string must have the standard + syntax for a floating point literal in Python, optionally preceded by a sign + (``+`` or ``-``). Note that this behaves identically to the built-in function + :func:`float` when passed a string. + + .. note:: + + ..
index:: + single: NaN + single: Infinity + + When passing in a string, values for NaN and Infinity may be returned, depending + on the underlying C library. The specific set of strings accepted which cause + these values to be returned depends entirely on the C library and is known to + vary. + + +.. function:: atoi(s[, base]) + + .. deprecated:: 2.0 + Use the :func:`int` built-in function. + + .. index:: builtin: eval + + Convert string *s* to an integer in the given *base*. The string must consist + of one or more digits, optionally preceded by a sign (``+`` or ``-``). The + *base* defaults to 10. If it is 0, a default base is chosen depending on the + leading characters of the string (after stripping the sign): ``0x`` or ``0X`` + means 16, ``0`` means 8, anything else means 10. If *base* is 16, a leading + ``0x`` or ``0X`` is always accepted, though not required. This behaves + identically to the built-in function :func:`int` when passed a string. (Also + note: for a more flexible interpretation of numeric literals, use the built-in + function :func:`eval`.) + + +.. function:: atol(s[, base]) + + .. deprecated:: 2.0 + Use the :func:`long` built-in function. + + .. index:: builtin: long + + Convert string *s* to a long integer in the given *base*. The string must + consist of one or more digits, optionally preceded by a sign (``+`` or ``-``). + The *base* argument has the same meaning as for :func:`atoi`. A trailing ``l`` + or ``L`` is not allowed, except if the base is 0. Note that when invoked + without *base* or with *base* set to 10, this behaves identically to the built-in + function :func:`long` when passed a string. + + +.. function:: capitalize(word) + + Return a copy of *word* with only its first character capitalized. + + +.. function:: expandtabs(s[, tabsize]) + + Expand tabs in a string replacing them by one or more spaces, depending on the + current column and the given tab size.
The column number is reset to zero after + each newline occurring in the string. This doesn't understand other non-printing + characters or escape sequences. The tab size defaults to 8. + + +.. function:: find(s, sub[, start[,end]]) + + Return the lowest index in *s* where the substring *sub* is found such that + *sub* is wholly contained in ``s[start:end]``. Return ``-1`` on failure. + Defaults for *start* and *end* and interpretation of negative values is the same + as for slices. + + +.. function:: rfind(s, sub[, start[, end]]) + + Like :func:`find` but find the highest index. + + +.. function:: index(s, sub[, start[, end]]) + + Like :func:`find` but raise :exc:`ValueError` when the substring is not found. + + +.. function:: rindex(s, sub[, start[, end]]) + + Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found. + + +.. function:: count(s, sub[, start[, end]]) + + Return the number of (non-overlapping) occurrences of substring *sub* in string + ``s[start:end]``. Defaults for *start* and *end* and interpretation of negative + values are the same as for slices. + + +.. function:: lower(s) + + Return a copy of *s*, but with upper case letters converted to lower case. + + +.. function:: split(s[, sep[, maxsplit]]) + + Return a list of the words of the string *s*. If the optional second argument + *sep* is absent or ``None``, the words are separated by arbitrary strings of + whitespace characters (space, tab, newline, return, formfeed). If the second + argument *sep* is present and not ``None``, it specifies a string to be used as + the word separator. The returned list will then have one more item than the + number of non-overlapping occurrences of the separator in the string. The + optional third argument *maxsplit* defaults to 0. If it is nonzero, at most + *maxsplit* number of splits occur, and the remainder of the string is returned + as the final element of the list (thus, the list will have at most + ``maxsplit+1`` elements). 
+ + The behavior of split on an empty string depends on the value of *sep*. If *sep* + is not specified, or specified as ``None``, the result will be an empty list. + If *sep* is specified as any string, the result will be a list containing one + element which is an empty string. + + +.. function:: rsplit(s[, sep[, maxsplit]]) + + Return a list of the words of the string *s*, scanning *s* from the end. To all + intents and purposes, the resulting list of words is the same as returned by + :func:`split`, except when the optional third argument *maxsplit* is explicitly + specified and nonzero. When *maxsplit* is nonzero, at most *maxsplit* number of + splits -- the *rightmost* ones -- occur, and the remainder of the string is + returned as the first element of the list (thus, the list will have at most + ``maxsplit+1`` elements). + + .. versionadded:: 2.4 + + +.. function:: splitfields(s[, sep[, maxsplit]]) + + This function behaves identically to :func:`split`. (In the past, :func:`split` + was only used with one argument, while :func:`splitfields` was only used with + two arguments.) + + +.. function:: join(words[, sep]) + + Concatenate a list or tuple of words with intervening occurrences of *sep*. + The default value for *sep* is a single space character. It is always true that + ``string.join(string.split(s, sep), sep)`` equals *s*. + + +.. function:: joinfields(words[, sep]) + + This function behaves identically to :func:`join`. (In the past, :func:`join` + was only used with one argument, while :func:`joinfields` was only used with two + arguments.) Note that there is no :meth:`joinfields` method on string objects; + use the :meth:`join` method instead. + + +.. function:: lstrip(s[, chars]) + + Return a copy of the string with leading characters removed. If *chars* is + omitted or ``None``, whitespace characters are removed. 
If given and not + ``None``, *chars* must be a string; the characters in the string will be + stripped from the beginning of *s*. + + .. versionchanged:: 2.2.3 + The *chars* parameter was added. The *chars* parameter cannot be passed in + earlier 2.2 versions. + + +.. function:: rstrip(s[, chars]) + + Return a copy of the string with trailing characters removed. If *chars* is + omitted or ``None``, whitespace characters are removed. If given and not + ``None``, *chars* must be a string; the characters in the string will be + stripped from the end of *s*. + + .. versionchanged:: 2.2.3 + The *chars* parameter was added. The *chars* parameter cannot be passed in + earlier 2.2 versions. + + +.. function:: strip(s[, chars]) + + Return a copy of the string with leading and trailing characters removed. If + *chars* is omitted or ``None``, whitespace characters are removed. If given and + not ``None``, *chars* must be a string; the characters in the string will be + stripped from both ends of *s*. + + .. versionchanged:: 2.2.3 + The *chars* parameter was added. The *chars* parameter cannot be passed in + earlier 2.2 versions. + + +.. function:: swapcase(s) + + Return a copy of *s*, but with lower case letters converted to upper case and + vice versa. + + +.. function:: translate(s, table[, deletechars]) + + Delete all characters from *s* that are in *deletechars* (if present), and then + translate the characters using *table*, which must be a 256-character string + giving the translation for each character value, indexed by its ordinal. If + *table* is ``None``, then only the character deletion step is performed. + + +.. function:: upper(s) + + Return a copy of *s*, but with lower case letters converted to upper case. + + +..
function:: ljust(s, width) + rjust(s, width) + center(s, width) + + These functions respectively left-justify, right-justify and center a string in + a field of given width. They return a string that is at least *width* + characters wide, created by padding the string *s* with spaces until the given + width on the right, left or both sides. The string is never truncated. + + +.. function:: zfill(s, width) + + Pad a numeric string on the left with zero digits until the given width is + reached. Strings starting with a sign are handled correctly. + + +.. function:: replace(str, old, new[, maxreplace]) + + Return a copy of string *str* with all occurrences of substring *old* replaced + by *new*. If the optional argument *maxreplace* is given, the first + *maxreplace* occurrences are replaced. + diff --git a/Doc/library/stringio.rst b/Doc/library/stringio.rst new file mode 100644 index 0000000..9e2f0da --- /dev/null +++ b/Doc/library/stringio.rst @@ -0,0 +1,122 @@ + +:mod:`StringIO` --- Read and write strings as files +=================================================== + +.. module:: StringIO + :synopsis: Read and write strings as if they were files. + + +This module implements a file-like class, :class:`StringIO`, that reads and +writes a string buffer (also known as *memory files*). See the description of +file objects for operations (section :ref:`bltin-file-objects`). + + +.. class:: StringIO([buffer]) + + When a :class:`StringIO` object is created, it can be initialized to an existing + string by passing the string to the constructor. If no string is given, the + :class:`StringIO` will start empty. In both cases, the initial file position + starts at zero. + + The :class:`StringIO` object can accept either Unicode or 8-bit strings, but + mixing the two may take some care. If both are used, 8-bit strings that cannot + be interpreted as 7-bit ASCII (that use the 8th bit) will cause a + :exc:`UnicodeError` to be raised when :meth:`getvalue` is called. 
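The remark about the initial file position matters when a buffer is created from an existing string. A minimal sketch follows, with one loud assumption: it uses ``io.StringIO`` from Python 3 as a stand-in for the ``StringIO.StringIO`` class described here, since the :mod:`StringIO` module itself is not importable on Python 3.

```python
import io

# Assumption: io.StringIO (Python 3) stands in for StringIO.StringIO.
empty = io.StringIO()
preloaded = io.StringIO("existing text")

# Both objects start with the file position at zero.
start_positions = (empty.tell(), preloaded.tell())

# Because the position starts at zero, a write overwrites the initial
# contents from the beginning instead of appending to them.
preloaded.write("EXISTING")
overwritten = preloaded.getvalue()

print(start_positions)
print(overwritten)
```

To append to the initial contents instead, seek to the end of the buffer before writing.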
+ +The following methods of :class:`StringIO` objects require special mention: + + +.. method:: StringIO.getvalue() + + Retrieve the entire contents of the "file" at any time before the + :class:`StringIO` object's :meth:`close` method is called. See the note above + for information about mixing Unicode and 8-bit strings; such mixing can cause + this method to raise :exc:`UnicodeError`. + + +.. method:: StringIO.close() + + Free the memory buffer. + +Example usage:: + + import StringIO + + output = StringIO.StringIO() + output.write('First line.\n') + print >>output, 'Second line.' + + # Retrieve file contents -- this will be + # 'First line.\nSecond line.\n' + contents = output.getvalue() + + # Close object and discard memory buffer -- + # .getvalue() will now raise an exception. + output.close() + + +:mod:`cStringIO` --- Faster version of :mod:`StringIO` +====================================================== + +.. module:: cStringIO + :synopsis: Faster version of StringIO, but not subclassable. +.. moduleauthor:: Jim Fulton <jim@zope.com> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +The module :mod:`cStringIO` provides an interface similar to that of the +:mod:`StringIO` module. Heavy use of :class:`StringIO.StringIO` objects can be +made more efficient by using the function :func:`StringIO` from this module +instead. + +Since this module provides a factory function which returns objects of built-in +types, there's no way to build your own version using subclassing. Use the +original :mod:`StringIO` module in that case. + +Unlike the memory files implemented by the :mod:`StringIO` module, those +provided by this module are not able to accept Unicode strings that cannot be +encoded as plain ASCII strings. + +Calling :func:`StringIO` with a Unicode string parameter populates +the object with the buffer representation of the Unicode string, instead of +encoding the string. 
+ +Another difference from the :mod:`StringIO` module is that calling +:func:`StringIO` with a string parameter creates a read-only object. Unlike an +object created without a string parameter, it does not have write methods. +These objects are not generally visible. They turn up in tracebacks as +:class:`StringI` and :class:`StringO`. + +The following data objects are provided as well: + + +.. data:: InputType + + The type object of the objects created by calling :func:`StringIO` with a string + parameter. + + +.. data:: OutputType + + The type object of the objects returned by calling :func:`StringIO` with no + parameters. + +There is a C API to the module as well; refer to the module source for more +information. + +Example usage:: + + import cStringIO + + output = cStringIO.StringIO() + output.write('First line.\n') + print >>output, 'Second line.' + + # Retrieve file contents -- this will be + # 'First line.\nSecond line.\n' + contents = output.getvalue() + + # Close object and discard memory buffer -- + # .getvalue() will now raise an exception. + output.close() + diff --git a/Doc/library/stringprep.rst b/Doc/library/stringprep.rst new file mode 100644 index 0000000..b0944e4 --- /dev/null +++ b/Doc/library/stringprep.rst @@ -0,0 +1,142 @@ + +:mod:`stringprep` --- Internet String Preparation +================================================= + +.. module:: stringprep + :synopsis: String preparation, as per RFC 3454 +.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. versionadded:: 2.3 + +When identifying things (such as host names) on the Internet, it is often +necessary to compare such identifications for "equality". Exactly how this +comparison is executed may depend on the application domain, e.g. whether it +should be case-insensitive or not. It may also be necessary to restrict the +possible identifications, to allow only identifications consisting of +"printable" characters.
+ +:rfc:`3454` defines a procedure for "preparing" Unicode strings in internet +protocols. Before passing strings onto the wire, they are processed with the +preparation procedure, after which they have a certain normalized form. The RFC +defines a set of tables, which can be combined into profiles. Each profile must +define which tables it uses, and what other optional parts of the ``stringprep`` +procedure are part of the profile. One example of a ``stringprep`` profile is +``nameprep``, which is used for internationalized domain names. + +The module :mod:`stringprep` only exposes the tables from RFC 3454. As these +tables would be very large to represent them as dictionaries or lists, the +module uses the Unicode character database internally. The module source code +itself was generated using the ``mkstringprep.py`` utility. + +As a result, these tables are exposed as functions, not as data structures. +There are two kinds of tables in the RFC: sets and mappings. For a set, +:mod:`stringprep` provides the "characteristic function", i.e. a function that +returns true if the parameter is part of the set. For mappings, it provides the +mapping function: given the key, it returns the associated value. Below is a +list of all functions available in the module. + + +.. function:: in_table_a1(code) + + Determine whether *code* is in tableA.1 (Unassigned code points in Unicode 3.2). + + +.. function:: in_table_b1(code) + + Determine whether *code* is in tableB.1 (Commonly mapped to nothing). + + +.. function:: map_table_b2(code) + + Return the mapped value for *code* according to tableB.2 (Mapping for + case-folding used with NFKC). + + +.. function:: map_table_b3(code) + + Return the mapped value for *code* according to tableB.3 (Mapping for + case-folding used with no normalization). + + +.. function:: in_table_c11(code) + + Determine whether *code* is in tableC.1.1 (ASCII space characters). + + +.. 
function:: in_table_c12(code) + + Determine whether *code* is in table C.1.2 (Non-ASCII space characters). + + +.. function:: in_table_c11_c12(code) + + Determine whether *code* is in table C.1 (Space characters, union of C.1.1 and + C.1.2). + + +.. function:: in_table_c21(code) + + Determine whether *code* is in table C.2.1 (ASCII control characters). + + +.. function:: in_table_c22(code) + + Determine whether *code* is in table C.2.2 (Non-ASCII control characters). + + +.. function:: in_table_c21_c22(code) + + Determine whether *code* is in table C.2 (Control characters, union of C.2.1 and + C.2.2). + + +.. function:: in_table_c3(code) + + Determine whether *code* is in table C.3 (Private use). + + +.. function:: in_table_c4(code) + + Determine whether *code* is in table C.4 (Non-character code points). + + +.. function:: in_table_c5(code) + + Determine whether *code* is in table C.5 (Surrogate codes). + + +.. function:: in_table_c6(code) + + Determine whether *code* is in table C.6 (Inappropriate for plain text). + + +.. function:: in_table_c7(code) + + Determine whether *code* is in table C.7 (Inappropriate for canonical + representation). + + +.. function:: in_table_c8(code) + + Determine whether *code* is in table C.8 (Change display properties or are + deprecated). + + +.. function:: in_table_c9(code) + + Determine whether *code* is in table C.9 (Tagging characters). + + +.. function:: in_table_d1(code) + + Determine whether *code* is in table D.1 (Characters with bidirectional property + "R" or "AL"). + + +.. function:: in_table_d2(code) + + Determine whether *code* is in table D.2 (Characters with bidirectional property + "L"). + diff --git a/Doc/library/strings.rst b/Doc/library/strings.rst new file mode 100644 index 0000000..5c8ec4b --- /dev/null +++ b/Doc/library/strings.rst @@ -0,0 +1,31 @@ + +.. _stringservices: + +*************** +String Services +*************** + +The modules described in this chapter provide a wide range of string +manipulation operations. 
+ +In addition, Python's built-in string classes support the sequence type +methods described in the :ref:`typesseq` section, and also the +string-specific methods described in the :ref:`string-methods` section. +To output formatted strings use template strings or the ``%`` operator +described in the :ref:`string-formatting` section. Also, see the +:mod:`re` module for string functions based on regular expressions. + + +.. toctree:: + + string.rst + re.rst + struct.rst + difflib.rst + stringio.rst + textwrap.rst + codecs.rst + unicodedata.rst + stringprep.rst + fpformat.rst + diff --git a/Doc/library/struct.rst b/Doc/library/struct.rst new file mode 100644 index 0000000..2f27d13 --- /dev/null +++ b/Doc/library/struct.rst @@ -0,0 +1,292 @@ + +:mod:`struct` --- Interpret strings as packed binary data +========================================================= + +.. module:: struct + :synopsis: Interpret strings as packed binary data. + +.. index:: + pair: C; structures + triple: packing; binary; data + +This module performs conversions between Python values and C structs represented +as Python strings. It uses :dfn:`format strings` (explained below) as compact +descriptions of the lay-out of the C structs and the intended conversion to/from +Python values. This can be used in handling binary data stored in files or from +network connections, among other sources. + +The module defines the following exception and functions: + + +.. exception:: error + + Exception raised on various occasions; argument is a string describing what is + wrong. + + +.. function:: pack(fmt, v1, v2, ...) + + Return a string containing the values ``v1, v2, ...`` packed according to the + given format. The arguments must match the values required by the format + exactly. + + +.. function:: pack_into(fmt, buffer, offset, v1, v2, ...) + + Pack the values ``v1, v2, ...`` according to the given format, write the packed + bytes into the writable *buffer* starting at *offset*. 
Note that the offset is + a required argument. + + .. versionadded:: 2.5 + + +.. function:: unpack(fmt, string) + + Unpack the string (presumably packed by ``pack(fmt, ...)``) according to the + given format. The result is a tuple even if it contains exactly one item. The + string must contain exactly the amount of data required by the format + (``len(string)`` must equal ``calcsize(fmt)``). + + +.. function:: unpack_from(fmt, buffer[, offset=0]) + + Unpack the *buffer* according to the given format. The result is a tuple even + if it contains exactly one item. The *buffer* must contain at least the amount + of data required by the format (``len(buffer[offset:])`` must be at least + ``calcsize(fmt)``). + + .. versionadded:: 2.5 + + +.. function:: calcsize(fmt) + + Return the size of the struct (and hence of the string) corresponding to the + given format. + +Format characters have the following meaning; the conversion between C and +Python values should be obvious given their types: + ++--------+-------------------------+--------------------+-------+ +| Format | C Type | Python | Notes | ++========+=========================+====================+=======+ +| ``x`` | pad byte | no value | | ++--------+-------------------------+--------------------+-------+ +| ``c`` | :ctype:`char` | string of length 1 | | ++--------+-------------------------+--------------------+-------+ +| ``b`` | :ctype:`signed char` | integer | | ++--------+-------------------------+--------------------+-------+ +| ``B`` | :ctype:`unsigned char` | integer | | ++--------+-------------------------+--------------------+-------+ +| ``t`` | :ctype:`_Bool` | bool | \(1) | ++--------+-------------------------+--------------------+-------+ +| ``h`` | :ctype:`short` | integer | | ++--------+-------------------------+--------------------+-------+ +| ``H`` | :ctype:`unsigned short` | integer | | ++--------+-------------------------+--------------------+-------+ +| ``i`` | :ctype:`int` | integer | | 
++--------+-------------------------+--------------------+-------+ +| ``I`` | :ctype:`unsigned int` | long | | ++--------+-------------------------+--------------------+-------+ +| ``l`` | :ctype:`long` | integer | | ++--------+-------------------------+--------------------+-------+ +| ``L`` | :ctype:`unsigned long` | long | | ++--------+-------------------------+--------------------+-------+ +| ``q`` | :ctype:`long long` | long | \(2) | ++--------+-------------------------+--------------------+-------+ +| ``Q`` | :ctype:`unsigned long | long | \(2) | +| | long` | | | ++--------+-------------------------+--------------------+-------+ +| ``f`` | :ctype:`float` | float | | ++--------+-------------------------+--------------------+-------+ +| ``d`` | :ctype:`double` | float | | ++--------+-------------------------+--------------------+-------+ +| ``s`` | :ctype:`char[]` | string | | ++--------+-------------------------+--------------------+-------+ +| ``p`` | :ctype:`char[]` | string | | ++--------+-------------------------+--------------------+-------+ +| ``P`` | :ctype:`void \*` | integer | | ++--------+-------------------------+--------------------+-------+ + +Notes: + +(1) + The ``'t'`` conversion code corresponds to the :ctype:`_Bool` type defined by + C99. If this type is not available, it is simulated using a :ctype:`char`. In + standard mode, it is always represented by one byte. + + .. versionadded:: 2.6 + +(2) + The ``'q'`` and ``'Q'`` conversion codes are available in native mode only if + the platform C compiler supports C :ctype:`long long`, or, on Windows, + :ctype:`__int64`. They are always available in standard modes. + + .. versionadded:: 2.2 + +A format character may be preceded by an integral repeat count. For example, +the format string ``'4h'`` means exactly the same as ``'hhhh'``. + +Whitespace characters between formats are ignored; a count and its format must +not contain whitespace though. 
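A small sketch of the repeat-count and whitespace rules just described. It assumes a Python 3 interpreter, where the packed result is a ``bytes`` object rather than the string this page talks about; the values 1 to 4 are arbitrary sample data.

```python
import struct

# A repeat count is shorthand: '4h' and 'hhhh' describe the same layout.
packed_count = struct.pack('4h', 1, 2, 3, 4)
packed_plain = struct.pack('hhhh', 1, 2, 3, 4)
same_layout = packed_count == packed_plain

# Whitespace between format characters is ignored when the format is parsed.
spaced_size = struct.calcsize('h h l')
plain_size = struct.calcsize('hhl')

# Unpacking with either spelling gives back the original values.
values = struct.unpack('4h', packed_plain)
```

Because the default native mode is used here, the absolute sizes depend on the platform; only the equalities between the two spellings are expected to hold everywhere.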
+ +For the ``'s'`` format character, the count is interpreted as the size of the +string, not a repeat count like for the other format characters; for example, +``'10s'`` means a single 10-byte string, while ``'10c'`` means 10 characters. +For packing, the string is truncated or padded with null bytes as appropriate to +make it fit. For unpacking, the resulting string always has exactly the +specified number of bytes. As a special case, ``'0s'`` means a single, empty +string (while ``'0c'`` means 0 characters). + +The ``'p'`` format character encodes a "Pascal string", meaning a short +variable-length string stored in a fixed number of bytes. The count is the total +number of bytes stored. The first byte stored is the length of the string, or +255, whichever is smaller. The bytes of the string follow. If the string +passed in to :func:`pack` is too long (longer than the count minus 1), only the +leading count-1 bytes of the string are stored. If the string is shorter than +count-1, it is padded with null bytes so that exactly count bytes in all are +used. Note that for :func:`unpack`, the ``'p'`` format character consumes count +bytes, but that the string returned can never contain more than 255 characters. + +For the ``'I'``, ``'L'``, ``'q'`` and ``'Q'`` format characters, the return +value is a Python long integer. + +For the ``'P'`` format character, the return value is a Python integer or long +integer, depending on the size needed to hold a pointer when it has been cast to +an integer type. A *NULL* pointer will always be returned as the Python integer +``0``. When packing pointer-sized values, Python integer or long integer objects +may be used. For example, the Alpha and Merced processors use 64-bit pointer +values, meaning a Python long integer will be used to hold the pointer; other +platforms use 32-bit pointers and will use a Python integer. + +For the ``'t'`` format character, the return value is either :const:`True` or +:const:`False`. 
When packing, the truth value of the argument object is used. +Either 0 or 1 in the native or standard bool representation will be packed, and +any non-zero value will be True when unpacking. + +By default, C numbers are represented in the machine's native format and byte +order, and properly aligned by skipping pad bytes if necessary (according to the +rules used by the C compiler). + +Alternatively, the first character of the format string can be used to indicate +the byte order, size and alignment of the packed data, according to the +following table: + ++-----------+------------------------+--------------------+ +| Character | Byte order | Size and alignment | ++===========+========================+====================+ +| ``@`` | native | native | ++-----------+------------------------+--------------------+ +| ``=`` | native | standard | ++-----------+------------------------+--------------------+ +| ``<`` | little-endian | standard | ++-----------+------------------------+--------------------+ +| ``>`` | big-endian | standard | ++-----------+------------------------+--------------------+ +| ``!`` | network (= big-endian) | standard | ++-----------+------------------------+--------------------+ + +If the first character is not one of these, ``'@'`` is assumed. + +Native byte order is big-endian or little-endian, depending on the host system. +For example, Motorola and Sun processors are big-endian; Intel and DEC +processors are little-endian. + +Native size and alignment are determined using the C compiler's +:keyword:`sizeof` expression. This is always combined with native byte order. + +Standard size and alignment are as follows: no alignment is required for any +type (so you have to use pad bytes); :ctype:`short` is 2 bytes; :ctype:`int` and +:ctype:`long` are 4 bytes; :ctype:`long long` (:ctype:`__int64` on Windows) is 8 +bytes; :ctype:`float` and :ctype:`double` are 32-bit and 64-bit IEEE floating +point numbers, respectively. :ctype:`_Bool` is 1 byte. 
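The standard sizes and byte orders listed above can be checked with a short sketch. It assumes Python 3 (packed data is ``bytes``); the value 1 is an arbitrary sample.

```python
import struct

# With '<', '>', '=' or '!' the standard sizes apply on every platform.
standard_sizes = {code: struct.calcsize('<' + code) for code in 'hilqfd'}

# The same short integer packed with explicit byte orders.
little = struct.pack('<h', 1)
big = struct.pack('>h', 1)
network = struct.pack('!h', 1)   # network order is big-endian

# Standard mode inserts no alignment padding between fields.
unpadded = struct.calcsize('<hl')
```

Here ``standard_sizes`` maps ``h`` to 2, ``i`` and ``l`` to 4, ``q`` to 8, ``f`` to 4 and ``d`` to 8, and ``'<hl'`` occupies 6 bytes because no pad bytes are added.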
+ +Note the difference between ``'@'`` and ``'='``: both use native byte order, but +the size and alignment of the latter is standardized. + +The form ``'!'`` is available for those poor souls who claim they can't remember +whether network byte order is big-endian or little-endian. + +There is no way to indicate non-native byte order (force byte-swapping); use the +appropriate choice of ``'<'`` or ``'>'``. + +The ``'P'`` format character is only available for the native byte ordering +(selected as the default or with the ``'@'`` byte order character). The byte +order character ``'='`` chooses to use little- or big-endian ordering based on +the host system. The struct module does not interpret this as native ordering, +so the ``'P'`` format is not available. + +Examples (all using native byte order, size and alignment, on a big-endian +machine):: + + >>> from struct import * + >>> pack('hhl', 1, 2, 3) + '\x00\x01\x00\x02\x00\x00\x00\x03' + >>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03') + (1, 2, 3) + >>> calcsize('hhl') + 8 + +Hint: to align the end of a structure to the alignment requirement of a +particular type, end the format with the code for that type with a repeat count +of zero. For example, the format ``'llh0l'`` specifies two pad bytes at the +end, assuming longs are aligned on 4-byte boundaries. This only works when +native size and alignment are in effect; standard size and alignment does not +enforce any alignment. + + +.. seealso:: + + Module :mod:`array` + Packed binary storage of homogeneous data. + + Module :mod:`xdrlib` + Packing and unpacking of XDR data. + + +.. _struct-objects: + +Struct Objects +-------------- + +The :mod:`struct` module also defines the following type: + + +.. class:: Struct(format) + + Return a new Struct object which writes and reads binary data according to the + format string *format*. 
Creating a Struct object once and calling its methods + is more efficient than calling the :mod:`struct` functions with the same format + since the format string only needs to be compiled once. + + .. versionadded:: 2.5 + +Compiled Struct objects support the following methods and attributes: + + +.. method:: Struct.pack(v1, v2, ...) + + Identical to the :func:`pack` function, using the compiled format. + (``len(result)`` will equal :attr:`self.size`.) + + +.. method:: Struct.pack_into(buffer, offset, v1, v2, ...) + + Identical to the :func:`pack_into` function, using the compiled format. + + +.. method:: Struct.unpack(string) + + Identical to the :func:`unpack` function, using the compiled format. + (``len(string)`` must equal :attr:`self.size`). + + +.. method:: Struct.unpack_from(buffer[, offset=0]) + + Identical to the :func:`unpack_from` function, using the compiled format. + (``len(buffer[offset:])`` must be at least :attr:`self.size`). + + +.. attribute:: Struct.format + + The format string used to construct this Struct object. + diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst new file mode 100644 index 0000000..a3bc2cb --- /dev/null +++ b/Doc/library/subprocess.rst @@ -0,0 +1,340 @@ + +:mod:`subprocess` --- Subprocess management +=========================================== + +.. module:: subprocess + :synopsis: Subprocess management. +.. moduleauthor:: Peter Åstrand <astrand@lysator.liu.se> +.. sectionauthor:: Peter Åstrand <astrand@lysator.liu.se> + + +.. versionadded:: 2.4 + +The :mod:`subprocess` module allows you to spawn new processes, connect to their +input/output/error pipes, and obtain their return codes. This module intends to +replace several other, older modules and functions, such as:: + + os.system + os.spawn* + commands.* + +Information about how the :mod:`subprocess` module can be used to replace these +modules and functions can be found in the following sections. 
+ + +Using the subprocess Module +--------------------------- + +This module defines one class called :class:`Popen`: + + +.. class:: Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=False, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0) + + Arguments are: + + *args* should be a string, or a sequence of program arguments. The program to + execute is normally the first item in the args sequence or string, but can be + explicitly set by using the executable argument. + + On Unix, with *shell=False* (default): In this case, the Popen class uses + :meth:`os.execvp` to execute the child program. *args* should normally be a + sequence. A string will be treated as a sequence with the string as the only + item (the program to execute). + + On Unix, with *shell=True*: If args is a string, it specifies the command string + to execute through the shell. If *args* is a sequence, the first item specifies + the command string, and any additional items will be treated as additional shell + arguments. + + On Windows: the :class:`Popen` class uses CreateProcess() to execute the child + program, which operates on strings. If *args* is a sequence, it will be + converted to a string using the :meth:`list2cmdline` method. Please note that + not all MS Windows applications interpret the command line the same way: + :meth:`list2cmdline` is designed for applications using the same rules as the MS + C runtime. + + *bufsize*, if given, has the same meaning as the corresponding argument to the + built-in open() function: :const:`0` means unbuffered, :const:`1` means line + buffered, any other positive value means use a buffer of (approximately) that + size. A negative *bufsize* means to use the system default, which usually means + fully buffered. The default value for *bufsize* is :const:`0` (unbuffered). + + The *executable* argument specifies the program to execute. 
It is very seldom + needed: Usually, the program to execute is defined by the *args* argument. If + ``shell=True``, the *executable* argument specifies which shell to use. On Unix, + the default shell is :file:`/bin/sh`. On Windows, the default shell is + specified by the :envvar:`COMSPEC` environment variable. + + *stdin*, *stdout* and *stderr* specify the executed programs' standard input, + standard output and standard error file handles, respectively. Valid values are + ``PIPE``, an existing file descriptor (a positive integer), an existing file + object, and ``None``. ``PIPE`` indicates that a new pipe to the child should be + created. With ``None``, no redirection will occur; the child's file handles + will be inherited from the parent. Additionally, *stderr* can be ``STDOUT``, + which indicates that the stderr data from the applications should be captured + into the same file handle as for stdout. + + If *preexec_fn* is set to a callable object, this object will be called in the + child process just before the child is executed. (Unix only) + + If *close_fds* is true, all file descriptors except :const:`0`, :const:`1` and + :const:`2` will be closed before the child process is executed. (Unix only). + Or, on Windows, if *close_fds* is true then no handles will be inherited by the + child process. Note that on Windows, you cannot set *close_fds* to true and + also redirect the standard handles by setting *stdin*, *stdout* or *stderr*. + + If *shell* is :const:`True`, the specified command will be executed through the + shell. + + If *cwd* is not ``None``, the child's current directory will be changed to *cwd* + before it is executed. Note that this directory is not considered when + searching the executable, so you can't specify the program's path relative to + *cwd*. + + If *env* is not ``None``, it defines the environment variables for the new + process. 
+ + If *universal_newlines* is :const:`True`, the file objects stdout and stderr are + opened as text files, but lines may be terminated by any of ``'\n'``, the Unix + end-of-line convention, ``'\r'``, the Macintosh convention or ``'\r\n'``, the + Windows convention. All of these external representations are seen as ``'\n'`` + by the Python program. + + .. note:: + + This feature is only available if Python is built with universal newline support + (the default). Also, the newlines attribute of the file objects :attr:`stdout`, + :attr:`stdin` and :attr:`stderr` are not updated by the communicate() method. + + The *startupinfo* and *creationflags*, if given, will be passed to the + underlying CreateProcess() function. They can specify things such as appearance + of the main window and priority for the new process. (Windows only) + + +Convenience Functions +^^^^^^^^^^^^^^^^^^^^^ + +This module also defines two shortcut functions: + + +.. function:: call(*popenargs, **kwargs) + + Run command with arguments. Wait for command to complete, then return the + :attr:`returncode` attribute. + + The arguments are the same as for the Popen constructor. Example:: + + retcode = call(["ls", "-l"]) + + +.. function:: check_call(*popenargs, **kwargs) + + Run command with arguments. Wait for command to complete. If the exit code was + zero then return, otherwise raise :exc:`CalledProcessError`. The + :exc:`CalledProcessError` object will have the return code in the + :attr:`returncode` attribute. + + The arguments are the same as for the Popen constructor. Example:: + + check_call(["ls", "-l"]) + + .. versionadded:: 2.5 + + +Exceptions +^^^^^^^^^^ + +Exceptions raised in the child process, before the new program has started to +execute, will be re-raised in the parent. Additionally, the exception object +will have one extra attribute called :attr:`child_traceback`, which is a string +containing traceback information from the child's point of view. 
+ +The most common exception raised is :exc:`OSError`. This occurs, for example, +when trying to execute a non-existent file. Applications should prepare for +:exc:`OSError` exceptions. + +A :exc:`ValueError` will be raised if :class:`Popen` is called with invalid +arguments. + +check_call() will raise :exc:`CalledProcessError`, if the called process returns +a non-zero return code. + + +Security +^^^^^^^^ + +Unlike some other popen functions, this implementation will never call /bin/sh +implicitly. This means that all characters, including shell metacharacters, can +safely be passed to child processes. + + +Popen Objects +------------- + +Instances of the :class:`Popen` class have the following methods: + + +.. method:: Popen.poll() + + Check if child process has terminated. Returns returncode attribute. + + +.. method:: Popen.wait() + + Wait for child process to terminate. Returns returncode attribute. + + +.. method:: Popen.communicate(input=None) + + Interact with process: Send data to stdin. Read data from stdout and stderr, + until end-of-file is reached. Wait for process to terminate. The optional + *input* argument should be a string to be sent to the child process, or + ``None``, if no data should be sent to the child. + + communicate() returns a tuple (stdout, stderr). + + .. note:: + + The data read is buffered in memory, so do not use this method if the data size + is large or unlimited. + +The following attributes are also available: + + +.. attribute:: Popen.stdin + + If the *stdin* argument is ``PIPE``, this attribute is a file object that + provides input to the child process. Otherwise, it is ``None``. + + +.. attribute:: Popen.stdout + + If the *stdout* argument is ``PIPE``, this attribute is a file object that + provides output from the child process. Otherwise, it is ``None``. + + +.. attribute:: Popen.stderr + + If the *stderr* argument is ``PIPE``, this attribute is a file object that + provides error output from the child process. 
Otherwise, it is ``None``. + + +.. attribute:: Popen.pid + + The process ID of the child process. + + +.. attribute:: Popen.returncode + + The child return code. A ``None`` value indicates that the process hasn't + terminated yet. A negative value -N indicates that the child was terminated by + signal N (Unix only). + + +Replacing Older Functions with the subprocess Module +---------------------------------------------------- + +In this section, "a ==> b" means that b can be used as a replacement for a. + +.. note:: + + All functions in this section fail (more or less) silently if the executed + program cannot be found; this module raises an :exc:`OSError` exception. + +In the following examples, we assume that the subprocess module is imported with +"from subprocess import \*". + + +Replacing /bin/sh shell backquote +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + output=`mycmd myarg` + ==> + output = Popen(["mycmd", "myarg"], stdout=PIPE).communicate()[0] + + +Replacing shell pipe line +^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + output=`dmesg | grep hda` + ==> + p1 = Popen(["dmesg"], stdout=PIPE) + p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE) + output = p2.communicate()[0] + + +Replacing os.system() +^^^^^^^^^^^^^^^^^^^^^ + +:: + + sts = os.system("mycmd" + " myarg") + ==> + p = Popen("mycmd" + " myarg", shell=True) + sts = os.waitpid(p.pid, 0)[1] + +Notes: + +* Calling the program through the shell is usually not required. + +* It's easier to look at the :attr:`returncode` attribute than the exit status. 
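The two notes above can be illustrated with a minimal sketch that avoids the shell and reads the return code directly. It assumes Python 3 and uses ``sys.executable`` as a stand-in for ``mycmd`` so that it runs on any system; the exit code 3 is an arbitrary sample value.

```python
import subprocess
import sys

# Pass the command as a list so that no shell is involved.
# sys.executable stands in for "mycmd" purely for illustration.
retcode = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

# The return code is available directly; there is no exit status to decode.
child_failed = retcode != 0
```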
+ +A more realistic example would look like this:: + + try: + retcode = call("mycmd" + " myarg", shell=True) + if retcode < 0: + print >>sys.stderr, "Child was terminated by signal", -retcode + else: + print >>sys.stderr, "Child returned", retcode + except OSError as e: + print >>sys.stderr, "Execution failed:", e + + +Replacing os.spawn\* +^^^^^^^^^^^^^^^^^^^^ + +P_NOWAIT example:: + + pid = os.spawnlp(os.P_NOWAIT, "/bin/mycmd", "mycmd", "myarg") + ==> + pid = Popen(["/bin/mycmd", "myarg"]).pid + +P_WAIT example:: + + retcode = os.spawnlp(os.P_WAIT, "/bin/mycmd", "mycmd", "myarg") + ==> + retcode = call(["/bin/mycmd", "myarg"]) + +Vector example:: + + os.spawnvp(os.P_NOWAIT, path, args) + ==> + Popen([path] + args[1:]) + +Environment example:: + + os.spawnlpe(os.P_NOWAIT, "/bin/mycmd", "mycmd", "myarg", env) + ==> + Popen(["/bin/mycmd", "myarg"], env={"PATH": "/usr/bin"}) + + +Replacing os.popen\* +^^^^^^^^^^^^^^^^^^^^ + +:: + + pipe = os.popen(cmd, 'r', bufsize) + ==> + pipe = Popen(cmd, shell=True, bufsize=bufsize, stdout=PIPE).stdout + +:: + + pipe = os.popen(cmd, 'w', bufsize) + ==> + pipe = Popen(cmd, shell=True, bufsize=bufsize, stdin=PIPE).stdin + diff --git a/Doc/library/sunau.rst b/Doc/library/sunau.rst new file mode 100644 index 0000000..9930133 --- /dev/null +++ b/Doc/library/sunau.rst @@ -0,0 +1,261 @@ + +:mod:`sunau` --- Read and write Sun AU files +============================================ + +.. module:: sunau + :synopsis: Provide an interface to the Sun AU sound format. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`sunau` module provides a convenient interface to the Sun AU sound +format. Note that this module is interface-compatible with the modules +:mod:`aifc` and :mod:`wave`. + +An audio file consists of a header followed by the data. 
The fields of the +header are: + ++---------------+-----------------------------------------------+ +| Field | Contents | ++===============+===============================================+ +| magic word | The four bytes ``.snd``. | ++---------------+-----------------------------------------------+ +| header size | Size of the header, including info, in bytes. | ++---------------+-----------------------------------------------+ +| data size | Physical size of the data, in bytes. | ++---------------+-----------------------------------------------+ +| encoding | Indicates how the audio samples are encoded. | ++---------------+-----------------------------------------------+ +| sample rate | The sampling rate. | ++---------------+-----------------------------------------------+ +| # of channels | The number of channels in the samples. | ++---------------+-----------------------------------------------+ +| info | ASCII string giving a description of the | +| | audio file (padded with null bytes). | ++---------------+-----------------------------------------------+ + +Apart from the info field, all header fields are 4 bytes in size. They are all +32-bit unsigned integers encoded in big-endian byte order. + +The :mod:`sunau` module defines the following functions: + + +.. function:: open(file, mode) + + If *file* is a string, open the file by that name, otherwise treat it as a + seekable file-like object. *mode* can be any of + + ``'r'`` + Read only mode. + + ``'w'`` + Write only mode. + + Note that it does not allow read/write files. + + A *mode* of ``'r'`` returns a :class:`AU_read` object, while a *mode* of ``'w'`` + or ``'wb'`` returns a :class:`AU_write` object. + + +.. function:: openfp(file, mode) + + A synonym for :func:`open`, maintained for backwards compatibility. + +The :mod:`sunau` module defines the following exception: + + +.. exception:: Error + + An error raised when something is impossible because of Sun AU specs or + implementation deficiency. 
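The header layout described earlier (a magic word followed by 32-bit unsigned big-endian fields, then the info string) can be sketched with the :mod:`struct` module alone. This is only an illustration under assumptions: it targets Python 3, does not use :mod:`sunau` itself, and the field values and the ``FIXED_HEADER`` name are hypothetical sample choices.

```python
import struct

# The fixed part of the header: six 32-bit unsigned big-endian integers.
FIXED_HEADER = struct.Struct('>6I')

info = b'demo\x00\x00\x00\x00'          # info field, padded with null bytes
magic = int.from_bytes(b'.snd', 'big')  # the magic word read as an integer

header = FIXED_HEADER.pack(
    magic,
    FIXED_HEADER.size + len(info),  # header size, including info
    0,                              # data size (no samples in this sketch)
    1,                              # encoding (sample value)
    8000,                           # sample rate (sample value)
    1,                              # number of channels
) + info

# Read the fixed fields back from the start of the header.
fields = FIXED_HEADER.unpack_from(header, 0)
```

The fixed part is 24 bytes long, so with the 8-byte info string used here the header size field holds 32.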
+ +The :mod:`sunau` module defines the following data items: + + +.. data:: AUDIO_FILE_MAGIC + + An integer every valid Sun AU file begins with, stored in big-endian form. This + is the string ``.snd`` interpreted as an integer. + + +.. data:: AUDIO_FILE_ENCODING_MULAW_8 + AUDIO_FILE_ENCODING_LINEAR_8 + AUDIO_FILE_ENCODING_LINEAR_16 + AUDIO_FILE_ENCODING_LINEAR_24 + AUDIO_FILE_ENCODING_LINEAR_32 + AUDIO_FILE_ENCODING_ALAW_8 + + Values of the encoding field from the AU header which are supported by this + module. + + +.. data:: AUDIO_FILE_ENCODING_FLOAT + AUDIO_FILE_ENCODING_DOUBLE + AUDIO_FILE_ENCODING_ADPCM_G721 + AUDIO_FILE_ENCODING_ADPCM_G722 + AUDIO_FILE_ENCODING_ADPCM_G723_3 + AUDIO_FILE_ENCODING_ADPCM_G723_5 + + Additional known values of the encoding field from the AU header, but which are + not supported by this module. + + +.. _au-read-objects: + +AU_read Objects +--------------- + +AU_read objects, as returned by :func:`open` above, have the following methods: + + +.. method:: AU_read.close() + + Close the stream, and make the instance unusable. (This is called automatically + on deletion.) + + +.. method:: AU_read.getnchannels() + + Returns number of audio channels (1 for mono, 2 for stereo). + + +.. method:: AU_read.getsampwidth() + + Returns sample width in bytes. + + +.. method:: AU_read.getframerate() + + Returns sampling frequency. + + +.. method:: AU_read.getnframes() + + Returns number of audio frames. + + +.. method:: AU_read.getcomptype() + + Returns compression type. Supported compression types are ``'ULAW'``, ``'ALAW'`` + and ``'NONE'``. + + +.. method:: AU_read.getcompname() + + Human-readable version of :meth:`getcomptype`. The supported types have the + respective names ``'CCITT G.711 u-law'``, ``'CCITT G.711 A-law'`` and ``'not + compressed'``. + + +.. method:: AU_read.getparams() + + Returns a tuple ``(nchannels, sampwidth, framerate, nframes, comptype, + compname)``, equivalent to output of the :meth:`get\*` methods. + + +.. 
method:: AU_read.readframes(n) + + Reads and returns at most *n* frames of audio, as a string of bytes. The data + will be returned in linear format. If the original data is in u-LAW format, it + will be converted. + + +.. method:: AU_read.rewind() + + Rewind the file pointer to the beginning of the audio stream. + +The following two methods define a term "position" which is compatible between +them, and is otherwise implementation dependent. + + +.. method:: AU_read.setpos(pos) + + Set the file pointer to the specified position. Only values returned from + :meth:`tell` should be used for *pos*. + + +.. method:: AU_read.tell() + + Return current file pointer position. Note that the returned value has nothing + to do with the actual position in the file. + +The following two methods are defined for compatibility with the :mod:`aifc` +module, and don't do anything interesting. + + +.. method:: AU_read.getmarkers() + + Returns ``None``. + + +.. method:: AU_read.getmark(id) + + Raise an error. + + +.. _au-write-objects: + +AU_write Objects +---------------- + +AU_write objects, as returned by :func:`open` above, have the following methods: + + +.. method:: AU_write.setnchannels(n) + + Set the number of channels. + + +.. method:: AU_write.setsampwidth(n) + + Set the sample width (in bytes). + + +.. method:: AU_write.setframerate(n) + + Set the frame rate. + + +.. method:: AU_write.setnframes(n) + + Set the number of frames. This can be later changed, when and if more frames + are written. + + +.. method:: AU_write.setcomptype(type, name) + + Set the compression type and description. Only ``'NONE'`` and ``'ULAW'`` are + supported on output. + + +.. method:: AU_write.setparams(tuple) + + The *tuple* should be ``(nchannels, sampwidth, framerate, nframes, comptype, + compname)``, with values valid for the :meth:`set\*` methods. Set all + parameters. + + +.. 
method:: AU_write.tell() + + Return the current position in the file, with the same disclaimer as for the + :meth:`AU_read.tell` and :meth:`AU_read.setpos` methods. + + +.. method:: AU_write.writeframesraw(data) + + Write audio frames, without correcting *nframes*. + + +.. method:: AU_write.writeframes(data) + + Write audio frames and make sure *nframes* is correct. + + +.. method:: AU_write.close() + + Make sure *nframes* is correct, and close the file. + + This method is called upon deletion. + +Note that it is invalid to set any parameters after calling :meth:`writeframes` +or :meth:`writeframesraw`. + diff --git a/Doc/library/symbol.rst b/Doc/library/symbol.rst new file mode 100644 index 0000000..1735276 --- /dev/null +++ b/Doc/library/symbol.rst @@ -0,0 +1,32 @@ + +:mod:`symbol` --- Constants used with Python parse trees +======================================================== + +.. module:: symbol + :synopsis: Constants representing internal nodes of the parse tree. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +This module provides constants which represent the numeric values of internal +nodes of the parse tree. Unlike most Python constants, these use lower-case +names. Refer to the file :file:`Grammar/Grammar` in the Python distribution for +the definitions of the names in the context of the language grammar. The +specific numeric values which the names map to may change between Python +versions. + +This module also provides one additional data object: + + +.. data:: sym_name + + Dictionary mapping the numeric values of the constants defined in this module + back to name strings, allowing more human-readable representation of parse trees + to be generated. + + +.. seealso:: + + Module :mod:`parser` + The second example for the :mod:`parser` module shows how to use the + :mod:`symbol` module. 
+ diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst new file mode 100644 index 0000000..5184c25 --- /dev/null +++ b/Doc/library/sys.rst @@ -0,0 +1,606 @@ + +:mod:`sys` --- System-specific parameters and functions +======================================================= + +.. module:: sys + :synopsis: Access system-specific parameters and functions. + + +This module provides access to some variables used or maintained by the +interpreter and to functions that interact strongly with the interpreter. It is +always available. + + +.. data:: argv + + The list of command line arguments passed to a Python script. ``argv[0]`` is the + script name (it is operating system dependent whether this is a full pathname or + not). If the command was executed using the :option:`-c` command line option to + the interpreter, ``argv[0]`` is set to the string ``'-c'``. If no script name + was passed to the Python interpreter, ``argv[0]`` is the empty string. + + To loop over the standard input, or the list of files given on the + command line, see the :mod:`fileinput` module. + + +.. data:: byteorder + + An indicator of the native byte order. This will have the value ``'big'`` on + big-endian (most-significant byte first) platforms, and ``'little'`` on + little-endian (least-significant byte first) platforms. + + .. versionadded:: 2.0 + + +.. data:: subversion + + A triple (repo, branch, version) representing the Subversion information of the + Python interpreter. *repo* is the name of the repository, ``'CPython'``. + *branch* is a string of one of the forms ``'trunk'``, ``'branches/name'`` or + ``'tags/name'``. *version* is the output of ``svnversion``, if the interpreter + was built from a Subversion checkout; it contains the revision number (range) + and possibly a trailing 'M' if there were local modifications. If the tree was + exported (or svnversion was not available), it is the revision of + ``Include/patchlevel.h`` if the branch is a tag. Otherwise, it is ``None``. + + .. 
versionadded:: 2.5 + + +.. data:: builtin_module_names + + A tuple of strings giving the names of all modules that are compiled into this + Python interpreter. (This information is not available in any other way --- + ``modules.keys()`` only lists the imported modules.) + + +.. data:: copyright + + A string containing the copyright pertaining to the Python interpreter. + + +.. function:: _current_frames() + + Return a dictionary mapping each thread's identifier to the topmost stack frame + currently active in that thread at the time the function is called. Note that + functions in the :mod:`traceback` module can build the call stack given such a + frame. + + This is most useful for debugging deadlock: this function does not require the + deadlocked threads' cooperation, and such threads' call stacks are frozen for as + long as they remain deadlocked. The frame returned for a non-deadlocked thread + may bear no relationship to that thread's current activity by the time calling + code examines the frame. + + This function should be used for internal and specialized purposes only. + + .. versionadded:: 2.5 + + +.. data:: dllhandle + + Integer specifying the handle of the Python DLL. Availability: Windows. + + +.. function:: displayhook(value) + + If *value* is not ``None``, this function prints it to ``sys.stdout``, and saves + it in ``__builtin__._``. + + ``sys.displayhook`` is called on the result of evaluating an expression entered + in an interactive Python session. The display of these values can be customized + by assigning another one-argument function to ``sys.displayhook``. + + +.. function:: excepthook(type, value, traceback) + + This function prints out a given traceback and exception to ``sys.stderr``. + + When an exception is raised and uncaught, the interpreter calls + ``sys.excepthook`` with three arguments, the exception class, exception + instance, and a traceback object. 
In an interactive session this happens just + before control is returned to the prompt; in a Python program this happens just + before the program exits. The handling of such top-level exceptions can be + customized by assigning another three-argument function to ``sys.excepthook``. + + +.. data:: __displayhook__ + __excepthook__ + + These objects contain the original values of ``displayhook`` and ``excepthook`` + at the start of the program. They are saved so that ``displayhook`` and + ``excepthook`` can be restored in case they happen to get replaced with broken + objects. + + +.. function:: exc_info() + + This function returns a tuple of three values that give information about the + exception that is currently being handled. The information returned is specific + both to the current thread and to the current stack frame. If the current stack + frame is not handling an exception, the information is taken from the calling + stack frame, or its caller, and so on until a stack frame is found that is + handling an exception. Here, "handling an exception" is defined as "executing + or having executed an except clause." For any stack frame, only information + about the most recently handled exception is accessible. + + .. index:: object: traceback + + If no exception is being handled anywhere on the stack, a tuple containing three + ``None`` values is returned. Otherwise, the values returned are ``(type, value, + traceback)``. Their meaning is: *type* gets the exception type of the exception + being handled (a class object); *value* gets the exception parameter (its + :dfn:`associated value` or the second argument to :keyword:`raise`, which is + always a class instance if the exception type is a class object); *traceback* + gets a traceback object (see the Reference Manual) which encapsulates the call + stack at the point where the exception originally occurred. + + .. 
warning:: + + Assigning the *traceback* return value to a local variable in a function that is + handling an exception will cause a circular reference. This will prevent + anything referenced by a local variable in the same function or by the traceback + from being garbage collected. Since most functions don't need access to the + traceback, the best solution is to use something like ``exctype, value = + sys.exc_info()[:2]`` to extract only the exception type and value. If you do + need the traceback, make sure to delete it after use (best done with a + :keyword:`try` ... :keyword:`finally` statement) or to call :func:`exc_info` in + a function that does not itself handle an exception. + + .. note:: + + Beginning with Python 2.2, such cycles are automatically reclaimed when garbage + collection is enabled and they become unreachable, but it remains more efficient + to avoid creating cycles. + + +.. data:: exec_prefix + + A string giving the site-specific directory prefix where the platform-dependent + Python files are installed; by default, this is also ``'/usr/local'``. This can + be set at build time with the :option:`--exec-prefix` argument to the + :program:`configure` script. Specifically, all configuration files (e.g. the + :file:`pyconfig.h` header file) are installed in the directory ``exec_prefix + + '/lib/pythonversion/config'``, and shared library modules are installed in + ``exec_prefix + '/lib/pythonversion/lib-dynload'``, where *version* is equal to + ``version[:3]``. + + +.. data:: executable + + A string giving the name of the executable binary for the Python interpreter, on + systems where this makes sense. + + +.. function:: exit([arg]) + + Exit from Python. This is implemented by raising the :exc:`SystemExit` + exception, so cleanup actions specified by finally clauses of :keyword:`try` + statements are honored, and it is possible to intercept the exit attempt at an + outer level. 
The optional argument *arg* can be an integer giving the exit + status (defaulting to zero), or another type of object. If it is an integer, + zero is considered "successful termination" and any nonzero value is considered + "abnormal termination" by shells and the like. Most systems require it to be in + the range 0-127, and produce undefined results otherwise. Some systems have a + convention for assigning specific meanings to specific exit codes, but these are + generally underdeveloped; Unix programs generally use 2 for command line syntax + errors and 1 for all other kinds of errors. If another type of object is passed, + ``None`` is equivalent to passing zero, and any other object is printed to + ``sys.stderr`` and results in an exit code of 1. In particular, + ``sys.exit("some error message")`` is a quick way to exit a program when an + error occurs. + + +.. function:: getcheckinterval() + + Return the interpreter's "check interval"; see :func:`setcheckinterval`. + + .. versionadded:: 2.3 + + +.. function:: getdefaultencoding() + + Return the name of the current default string encoding used by the Unicode + implementation. + + .. versionadded:: 2.0 + + +.. function:: getdlopenflags() + + Return the current value of the flags that are used for :cfunc:`dlopen` calls. + The flag constants are defined in the :mod:`dl` and :mod:`DLFCN` modules. + Availability: Unix. + + .. versionadded:: 2.2 + + +.. function:: getfilesystemencoding() + + Return the name of the encoding used to convert Unicode filenames into system + file names, or ``None`` if the system default encoding is used. The result value + depends on the operating system: + + * On Windows 9x, the encoding is "mbcs". + + * On Mac OS X, the encoding is "utf-8". + + * On Unix, the encoding is the user's preference according to the result of + nl_langinfo(CODESET), or :const:`None` if the ``nl_langinfo(CODESET)`` failed. + + * On Windows NT+, file names are Unicode natively, so no conversion is + performed. 
:func:`getfilesystemencoding` still returns ``'mbcs'``, as this is + the encoding that applications should use when they explicitly want to convert + Unicode strings to byte strings that are equivalent when used as file names. + + .. versionadded:: 2.3 + + +.. function:: getrefcount(object) + + Return the reference count of the *object*. The count returned is generally one + higher than you might expect, because it includes the (temporary) reference as + an argument to :func:`getrefcount`. + + +.. function:: getrecursionlimit() + + Return the current value of the recursion limit, the maximum depth of the Python + interpreter stack. This limit prevents infinite recursion from causing an + overflow of the C stack and crashing Python. It can be set by + :func:`setrecursionlimit`. + + +.. function:: _getframe([depth]) + + Return a frame object from the call stack. If optional integer *depth* is + given, return the frame object that many calls below the top of the stack. If + that is deeper than the call stack, :exc:`ValueError` is raised. The default + for *depth* is zero, returning the frame at the top of the call stack. + + This function should be used for internal and specialized purposes only. + + +.. function:: getwindowsversion() + + Return a tuple containing five components, describing the Windows version + currently running. The elements are *major*, *minor*, *build*, *platform*, and + *text*. *text* contains a string while all other values are integers. 
+ + *platform* may be one of the following values: + + +-----------------------------------------+-----------------------+ + | Constant | Platform | + +=========================================+=======================+ + | :const:`0 (VER_PLATFORM_WIN32s)` | Win32s on Windows 3.1 | + +-----------------------------------------+-----------------------+ + | :const:`1 (VER_PLATFORM_WIN32_WINDOWS)` | Windows 95/98/ME | + +-----------------------------------------+-----------------------+ + | :const:`2 (VER_PLATFORM_WIN32_NT)` | Windows NT/2000/XP | + +-----------------------------------------+-----------------------+ + | :const:`3 (VER_PLATFORM_WIN32_CE)` | Windows CE | + +-----------------------------------------+-----------------------+ + + This function wraps the Win32 :cfunc:`GetVersionEx` function; see the Microsoft + documentation for more information about these fields. + + Availability: Windows. + + .. versionadded:: 2.3 + + +.. data:: hexversion + + The version number encoded as a single integer. This is guaranteed to increase + with each version, including proper support for non-production releases. For + example, to test that the Python interpreter is at least version 1.5.2, use:: + + if sys.hexversion >= 0x010502F0: + # use some advanced feature + ... + else: + # use an alternative implementation or warn the user + ... + + This is called ``hexversion`` since it only really looks meaningful when viewed + as the result of passing it to the built-in :func:`hex` function. The + ``version_info`` value may be used for a more human-friendly encoding of the + same information. + + .. versionadded:: 1.5.2 + + +.. function:: intern(string) + + Enter *string* in the table of "interned" strings and return the interned string + -- which is *string* itself or a copy. 
Interning strings is useful to gain a + little performance on dictionary lookup -- if the keys in a dictionary are + interned, and the lookup key is interned, the key comparisons (after hashing) + can be done by a pointer compare instead of a string compare. Normally, the + names used in Python programs are automatically interned, and the dictionaries + used to hold module, class or instance attributes have interned keys. + + .. versionchanged:: 2.3 + Interned strings are not immortal (like they used to be in Python 2.2 and + before); you must keep a reference to the return value of :func:`intern` around + to benefit from it. + + +.. data:: last_type + last_value + last_traceback + + These three variables are not always defined; they are set when an exception is + not handled and the interpreter prints an error message and a stack traceback. + Their intended use is to allow an interactive user to import a debugger module + and engage in post-mortem debugging without having to re-execute the command + that caused the error. (Typical use is ``import pdb; pdb.pm()`` to enter the + post-mortem debugger; see chapter :ref:`debugger` for + more information.) + + The meaning of the variables is the same as that of the return values from + :func:`exc_info` above. (Since there is only one interactive thread, + thread-safety is not a concern for these variables, unlike for ``exc_type`` + etc.) + + +.. data:: maxint + + The largest positive integer supported by Python's regular integer type. This + is at least 2\*\*31-1. The largest negative integer is ``-maxint-1`` --- the + asymmetry results from the use of 2's complement binary arithmetic. + + +.. data:: maxunicode + + An integer giving the largest supported code point for a Unicode character. The + value of this depends on the configuration option that specifies whether Unicode + characters are stored as UCS-2 or UCS-4. + + +.. 
data:: modules + + This is a dictionary that maps module names to modules which have already been + loaded. This can be manipulated to force reloading of modules and other tricks. + + +.. data:: path + + .. index:: triple: module; search; path + + A list of strings that specifies the search path for modules. Initialized from + the environment variable :envvar:`PYTHONPATH`, plus an installation-dependent + default. + + As initialized upon program startup, the first item of this list, ``path[0]``, + is the directory containing the script that was used to invoke the Python + interpreter. If the script directory is not available (e.g. if the interpreter + is invoked interactively or if the script is read from standard input), + ``path[0]`` is the empty string, which directs Python to search modules in the + current directory first. Notice that the script directory is inserted *before* + the entries inserted as a result of :envvar:`PYTHONPATH`. + + A program is free to modify this list for its own purposes. + + .. versionchanged:: 2.3 + Unicode strings are no longer ignored. + + +.. data:: platform + + This string contains a platform identifier, e.g. ``'sunos5'`` or ``'linux1'``. + This can be used to append platform-specific components to ``path``, for + instance. + + +.. data:: prefix + + A string giving the site-specific directory prefix where the platform + independent Python files are installed; by default, this is the string + ``'/usr/local'``. This can be set at build time with the :option:`--prefix` + argument to the :program:`configure` script. The main collection of Python + library modules is installed in the directory ``prefix + '/lib/pythonversion'`` + while the platform independent header files (all except :file:`pyconfig.h`) are + stored in ``prefix + '/include/pythonversion'``, where *version* is equal to + ``version[:3]``. + + +.. data:: ps1 + ps2 + + .. 
index:: + single: interpreter prompts + single: prompts, interpreter + + Strings specifying the primary and secondary prompt of the interpreter. These + are only defined if the interpreter is in interactive mode. Their initial + values in this case are ``'>>> '`` and ``'... '``. If a non-string object is + assigned to either variable, its :func:`str` is re-evaluated each time the + interpreter prepares to read a new interactive command; this can be used to + implement a dynamic prompt. + + +.. function:: setcheckinterval(interval) + + Set the interpreter's "check interval". This integer value determines how often + the interpreter checks for periodic things such as thread switches and signal + handlers. The default is ``100``, meaning the check is performed every 100 + Python virtual instructions. Setting it to a larger value may increase + performance for programs using threads. Setting it to a value ``<=`` 0 checks + every virtual instruction, maximizing responsiveness as well as overhead. + + +.. function:: setdefaultencoding(name) + + Set the current default string encoding used by the Unicode implementation. If + *name* does not match any available encoding, :exc:`LookupError` is raised. + This function is only intended to be used by the :mod:`site` module + implementation and, where needed, by :mod:`sitecustomize`. Once used by the + :mod:`site` module, it is removed from the :mod:`sys` module's namespace. + + .. % Note that \refmodule{site} is not imported if + .. % the \programopt{-S} option is passed to the interpreter, in which + .. % case this function will remain available. + + .. versionadded:: 2.0 + + +.. function:: setdlopenflags(n) + + Set the flags used by the interpreter for :cfunc:`dlopen` calls, such as when + the interpreter loads extension modules. Among other things, this will enable a + lazy resolving of symbols when importing a module, if called as + ``sys.setdlopenflags(0)``. 
To share symbols across extension modules, call as + ``sys.setdlopenflags(dl.RTLD_NOW | dl.RTLD_GLOBAL)``. Symbolic names for the + flag values can be found either in the :mod:`dl` module or in the :mod:`DLFCN` + module. If :mod:`DLFCN` is not available, it can be generated from + :file:`/usr/include/dlfcn.h` using the :program:`h2py` script. Availability: + Unix. + + .. versionadded:: 2.2 + + +.. function:: setprofile(profilefunc) + + .. index:: + single: profile function + single: profiler + + Set the system's profile function, which allows you to implement a Python source + code profiler in Python. See chapter :ref:`profile` for more information on the + Python profiler. The system's profile function is called similarly to the + system's trace function (see :func:`settrace`), but it isn't called for each + executed line of code (only on call and return, but the return event is reported + even when an exception has been set). The function is thread-specific, but + there is no way for the profiler to know about context switches between threads, + so it does not make sense to use this in the presence of multiple threads. Also, + its return value is not used, so it can simply return ``None``. + + +.. function:: setrecursionlimit(limit) + + Set the maximum depth of the Python interpreter stack to *limit*. This limit + prevents infinite recursion from causing an overflow of the C stack and crashing + Python. + + The highest possible limit is platform-dependent. A user may need to set the + limit higher when she has a program that requires deep recursion and a platform + that supports a higher limit. This should be done with care, because a too-high + limit can lead to a crash. + + +.. function:: settrace(tracefunc) + + .. index:: + single: trace function + single: debugger + + Set the system's trace function, which allows you to implement a Python + source code debugger in Python. See section :ref:`debugger-hooks` in the + chapter on the Python debugger. 
The function is thread-specific; for a + debugger to support multiple threads, it must be registered using + :func:`settrace` for each thread being debugged. + + .. note:: + + The :func:`settrace` function is intended only for implementing debuggers, + profilers, coverage tools and the like. Its behavior is part of the + implementation platform, rather than part of the language definition, and thus + may not be available in all Python implementations. + + +.. function:: settscdump(on_flag) + + Activate dumping of VM measurements using the Pentium timestamp counter, if + *on_flag* is true. Deactivate these dumps if *on_flag* is off. The function is + available only if Python was compiled with :option:`--with-tsc`. To understand + the output of this dump, read :file:`Python/ceval.c` in the Python sources. + + .. versionadded:: 2.4 + + +.. data:: stdin + stdout + stderr + + File objects corresponding to the interpreter's standard input, output and error + streams. ``stdin`` is used for all interpreter input except for scripts. + ``stdout`` is used for the output of :keyword:`print` and expression statements. + The interpreter's own prompts and (almost all of) its error messages go to + ``stderr``. ``stdout`` and ``stderr`` needn't be built-in file objects: any + object is acceptable as long as it has a :meth:`write` method that takes a + string argument. (Changing these objects doesn't affect the standard I/O + streams of processes executed by :func:`os.popen`, :func:`os.system` or the + :func:`exec\*` family of functions in the :mod:`os` module.) + + +.. data:: __stdin__ + __stdout__ + __stderr__ + + These objects contain the original values of ``stdin``, ``stderr`` and + ``stdout`` at the start of the program. They are used during finalization, and + could be useful to restore the actual files to known working file objects in + case they have been overwritten with a broken object. + + +.. 
data:: tracebacklimit + + When this variable is set to an integer value, it determines the maximum number + of levels of traceback information printed when an unhandled exception occurs. + The default is ``1000``. When set to ``0`` or less, all traceback information + is suppressed and only the exception type and value are printed. + + +.. data:: version + + A string containing the version number of the Python interpreter plus additional + information on the build number and compiler used. It has a value of the form + ``'version (#build_number, build_date, build_time) [compiler]'``. The first + three characters are used to identify the version in the installation + directories (where appropriate on each platform). An example:: + + >>> import sys + >>> sys.version + '1.5.2 (#0 Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)]' + + +.. data:: api_version + + The C API version for this interpreter. Programmers may find this useful when + debugging version conflicts between Python and extension modules. + + .. versionadded:: 2.3 + + +.. data:: version_info + + A tuple containing the five components of the version number: *major*, *minor*, + *micro*, *releaselevel*, and *serial*. All values except *releaselevel* are + integers; the release level is ``'alpha'``, ``'beta'``, ``'candidate'``, or + ``'final'``. The ``version_info`` value corresponding to the Python version 2.0 + is ``(2, 0, 0, 'final', 0)``. + + .. versionadded:: 2.0 + + +.. data:: warnoptions + + This is an implementation detail of the warnings framework; do not modify this + value. Refer to the :mod:`warnings` module for more information on the warnings + framework. + + +.. data:: winver + + The version number used to form registry keys on Windows platforms. This is + stored as string resource 1000 in the Python DLL. The value is normally the + first three characters of :const:`version`. 
It is provided in the :mod:`sys` + module for informational purposes; modifying this value has no effect on the + registry keys used by Python. Availability: Windows. + + +.. seealso:: + + Module :mod:`site` + This describes how to use .pth files to extend ``sys.path``. + diff --git a/Doc/library/syslog.rst b/Doc/library/syslog.rst new file mode 100644 index 0000000..549f26b --- /dev/null +++ b/Doc/library/syslog.rst @@ -0,0 +1,66 @@ + +:mod:`syslog` --- Unix syslog library routines +============================================== + +.. module:: syslog + :platform: Unix + :synopsis: An interface to the Unix syslog library routines. + + +This module provides an interface to the Unix ``syslog`` library routines. +Refer to the Unix manual pages for a detailed description of the ``syslog`` +facility. + +The module defines the following functions: + + +.. function:: syslog([priority,] message) + + Send the string *message* to the system logger. A trailing newline is added if + necessary. Each message is tagged with a priority composed of a *facility* and + a *level*. The optional *priority* argument, which defaults to + :const:`LOG_INFO`, determines the message priority. If the facility is not + encoded in *priority* using logical-or (``LOG_INFO | LOG_USER``), the value + given in the :func:`openlog` call is used. + + +.. function:: openlog(ident[, logopt[, facility]]) + + Logging options other than the defaults can be set by explicitly opening the log + file with :func:`openlog` prior to calling :func:`syslog`. The defaults are + (usually) *ident* = ``'syslog'``, *logopt* = ``0``, *facility* = + :const:`LOG_USER`. The *ident* argument is a string which is prepended to every + message. The optional *logopt* argument is a bit field - see below for possible + values to combine. The optional *facility* argument sets the default facility + for messages which do not have a facility explicitly encoded. + + +.. function:: closelog() + + Close the log file. + + +.. 
function:: setlogmask(maskpri) + + Set the priority mask to *maskpri* and return the previous mask value. Calls to + :func:`syslog` with a priority level not set in *maskpri* are ignored. The + default is to log all priorities. The function ``LOG_MASK(pri)`` calculates the + mask for the individual priority *pri*. The function ``LOG_UPTO(pri)`` + calculates the mask for all priorities up to and including *pri*. + +The module defines the following constants: + +Priority levels (high to low): + :const:`LOG_EMERG`, :const:`LOG_ALERT`, :const:`LOG_CRIT`, :const:`LOG_ERR`, + :const:`LOG_WARNING`, :const:`LOG_NOTICE`, :const:`LOG_INFO`, + :const:`LOG_DEBUG`. + +Facilities: + :const:`LOG_KERN`, :const:`LOG_USER`, :const:`LOG_MAIL`, :const:`LOG_DAEMON`, + :const:`LOG_AUTH`, :const:`LOG_LPR`, :const:`LOG_NEWS`, :const:`LOG_UUCP`, + :const:`LOG_CRON` and :const:`LOG_LOCAL0` to :const:`LOG_LOCAL7`. + +Log options: + :const:`LOG_PID`, :const:`LOG_CONS`, :const:`LOG_NDELAY`, :const:`LOG_NOWAIT` + and :const:`LOG_PERROR` if defined in ``<syslog.h>``. + diff --git a/Doc/library/tabnanny.rst b/Doc/library/tabnanny.rst new file mode 100644 index 0000000..8032655 --- /dev/null +++ b/Doc/library/tabnanny.rst @@ -0,0 +1,70 @@ + +:mod:`tabnanny` --- Detection of ambiguous indentation +====================================================== + +.. module:: tabnanny + :synopsis: Tool for detecting white space related problems in Python source files in a + directory tree. +.. moduleauthor:: Tim Peters <tim_one@users.sourceforge.net> +.. sectionauthor:: Peter Funk <pf@artcom-gmbh.de> + + +.. % rudimentary documentation based on module comments, by Peter Funk +.. % <pf@artcom-gmbh.de> + +For the time being this module is intended to be called as a script. However it +is possible to import it into an IDE and use the function :func:`check` +described below. + +.. warning:: + + The API provided by this module is likely to change in future releases; such + changes may not be backward compatible. 
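As a rough illustration of the import-and-call usage mentioned above (as opposed to running the module as a script), the following sketch writes a small source file to a temporary directory and captures whatever :func:`check` prints for it. It is not part of the original documentation: the helper name ``run_check`` and the sample source are hypothetical, and the sketch assumes a current Python 3 interpreter in which :mod:`tabnanny` is importable and ``tabnanny.verbose`` is left at its default value.

```python
import contextlib
import io
import os
import tempfile
import tabnanny


def run_check(source):
    """Write *source* to a temporary .py file and return what check() prints.

    run_check is a hypothetical helper for this sketch, not part of tabnanny.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "sample.py")
        with open(path, "w") as f:
            f.write(source)
        captured = io.StringIO()
        # check() reports its diagnostics on standard output,
        # so redirect stdout while it runs.
        with contextlib.redirect_stdout(captured):
            tabnanny.check(path)
        return captured.getvalue()


# A consistently indented file; with the default (non-verbose) setting
# nothing is reported for it.
clean_report = run_check("def f():\n    return 1\n")
```

Capturing standard output is only one way to consume the diagnostics; since the API is flagged as unstable, callers may prefer to run the module as a script instead.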
+ + +.. function:: check(file_or_dir) + + If *file_or_dir* is a directory and not a symbolic link, then recursively + descend the directory tree named by *file_or_dir*, checking all :file:`.py` + files along the way. If *file_or_dir* is an ordinary Python source file, it is + checked for whitespace related problems. The diagnostic messages are written to + standard output using the print statement. + + +.. data:: verbose + + Flag indicating whether to print verbose messages. This is incremented by the + ``-v`` option if called as a script. + + +.. data:: filename_only + + Flag indicating whether to print only the filenames of files containing + whitespace related problems. This is set to true by the ``-q`` option if called + as a script. + + +.. exception:: NannyNag + + Raised by :func:`tokeneater` if detecting an ambiguous indent. Captured and + handled in :func:`check`. + + +.. function:: tokeneater(type, token, start, end, line) + + This function is used by :func:`check` as a callback parameter to the function + :func:`tokenize.tokenize`. + +.. % XXX FIXME: Document \function{errprint}, +.. % \function{format_witnesses} \class{Whitespace} +.. % check_equal, indents +.. % \function{reset_globals} + + +.. seealso:: + + Module :mod:`tokenize` + Lexical scanner for Python source code. + + .. % XXX may be add a reference to IDLE? + diff --git a/Doc/library/tarfile.rst b/Doc/library/tarfile.rst new file mode 100644 index 0000000..a0cd673 --- /dev/null +++ b/Doc/library/tarfile.rst @@ -0,0 +1,738 @@ +.. _tarfile-mod: + +:mod:`tarfile` --- Read and write tar archive files +=================================================== + +.. module:: tarfile + :synopsis: Read and write tar-format archive files. + + +.. versionadded:: 2.3 + +.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de> +.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de> + + +The :mod:`tarfile` module makes it possible to read and create tar archives. 
+Some facts and figures: + +* reads and writes :mod:`gzip` and :mod:`bzip2` compressed archives. + +* read/write support for the POSIX.1-1988 (ustar) format. + +* read/write support for the GNU tar format including *longname* and *longlink* + extensions, read-only support for the *sparse* extension. + +* read/write support for the POSIX.1-2001 (pax) format. + + .. versionadded:: 2.6 + +* handles directories, regular files, hardlinks, symbolic links, fifos, + character devices and block devices and is able to acquire and restore file + information like timestamp, access permissions and owner. + +* can handle tape devices. + + +.. function:: open(name[, mode[, fileobj[, bufsize]]], **kwargs) + + Return a :class:`TarFile` object for the pathname *name*. For detailed + information on :class:`TarFile` objects and the keyword arguments that are + allowed, see :ref:`tarfile-objects`. + + *mode* has to be a string of the form ``'filemode[:compression]'``, it defaults + to ``'r'``. Here is a full list of mode combinations: + + +------------------+---------------------------------------------+ + | mode | action | + +==================+=============================================+ + | ``'r' or 'r:*'`` | Open for reading with transparent | + | | compression (recommended). | + +------------------+---------------------------------------------+ + | ``'r:'`` | Open for reading exclusively without | + | | compression. | + +------------------+---------------------------------------------+ + | ``'r:gz'`` | Open for reading with gzip compression. | + +------------------+---------------------------------------------+ + | ``'r:bz2'`` | Open for reading with bzip2 compression. | + +------------------+---------------------------------------------+ + | ``'a' or 'a:'`` | Open for appending with no compression. The | + | | file is created if it does not exist. | + +------------------+---------------------------------------------+ + | ``'w' or 'w:'`` | Open for uncompressed writing. 
| + +------------------+---------------------------------------------+ + | ``'w:gz'`` | Open for gzip compressed writing. | + +------------------+---------------------------------------------+ + | ``'w:bz2'`` | Open for bzip2 compressed writing. | + +------------------+---------------------------------------------+ + + Note that ``'a:gz'`` or ``'a:bz2'`` is not possible. If *mode* is not suitable + to open a certain (compressed) file for reading, :exc:`ReadError` is raised. Use + *mode* ``'r'`` to avoid this. If a compression method is not supported, + :exc:`CompressionError` is raised. + + If *fileobj* is specified, it is used as an alternative to a file object opened + for *name*. It is supposed to be at position 0. + + For special purposes, there is a second format for *mode*: + ``'filemode|[compression]'``. :func:`open` will return a :class:`TarFile` + object that processes its data as a stream of blocks. No random seeking will + be done on the file. If given, *fileobj* may be any object that has a + :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize* + specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant + in combination with e.g. ``sys.stdin``, a socket file object or a tape + device. However, such a :class:`TarFile` object is limited in that it does + not allow to be accessed randomly, see :ref:`tar-examples`. The currently + possible modes: + + +-------------+--------------------------------------------+ + | Mode | Action | + +=============+============================================+ + | ``'r|*'`` | Open a *stream* of tar blocks for reading | + | | with transparent compression. | + +-------------+--------------------------------------------+ + | ``'r|'`` | Open a *stream* of uncompressed tar blocks | + | | for reading. | + +-------------+--------------------------------------------+ + | ``'r|gz'`` | Open a gzip compressed *stream* for | + | | reading. 
| + +-------------+--------------------------------------------+ + | ``'r|bz2'`` | Open a bzip2 compressed *stream* for | + | | reading. | + +-------------+--------------------------------------------+ + | ``'w|'`` | Open an uncompressed *stream* for writing. | + +-------------+--------------------------------------------+ + | ``'w|gz'`` | Open a gzip compressed *stream* for | + | | writing. | + +-------------+--------------------------------------------+ + | ``'w|bz2'`` | Open a bzip2 compressed *stream* for | + | | writing. | + +-------------+--------------------------------------------+ + + +.. class:: TarFile + + Class for reading and writing tar archives. Do not use this class directly; + use :func:`open` instead. See :ref:`tarfile-objects`. + + +.. function:: is_tarfile(name) + + Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile` + module can read. + + +.. class:: TarFileCompat(filename[, mode[, compression]]) + + Class for limited access to tar archives with a :mod:`zipfile`\ -like interface. + Please consult the documentation of the :mod:`zipfile` module for more details. + *compression* must be one of the following constants: + + + .. data:: TAR_PLAIN + + Constant for an uncompressed tar archive. + + + .. data:: TAR_GZIPPED + + Constant for a :mod:`gzip` compressed tar archive. + + +.. exception:: TarError + + Base class for all :mod:`tarfile` exceptions. + + +.. exception:: ReadError + + Is raised when a tar archive is opened that either cannot be handled by the + :mod:`tarfile` module or is somehow invalid. + + +.. exception:: CompressionError + + Is raised when a compression method is not supported or when the data cannot be + decoded properly. + + +.. exception:: StreamError + + Is raised for the limitations that are typical for stream-like :class:`TarFile` + objects. + + +.. exception:: ExtractError + + Is raised for *non-fatal* errors when using :meth:`extract`, but only if + :attr:`TarFile.errorlevel`\ ``== 2``.
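The open modes and exceptions described so far can be tried out with a short, self-contained sketch. It assumes Python 3 syntax and a :class:`TarFile` that can be used in a ``with`` statement; the file names ``sample.tar.gz``, ``hello.txt`` and ``bogus.tar`` are made up for illustration.

```python
import io
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "sample.tar.gz")
    payload = b"hello tar\n"

    # 'w:gz' writes a gzip compressed archive; the single member is
    # built in memory from a TarInfo header plus a file-like object.
    with tarfile.open(archive, "w:gz") as tar:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    # 'r' behaves like 'r:*' and detects the compression transparently.
    with tarfile.open(archive, "r") as tar:
        names = tar.getnames()
        data = tar.extractfile("hello.txt").read()

    # A file that is not a tar archive is rejected.
    bogus = os.path.join(tmp, "bogus.tar")
    with open(bogus, "wb") as fh:
        fh.write(b"this is not a tar archive")
    looks_like_tar = tarfile.is_tarfile(bogus)
    try:
        tarfile.open(bogus, "r")
        read_error_raised = False
    except tarfile.ReadError:
        read_error_raised = True
```

Checking with :func:`is_tarfile` first and catching :exc:`ReadError` are alternatives; which one reads better depends on whether the caller wants a boolean answer or an exception.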
+ + +.. exception:: HeaderError + + Is raised by :meth:`frombuf` if the buffer it gets is invalid. + + .. versionadded:: 2.6 + +Each of the following constants defines a tar archive format that the +:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for +details. + + +.. data:: USTAR_FORMAT + + POSIX.1-1988 (ustar) format. + + +.. data:: GNU_FORMAT + + GNU tar format. + + +.. data:: PAX_FORMAT + + POSIX.1-2001 (pax) format. + + +.. data:: DEFAULT_FORMAT + + The default format for creating archives. This is currently :const:`GNU_FORMAT`. + + +.. seealso:: + + Module :mod:`zipfile` + Documentation of the :mod:`zipfile` standard module. + + `GNU tar manual, Basic Tar Format <http://www.gnu.org/software/tar/manual/html_node/tar_134.html#SEC134>`_ + Documentation for tar archive files, including GNU tar extensions. + +.. % ----------------- +.. % TarFile Objects +.. % ----------------- + + +.. _tarfile-objects: + +TarFile Objects +--------------- + +The :class:`TarFile` object provides an interface to a tar archive. A tar +archive is a sequence of blocks. An archive member (a stored file) is made up of +a header block followed by data blocks. It is possible to store a file in a tar +archive several times. Each archive member is represented by a :class:`TarInfo` +object, see :ref:`tarinfo-objects` for details. + + +.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=None, errors=None, pax_headers=None, debug=0, errorlevel=0) + + All following arguments are optional and can be accessed as instance attributes + as well. + + *name* is the pathname of the archive. It can be omitted if *fileobj* is given. + In this case, the file object's :attr:`name` attribute is used if it exists. + + *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append + data to an existing file or ``'w'`` to create a new file overwriting an existing + one. 
+ + If *fileobj* is given, it is used for reading or writing data. If it can be + determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used + from position 0. + + .. note:: + + *fileobj* is not closed when :class:`TarFile` is closed. + + *format* controls the archive format. It must be one of the constants + :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are + defined at module level. + + .. versionadded:: 2.6 + + The *tarinfo* argument can be used to replace the default :class:`TarInfo` class + with a different one. + + .. versionadded:: 2.6 + + If *dereference* is ``False``, add symbolic and hard links to the archive. If it + is ``True``, add the content of the target files to the archive. This has no + effect on systems that do not support symbolic links. + + If *ignore_zeros* is ``False``, treat an empty block as the end of the archive. + If it is ``True``, skip empty (and invalid) blocks and try to get as many members + as possible. This is only useful for reading concatenated or damaged archives. + + *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug + messages). The messages are written to ``sys.stderr``. + + If *errorlevel* is ``0``, all errors are ignored when using :meth:`extract`. + Nevertheless, they appear as error messages in the debug output when debugging + is enabled. If ``1``, all *fatal* errors are raised as :exc:`OSError` or + :exc:`IOError` exceptions. If ``2``, all *non-fatal* errors are raised as + :exc:`TarError` exceptions as well. + + The *encoding* and *errors* arguments control the way strings are converted to + unicode objects and vice versa. The default settings will work for most users. + See section :ref:`tar-unicode` for in-depth information. + + .. versionadded:: 2.6 + + The *pax_headers* argument is an optional dictionary of unicode strings which + will be added as a pax global header if *format* is :const:`PAX_FORMAT`. + + .. versionadded:: 2.6 + + +..
method:: TarFile.open(...) + + Alternative constructor. The :func:`open` function on module level is actually a + shortcut to this classmethod. See section :ref:`tarfile-mod` for details. + + +.. method:: TarFile.getmember(name) + + Return a :class:`TarInfo` object for member *name*. If *name* can not be found + in the archive, :exc:`KeyError` is raised. + + .. note:: + + If a member occurs more than once in the archive, its last occurrence is assumed + to be the most up-to-date version. + + +.. method:: TarFile.getmembers() + + Return the members of the archive as a list of :class:`TarInfo` objects. The + list has the same order as the members in the archive. + + +.. method:: TarFile.getnames() + + Return the members as a list of their names. It has the same order as the list + returned by :meth:`getmembers`. + + +.. method:: TarFile.list(verbose=True) + + Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`, + only the names of the members are printed. If it is :const:`True`, output + similar to that of :program:`ls -l` is produced. + + +.. method:: TarFile.next() + + Return the next member of the archive as a :class:`TarInfo` object, when + :class:`TarFile` is opened for reading. Return ``None`` if there is no more + available. + + +.. method:: TarFile.extractall([path[, members]]) + + Extract all members from the archive to the current working directory or + directory *path*. If optional *members* is given, it must be a subset of the + list returned by :meth:`getmembers`. Directory information like owner, + modification time and permissions are set after all members have been extracted. + This is done to work around two problems: A directory's modification time is + reset each time a file is created in it. And, if a directory's permissions do + not allow writing, extracting files to it will fail. + + .. versionadded:: 2.5 + + +.. 
method:: TarFile.extract(member[, path]) + + Extract a member from the archive to the current working directory, using its + full name. Its file information is extracted as accurately as possible. *member* + may be a filename or a :class:`TarInfo` object. You can specify a different + directory using *path*. + + .. note:: + + Because the :meth:`extract` method allows random access to a tar archive there + are some issues you must take care of yourself. See the description for + :meth:`extractall` above. + + +.. method:: TarFile.extractfile(member) + + Extract a member from the archive as a file object. *member* may be a filename + or a :class:`TarInfo` object. If *member* is a regular file, a file-like object + is returned. If *member* is a link, a file-like object is constructed from the + link's target. If *member* is none of the above, ``None`` is returned. + + .. note:: + + The file-like object is read-only and provides the following methods: + :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`seek`, :meth:`tell`. + + +.. method:: TarFile.add(name[, arcname[, recursive[, exclude]]]) + + Add the file *name* to the archive. *name* may be any type of file (directory, + fifo, symbolic link, etc.). If given, *arcname* specifies an alternative name + for the file in the archive. Directories are added recursively by default. This + can be avoided by setting *recursive* to :const:`False`. If *exclude* is given + it must be a function that takes one filename argument and returns a boolean + value. Depending on this value the respective file is either excluded + (:const:`True`) or added (:const:`False`). + + .. versionchanged:: 2.6 + Added the *exclude* parameter. + + +.. method:: TarFile.addfile(tarinfo[, fileobj]) + + Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given, + ``tarinfo.size`` bytes are read from it and added to the archive. You can + create :class:`TarInfo` objects using :meth:`gettarinfo`. + + .. 
note:: + + On Windows platforms, *fileobj* should always be opened with mode ``'rb'`` to + avoid confusion about the file size. + + +.. method:: TarFile.gettarinfo([name[, arcname[, fileobj]]]) + + Create a :class:`TarInfo` object for either the file *name* or the file object + *fileobj* (using :func:`os.fstat` on its file descriptor). You can modify some + of the :class:`TarInfo`'s attributes before you add it using :meth:`addfile`. + If given, *arcname* specifies an alternative name for the file in the archive. + + +.. method:: TarFile.close() + + Close the :class:`TarFile`. In write mode, two finishing zero blocks are + appended to the archive. + + +.. attribute:: TarFile.posix + + Setting this to :const:`True` is equivalent to setting the :attr:`format` + attribute to :const:`USTAR_FORMAT`, :const:`False` is equivalent to + :const:`GNU_FORMAT`. + + .. versionchanged:: 2.4 + *posix* defaults to :const:`False`. + + .. deprecated:: 2.6 + Use the :attr:`format` attribute instead. + + +.. attribute:: TarFile.pax_headers + + A dictionary containing key-value pairs of pax global headers. + + .. versionadded:: 2.6 + +.. % ----------------- +.. % TarInfo Objects +.. % ----------------- + + +.. _tarinfo-objects: + +TarInfo Objects +--------------- + +A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside +from storing all required attributes of a file (like file type, size, time, +permissions, owner etc.), it provides some useful methods to determine its type. +It does *not* contain the file's data itself. + +:class:`TarInfo` objects are returned by :class:`TarFile`'s methods +:meth:`getmember`, :meth:`getmembers` and :meth:`gettarinfo`. + + +.. class:: TarInfo([name]) + + Create a :class:`TarInfo` object. + + +.. method:: TarInfo.frombuf(buf) + + Create and return a :class:`TarInfo` object from string buffer *buf*. + + .. versionadded:: 2.6 + Raises :exc:`HeaderError` if the buffer is invalid. + + +..
method:: TarInfo.fromtarfile(tarfile) + + Read the next member from the :class:`TarFile` object *tarfile* and return it as + a :class:`TarInfo` object. + + .. versionadded:: 2.6 + + +.. method:: TarInfo.tobuf([format[, encoding[, errors]]]) + + Create a string buffer from a :class:`TarInfo` object. For information on the + arguments see the constructor of the :class:`TarFile` class. + + .. versionchanged:: 2.6 + The arguments were added. + +A ``TarInfo`` object has the following public data attributes: + + +.. attribute:: TarInfo.name + + Name of the archive member. + + +.. attribute:: TarInfo.size + + Size in bytes. + + +.. attribute:: TarInfo.mtime + + Time of last modification. + + +.. attribute:: TarInfo.mode + + Permission bits. + + +.. attribute:: TarInfo.type + + File type. *type* is usually one of these constants: :const:`REGTYPE`, + :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`, + :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`, + :const:`GNUTYPE_SPARSE`. To determine the type of a :class:`TarInfo` object + more conveniently, use the ``is*()`` methods below. + + +.. attribute:: TarInfo.linkname + + Name of the target file, which is only present in :class:`TarInfo` objects + of type :const:`LNKTYPE` and :const:`SYMTYPE`. + + +.. attribute:: TarInfo.uid + + User ID of the user who originally stored this member. + + +.. attribute:: TarInfo.gid + + Group ID of the user who originally stored this member. + + +.. attribute:: TarInfo.uname + + User name. + + +.. attribute:: TarInfo.gname + + Group name. + + +.. attribute:: TarInfo.pax_headers + + A dictionary containing key-value pairs of an associated pax extended header. + + .. versionadded:: 2.6 + +A :class:`TarInfo` object also provides some convenient query methods: + + +.. method:: TarInfo.isfile() + + Return :const:`True` if the :class:`TarInfo` object is a regular file. + + +.. method:: TarInfo.isreg() + + Same as :meth:`isfile`. + + +..
method:: TarInfo.isdir() + + Return :const:`True` if it is a directory. + + +.. method:: TarInfo.issym() + + Return :const:`True` if it is a symbolic link. + + +.. method:: TarInfo.islnk() + + Return :const:`True` if it is a hard link. + + +.. method:: TarInfo.ischr() + + Return :const:`True` if it is a character device. + + +.. method:: TarInfo.isblk() + + Return :const:`True` if it is a block device. + + +.. method:: TarInfo.isfifo() + + Return :const:`True` if it is a FIFO. + + +.. method:: TarInfo.isdev() + + Return :const:`True` if it is one of character device, block device or FIFO. + +.. % ------------------------ +.. % Examples +.. % ------------------------ + + +.. _tar-examples: + +Examples +-------- + +How to extract an entire tar archive to the current working directory:: + + import tarfile + tar = tarfile.open("sample.tar.gz") + tar.extractall() + tar.close() + +How to create an uncompressed tar archive from a list of filenames:: + + import tarfile + tar = tarfile.open("sample.tar", "w") + for name in ["foo", "bar", "quux"]: + tar.add(name) + tar.close() + +How to read a gzip compressed tar archive and display some member information:: + + import tarfile + tar = tarfile.open("sample.tar.gz", "r:gz") + for tarinfo in tar: + print tarinfo.name, "is", tarinfo.size, "bytes in size and is", + if tarinfo.isreg(): + print "a regular file." + elif tarinfo.isdir(): + print "a directory." + else: + print "something else." 
+ tar.close() + +How to create a tar archive with faked information:: + + import tarfile + tar = tarfile.open("sample.tar.gz", "w:gz") + for name in namelist: + tarinfo = tar.gettarinfo(name, "fakeproj-1.0/" + name) + tarinfo.uid = 123 + tarinfo.gid = 456 + tarinfo.uname = "johndoe" + tarinfo.gname = "fake" + tar.addfile(tarinfo, file(name)) + tar.close() + +The *only* way to extract an uncompressed tar stream from ``sys.stdin``:: + + import sys + import tarfile + tar = tarfile.open(mode="r|", fileobj=sys.stdin) + for tarinfo in tar: + tar.extract(tarinfo) + tar.close() + +.. % ------------ +.. % Tar format +.. % ------------ + + +.. _tar-formats: + +Supported tar formats +--------------------- + +There are three tar formats that can be created with the :mod:`tarfile` module: + +* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames + up to a length of at best 256 characters and linknames up to 100 characters. The + maximum file size is 8 gigabytes. This is an old and limited but widely + supported format. + +* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and + linknames, files bigger than 8 gigabytes and sparse files. It is the de facto + standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar + extensions for long names, sparse file support is read-only. + +* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible + format with virtually no limits. It supports long filenames and linknames, large + files and stores pathnames in a portable way. However, not all tar + implementations today are able to handle pax archives properly. + + The *pax* format is an extension to the existing *ustar* format. It uses extra + headers for information that cannot be stored otherwise. There are two flavours + of pax headers: Extended headers only affect the subsequent file header, global + headers are valid for the complete archive and affect all following files. 
All + the data in a pax header is encoded in *UTF-8* for portability reasons. + +There are some more variants of the tar format which can be read, but not +created: + +* The ancient V7 format. This is the first tar format from Unix Seventh Edition, + storing only regular files and directories. Names must not be longer than 100 + characters, there is no user/group name information. Some archives have + miscalculated header checksums in case of fields with non-ASCII characters. + +* The SunOS tar extended format. This format is a variant of the POSIX.1-2001 + pax format, but is not compatible. + +.. % ---------------- +.. % Unicode issues +.. % ---------------- + + +.. _tar-unicode: + +Unicode issues +-------------- + +The tar format was originally conceived to make backups on tape drives with the +main focus on preserving file system information. Nowadays tar archives are +commonly used for file distribution and exchanging archives over networks. One +problem of the original format (that all other formats are merely variants of) +is that there is no concept of supporting different character encodings. For +example, an ordinary tar archive created on a *UTF-8* system cannot be read +correctly on a *Latin-1* system if it contains non-ASCII characters. Names (i.e. +filenames, linknames, user/group names) containing these characters will appear +damaged. Unfortunately, there is no way to autodetect the encoding of an +archive. + +The pax format was designed to solve this problem. It stores non-ASCII names +using the universal character encoding *UTF-8*. When a pax archive is read, +these *UTF-8* names are converted to the encoding of the local file system. + +The details of unicode conversion are controlled by the *encoding* and *errors* +keyword arguments of the :class:`TarFile` class. + +The default value for *encoding* is the local character encoding. It is deduced +from :func:`sys.getfilesystemencoding` and :func:`sys.getdefaultencoding`. 
In +read mode, *encoding* is used exclusively to convert unicode names from a pax +archive to strings in the local character encoding. In write mode, the use of +*encoding* depends on the chosen archive format. In case of :const:`PAX_FORMAT`, +input names that contain non-ASCII characters need to be decoded before being +stored as *UTF-8* strings. The other formats do not make use of *encoding* +unless unicode objects are used as input names. These are converted to 8-bit +character strings before they are added to the archive. + +The *errors* argument defines how characters are treated that cannot be +converted to or from *encoding*. Possible values are listed in section +:ref:`codec-base-classes`. In read mode, there is an additional scheme +``'utf-8'`` which means that bad characters are replaced by their *UTF-8* +representation. This is the default scheme. In write mode the default value for +*errors* is ``'strict'`` to ensure that name information is not altered +unnoticed. + diff --git a/Doc/library/telnetlib.rst b/Doc/library/telnetlib.rst new file mode 100644 index 0000000..f6ab852 --- /dev/null +++ b/Doc/library/telnetlib.rst @@ -0,0 +1,246 @@ + +:mod:`telnetlib` --- Telnet client +================================== + +.. module:: telnetlib + :synopsis: Telnet client class. +.. sectionauthor:: Skip Montanaro <skip@mojam.com> + + +.. index:: single: protocol; Telnet + +The :mod:`telnetlib` module provides a :class:`Telnet` class that implements the +Telnet protocol. See :rfc:`854` for details about the protocol. In addition, it +provides symbolic constants for the protocol characters (see below), and for the +telnet options. The symbolic names of the telnet options follow the definitions +in ``arpa/telnet.h``, with the leading ``TELOPT_`` removed. For symbolic names +of options which are traditionally not included in ``arpa/telnet.h``, see the +module source itself. 
+ +The symbolic constants for the telnet commands are: IAC, DONT, DO, WONT, WILL, +SE (Subnegotiation End), NOP (No Operation), DM (Data Mark), BRK (Break), IP +(Interrupt Process), AO (Abort Output), AYT (Are You There), EC (Erase +Character), EL (Erase Line), GA (Go Ahead), SB (Subnegotiation Begin). + + +.. class:: Telnet([host[, port[, timeout]]]) + + :class:`Telnet` represents a connection to a Telnet server. The instance is + initially not connected by default; the :meth:`open` method must be used to + establish a connection. Alternatively, the host name and optional port number + can be passed to the constructor too, in which case the connection to the server + will be established before the constructor returns. The optional *timeout* + parameter specifies a timeout in seconds for the connection attempt (if not + specified, or passed as ``None``, the global default timeout setting will be used). + + Do not reopen an already connected instance. + + This class has many :meth:`read_\*` methods. Note that some of them raise + :exc:`EOFError` when the end of the connection is read, because they can return + an empty string for other reasons. See the individual descriptions below. + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. seealso:: + + :rfc:`854` - Telnet Protocol Specification + Definition of the Telnet protocol. + + +.. _telnet-objects: + +Telnet Objects +-------------- + +:class:`Telnet` instances have the following methods: + + +.. method:: Telnet.read_until(expected[, timeout]) + + Read until a given string, *expected*, is encountered or until *timeout* seconds + have passed. + + When no match is found, return whatever is available instead, possibly the empty + string. Raise :exc:`EOFError` if the connection is closed and no cooked data is + available. + + +.. method:: Telnet.read_all() + + Read all data until EOF; block until connection closed. + + +.. method:: Telnet.read_some() + + Read at least one byte of cooked data unless EOF is hit.
Return ``''`` if EOF is + hit. Block if no data is immediately available. + + +.. method:: Telnet.read_very_eager() + + Read everything that can be without blocking in I/O (eager). + + Raise :exc:`EOFError` if connection closed and no cooked data available. Return + ``''`` if no cooked data available otherwise. Do not block unless in the midst + of an IAC sequence. + + +.. method:: Telnet.read_eager() + + Read readily available data. + + Raise :exc:`EOFError` if connection closed and no cooked data available. Return + ``''`` if no cooked data available otherwise. Do not block unless in the midst + of an IAC sequence. + + +.. method:: Telnet.read_lazy() + + Process and return data already in the queues (lazy). + + Raise :exc:`EOFError` if connection closed and no data available. Return ``''`` + if no cooked data available otherwise. Do not block unless in the midst of an + IAC sequence. + + +.. method:: Telnet.read_very_lazy() + + Return any data available in the cooked queue (very lazy). + + Raise :exc:`EOFError` if connection closed and no data available. Return ``''`` + if no cooked data available otherwise. This method never blocks. + + +.. method:: Telnet.read_sb_data() + + Return the data collected between a SB/SE pair (suboption begin/end). The + callback should access these data when it was invoked with a ``SE`` command. + This method never blocks. + + .. versionadded:: 2.3 + + +.. method:: Telnet.open(host[, port[, timeout]]) + + Connect to a host. The optional second argument is the port number, which + defaults to the standard Telnet port (23). The optional *timeout* parameter + specifies a timeout in seconds for the connection attempt (if not specified, or + passed as None, the global default timeout setting will be used). + + Do not try to reopen an already connected instance. + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. method:: Telnet.msg(msg[, *args]) + + Print a debug message when the debug level is ``>`` 0. 
If extra arguments are + present, they are substituted in the message using the standard string + formatting operator. + + +.. method:: Telnet.set_debuglevel(debuglevel) + + Set the debug level. The higher the value of *debuglevel*, the more debug + output you get (on ``sys.stdout``). + + +.. method:: Telnet.close() + + Close the connection. + + +.. method:: Telnet.get_socket() + + Return the socket object used internally. + + +.. method:: Telnet.fileno() + + Return the file descriptor of the socket object used internally. + + +.. method:: Telnet.write(buffer) + + Write a string to the socket, doubling any IAC characters. This can block if the + connection is blocked. May raise :exc:`socket.error` if the connection is + closed. + + +.. method:: Telnet.interact() + + Interaction function, emulates a very dumb Telnet client. + + +.. method:: Telnet.mt_interact() + + Multithreaded version of :meth:`interact`. + + +.. method:: Telnet.expect(list[, timeout]) + + Read until one of a list of regular expressions matches. + + The first argument is a list of regular expressions, either compiled + (:class:`re.RegexObject` instances) or uncompiled (strings). The optional second + argument is a timeout, in seconds; the default is to block indefinitely. + + Return a tuple of three items: the index in the list of the first regular + expression that matches; the match object returned; and the text read up to + and including the match. + + If end of file is found and no text was read, raise :exc:`EOFError`. Otherwise, + when nothing matches, return ``(-1, None, text)`` where *text* is the text + received so far (may be the empty string if a timeout happened). + + If a regular expression ends with a greedy match (such as ``.*``) or if more + than one expression can match the same input, the results are non-deterministic, + and may depend on the I/O timing. + + +..
method:: Telnet.set_option_negotiation_callback(callback) + + Each time a telnet option is read on the input flow, this *callback* (if set) is + called with the following parameters : callback(telnet socket, command + (DO/DONT/WILL/WONT), option). No other action is done afterwards by telnetlib. + + +.. _telnet-example: + +Telnet Example +-------------- + +.. sectionauthor:: Peter Funk <pf@artcom-gmbh.de> + + +A simple example illustrating typical use:: + + import getpass + import sys + import telnetlib + + def raw_input(prompt): + sys.stdout.write(prompt) + sys.stdout.flush() + return sys.stdin.readline() + + HOST = "localhost" + user = raw_input("Enter your remote account: ") + password = getpass.getpass() + + tn = telnetlib.Telnet(HOST) + + tn.read_until("login: ") + tn.write(user + "\n") + if password: + tn.read_until("Password: ") + tn.write(password + "\n") + + tn.write("ls\n") + tn.write("exit\n") + + print tn.read_all() + diff --git a/Doc/library/tempfile.rst b/Doc/library/tempfile.rst new file mode 100644 index 0000000..cafdd05 --- /dev/null +++ b/Doc/library/tempfile.rst @@ -0,0 +1,216 @@ + +:mod:`tempfile` --- Generate temporary files and directories +============================================================ + +.. sectionauthor:: Zack Weinberg <zack@codesourcery.com> + + +.. module:: tempfile + :synopsis: Generate temporary files and directories. + + +.. index:: + pair: temporary; file name + pair: temporary; file + +This module generates temporary files and directories. It works on all +supported platforms. + +In version 2.3 of Python, this module was overhauled for enhanced security. It +now provides three new functions, :func:`NamedTemporaryFile`, :func:`mkstemp`, +and :func:`mkdtemp`, which should eliminate all remaining need to use the +insecure :func:`mktemp` function. Temporary file names created by this module +no longer contain the process ID; instead a string of six random characters is +used. 
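A minimal sketch of the secure replacements for :func:`mktemp` follows. It assumes Python 3 syntax; the prefixes, the suffix and the variable names are illustrative only. :func:`mkdtemp` returns the path of a freshly created directory, and :func:`mkstemp` returns a tuple of an open OS-level file descriptor and the path of the new file. In both cases the caller is responsible for cleaning up.

```python
import os
import tempfile

# mkdtemp() creates a private directory and returns its path.
workdir = tempfile.mkdtemp(prefix="demo-")

# mkstemp() creates the file atomically and returns (descriptor, path),
# so there is no window in which another process could claim the name.
fd, path = tempfile.mkstemp(suffix=".txt", prefix="report-", dir=workdir)
try:
    # Wrap the raw descriptor in a file object to write to it.
    with os.fdopen(fd, "w") as fh:
        fh.write("hello\n")
    with open(path) as fh:
        content = fh.read()
    created_name = os.path.basename(path)
finally:
    # Neither function removes anything automatically.
    os.remove(path)
    os.rmdir(workdir)
```

Passing *suffix*, *prefix* and *dir* as keyword arguments, as done here, avoids having to remember their positional order.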
+ +Also, all the user-callable functions now take additional arguments which allow +direct control over the location and name of temporary files. It is no longer +necessary to use the global *tempdir* and *template* variables. To maintain +backward compatibility, the argument order is somewhat odd; it is recommended to +use keyword arguments for clarity. + +The module defines the following user-callable functions: + + +.. function:: TemporaryFile([mode='w+b'[, bufsize=-1[, suffix[, prefix[, dir]]]]]) + + Return a file (or file-like) object that can be used as a temporary storage + area. The file is created using :func:`mkstemp`. It will be destroyed as soon + as it is closed (including an implicit close when the object is garbage + collected). Under Unix, the directory entry for the file is removed immediately + after the file is created. Other platforms do not support this; your code + should not rely on a temporary file created using this function having or not + having a visible name in the file system. + + The *mode* parameter defaults to ``'w+b'`` so that the file created can be read + and written without being closed. Binary mode is used so that it behaves + consistently on all platforms without regard for the data that is stored. + *bufsize* defaults to ``-1``, meaning that the operating system default is used. + + The *dir*, *prefix* and *suffix* parameters are passed to :func:`mkstemp`. + + +.. function:: NamedTemporaryFile([mode='w+b'[, bufsize=-1[, suffix[, prefix[, dir[, delete]]]]]]) + + This function operates exactly as :func:`TemporaryFile` does, except that the + file is guaranteed to have a visible name in the file system (on Unix, the + directory entry is not unlinked). That name can be retrieved from the + :attr:`name` member of the file object. Whether the name can be used to open + the file a second time, while the named temporary file is still open, varies + across platforms (it can be so used on Unix; it cannot on Windows NT or later). 
+ If *delete* is true (the default), the file is deleted as soon as it is closed. + + .. versionadded:: 2.3 + + .. versionadded:: 2.6 + The *delete* parameter. + + +.. function:: SpooledTemporaryFile([max_size=0, [mode='w+b'[, bufsize=-1[, suffix[, prefix[, dir]]]]]]) + + This function operates exactly as :func:`TemporaryFile` does, except that data + is spooled in memory until the file size exceeds *max_size*, or until the file's + :func:`fileno` method is called, at which point the contents are written to disk + and operation proceeds as with :func:`TemporaryFile`. + + The resulting file has one additional method, :func:`rollover`, which causes the + file to roll over to an on-disk file regardless of its size. + + .. versionadded:: 2.6 + + +.. function:: mkstemp([suffix[, prefix[, dir[, text]]]]) + + Creates a temporary file in the most secure manner possible. There are no + race conditions in the file's creation, assuming that the platform properly + implements the :const:`os.O_EXCL` flag for :func:`os.open`. The file is + readable and writable only by the creating user ID. If the platform uses + permission bits to indicate whether a file is executable, the file is + executable by no one. The file descriptor is not inherited by child + processes. + + Unlike :func:`TemporaryFile`, the user of :func:`mkstemp` is responsible for + deleting the temporary file when done with it. + + If *suffix* is specified, the file name will end with that suffix, otherwise + there will be no suffix. :func:`mkstemp` does not put a dot between the file + name and the suffix; if you need one, put it at the beginning of *suffix*. + + If *prefix* is specified, the file name will begin with that prefix; otherwise, + a default prefix is used. + + If *dir* is specified, the file will be created in that directory; otherwise, + a default directory is used. 
The default directory is chosen from a + platform-dependent list, but the user of the application can control the + directory location by setting the *TMPDIR*, *TEMP* or *TMP* environment + variables. There is thus no guarantee that the generated filename will have + any nice properties, such as not requiring quoting when passed to external + commands via ``os.popen()``. + + If *text* is specified, it indicates whether to open the file in binary mode + (the default) or text mode. On some platforms, this makes no difference. + + :func:`mkstemp` returns a tuple containing an OS-level handle to an open file + (as would be returned by :func:`os.open`) and the absolute pathname of that + file, in that order. + + .. versionadded:: 2.3 + + +.. function:: mkdtemp([suffix[, prefix[, dir]]]) + + Creates a temporary directory in the most secure manner possible. There are no + race conditions in the directory's creation. The directory is readable, + writable, and searchable only by the creating user ID. + + The user of :func:`mkdtemp` is responsible for deleting the temporary directory + and its contents when done with it. + + The *prefix*, *suffix*, and *dir* arguments are the same as for :func:`mkstemp`. + + :func:`mkdtemp` returns the absolute pathname of the new directory. + + .. versionadded:: 2.3 + + +.. function:: mktemp([suffix[, prefix[, dir]]]) + + .. deprecated:: 2.3 + Use :func:`mkstemp` instead. + + Return an absolute pathname of a file that did not exist at the time the call is + made. The *prefix*, *suffix*, and *dir* arguments are the same as for + :func:`mkstemp`. + + .. warning:: + + Use of this function may introduce a security hole in your program. By the time + you get around to doing anything with the file name it returns, someone else may + have beaten you to the punch. + +The module uses two global variables that tell it how to construct a temporary +name. They are initialized at the first call to any of the functions above. 
+The caller may change them, but this is discouraged; use the appropriate +function arguments, instead. + + +.. data:: tempdir + + When set to a value other than ``None``, this variable defines the default value + for the *dir* argument to all the functions defined in this module. + + If ``tempdir`` is unset or ``None`` at any call to any of the above functions, + Python searches a standard list of directories and sets *tempdir* to the first + one which the calling user can create files in. The list is: + + #. The directory named by the :envvar:`TMPDIR` environment variable. + + #. The directory named by the :envvar:`TEMP` environment variable. + + #. The directory named by the :envvar:`TMP` environment variable. + + #. A platform-specific location: + + * On RiscOS, the directory named by the :envvar:`Wimp$ScrapDir` environment + variable. + + * On Windows, the directories :file:`C:\\TEMP`, :file:`C:\\TMP`, + :file:`\\TEMP`, and :file:`\\TMP`, in that order. + + * On all other platforms, the directories :file:`/tmp`, :file:`/var/tmp`, and + :file:`/usr/tmp`, in that order. + + #. As a last resort, the current working directory. + + +.. function:: gettempdir() + + Return the directory currently selected to create temporary files in. If + :data:`tempdir` is not ``None``, this simply returns its contents; otherwise, + the search described above is performed, and the result returned. + + +.. data:: template + + .. deprecated:: 2.0 + Use :func:`gettempprefix` instead. + + When set to a value other than ``None``, this variable defines the prefix of the + final component of the filenames returned by :func:`mktemp`. A string of six + random letters and digits is appended to the prefix to make the filename unique. + On Windows, the default prefix is :file:`~T`; on all other systems it is + :file:`tmp`. + + Older versions of this module used to require that ``template`` be set to + ``None`` after a call to :func:`os.fork`; this has not been necessary since + version 1.5.2. 
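The directory search described above can be observed without assigning to the global variables. This sketch, written for a current Python 3 interpreter, reads the selected directory with :func:`gettempdir` and then passes *dir* explicitly as a keyword argument; the variable names are illustrative only.

```python
import os
import tempfile

# Directory chosen by the search (TMPDIR, TEMP, TMP, then platform defaults).
default_dir = tempfile.gettempdir()

# Prefer the dir keyword argument over assigning to tempfile.tempdir.
scratch_dir = tempfile.mkdtemp(dir=default_dir)
fd, path = tempfile.mkstemp(dir=scratch_dir, suffix=".log")
os.close(fd)

# The new file lives directly inside the directory that was passed in.
in_scratch = os.path.dirname(path) == os.path.abspath(scratch_dir)

os.remove(path)
os.rmdir(scratch_dir)
```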
+ + +.. function:: gettempprefix() + + Return the filename prefix used to create temporary files. This does not + contain the directory component. Using this function is preferred over reading + the *template* variable directly. + + .. versionadded:: 1.5.2 + diff --git a/Doc/library/termios.rst b/Doc/library/termios.rst new file mode 100644 index 0000000..695faad --- /dev/null +++ b/Doc/library/termios.rst @@ -0,0 +1,111 @@ + +:mod:`termios` --- POSIX style tty control +========================================== + +.. module:: termios + :platform: Unix + :synopsis: POSIX style tty control. + + +.. index:: + pair: POSIX; I/O control + pair: tty; I/O control + +This module provides an interface to the POSIX calls for tty I/O control. For a +complete description of these calls, see the POSIX or Unix manual pages. It is +only available for those Unix versions that support POSIX *termios* style tty +I/O control (and then only if configured at installation time). + +All functions in this module take a file descriptor *fd* as their first +argument. This can be an integer file descriptor, such as returned by +``sys.stdin.fileno()``, or a file object, such as ``sys.stdin`` itself. + +This module also defines all the constants needed to work with the functions +provided here; these have the same name as their counterparts in C. Please +refer to your system documentation for more information on using these terminal +control interfaces. + +The module defines the following functions: + + +.. function:: tcgetattr(fd) + + Return a list containing the tty attributes for file descriptor *fd*, as + follows: ``[iflag, oflag, cflag, lflag, ispeed, ospeed, cc]`` where *cc* is a + list of the tty special characters (each a string of length 1, except the + items with indices :const:`VMIN` and :const:`VTIME`, which are integers when + these fields are defined). 
The interpretation of the flags and the speeds as + well as the indexing in the *cc* array must be done using the symbolic + constants defined in the :mod:`termios` module. + + +.. function:: tcsetattr(fd, when, attributes) + + Set the tty attributes for file descriptor *fd* from the *attributes*, which is + a list like the one returned by :func:`tcgetattr`. The *when* argument + determines when the attributes are changed: :const:`TCSANOW` to change + immediately, :const:`TCSADRAIN` to change after transmitting all queued output, + or :const:`TCSAFLUSH` to change after transmitting all queued output and + discarding all queued input. + + +.. function:: tcsendbreak(fd, duration) + + Send a break on file descriptor *fd*. A zero *duration* sends a break for 0.25 + --0.5 seconds; a nonzero *duration* has a system dependent meaning. + + +.. function:: tcdrain(fd) + + Wait until all output written to file descriptor *fd* has been transmitted. + + +.. function:: tcflush(fd, queue) + + Discard queued data on file descriptor *fd*. The *queue* selector specifies + which queue: :const:`TCIFLUSH` for the input queue, :const:`TCOFLUSH` for the + output queue, or :const:`TCIOFLUSH` for both queues. + + +.. function:: tcflow(fd, action) + + Suspend or resume input or output on file descriptor *fd*. The *action* + argument can be :const:`TCOOFF` to suspend output, :const:`TCOON` to restart + output, :const:`TCIOFF` to suspend input, or :const:`TCION` to restart input. + + +.. seealso:: + + Module :mod:`tty` + Convenience functions for common terminal control operations. + + +Example +------- + +.. _termios-example: + +Here's a function that prompts for a password with echoing turned off. Note the +technique using a separate :func:`tcgetattr` call and a :keyword:`try` ... 
+:keyword:`finally` statement to ensure that the old tty attributes are restored +exactly no matter what happens:: + + def raw_input(prompt): + import sys + sys.stdout.write(prompt) + sys.stdout.flush() + return sys.stdin.readline() + + def getpass(prompt = "Password: "): + import termios, sys + fd = sys.stdin.fileno() + old = termios.tcgetattr(fd) + new = termios.tcgetattr(fd) + new[3] = new[3] & ~termios.ECHO # lflags + try: + termios.tcsetattr(fd, termios.TCSADRAIN, new) + passwd = raw_input(prompt) + finally: + termios.tcsetattr(fd, termios.TCSADRAIN, old) + return passwd + diff --git a/Doc/library/test.rst b/Doc/library/test.rst new file mode 100644 index 0000000..8972091 --- /dev/null +++ b/Doc/library/test.rst @@ -0,0 +1,317 @@ + +:mod:`test` --- Regression tests package for Python +=================================================== + +.. module:: test + :synopsis: Regression tests package containing the testing suite for Python. +.. sectionauthor:: Brett Cannon <brett@python.org> + + +The :mod:`test` package contains all regression tests for Python as well as the +modules :mod:`test.test_support` and :mod:`test.regrtest`. +:mod:`test.test_support` is used to enhance your tests while +:mod:`test.regrtest` drives the testing suite. + +Each module in the :mod:`test` package whose name starts with ``test_`` is a +testing suite for a specific module or feature. All new tests should be written +using the :mod:`unittest` or :mod:`doctest` module. Some older tests are +written using a "traditional" testing style that compares output printed to +``sys.stdout``; this style of test is considered deprecated. + + +.. seealso:: + + Module :mod:`unittest` + Writing PyUnit regression tests. + + Module :mod:`doctest` + Tests embedded in documentation strings. + + +.. _writing-tests: + +Writing Unit Tests for the :mod:`test` package +---------------------------------------------- + +.. 
% + +It is preferred that tests that use the :mod:`unittest` module follow a few +guidelines. One is to name the test module by starting it with ``test_`` and end +it with the name of the module being tested. The test methods in the test module +should start with ``test_`` and end with a description of what the method is +testing. This is needed so that the methods are recognized by the test driver as +test methods. Also, no documentation string for the method should be included. A +comment (such as ``# Tests function returns only True or False``) should be used +to provide documentation for test methods. This is done because documentation +strings get printed out if they exist and thus what test is being run is not +stated. + +A basic boilerplate is often used:: + + import unittest + from test import test_support + + class MyTestCase1(unittest.TestCase): + + # Only use setUp() and tearDown() if necessary + + def setUp(self): + ... code to execute in preparation for tests ... + + def tearDown(self): + ... code to execute to clean up after tests ... + + def test_feature_one(self): + # Test feature one. + ... testing code ... + + def test_feature_two(self): + # Test feature two. + ... testing code ... + + ... more test methods ... + + class MyTestCase2(unittest.TestCase): + ... same structure as MyTestCase1 ... + + ... more test classes ... + + def test_main(): + test_support.run_unittest(MyTestCase1, + MyTestCase2, + ... list other tests ... + ) + + if __name__ == '__main__': + test_main() + +This boilerplate code allows the testing suite to be run by :mod:`test.regrtest` +as well as on its own as a script. + +The goal for regression testing is to try to break code. This leads to a few +guidelines to be followed: + +* The testing suite should exercise all classes, functions, and constants. This + includes not just the external API that is to be presented to the outside world + but also "private" code. 
+ +* Whitebox testing (examining the code being tested when the tests are being + written) is preferred. Blackbox testing (testing only the published user + interface) is not complete enough to make sure all boundary and edge cases are + tested. + +* Make sure all possible values are tested including invalid ones. This makes + sure that not only all valid values are acceptable but also that improper values + are handled correctly. + +* Exhaust as many code paths as possible. Test where branching occurs and thus + tailor input to make sure as many different paths through the code are taken. + +* Add an explicit test for any bugs discovered for the tested code. This will + make sure that the error does not crop up again if the code is changed in the + future. + +* Make sure to clean up after your tests (such as close and remove all temporary + files). + +* If a test is dependent on a specific condition of the operating system then + verify the condition already exists before attempting the test. + +* Import as few modules as possible and do it as soon as possible. This + minimizes external dependencies of tests and also minimizes possible anomalous + behavior from side-effects of importing a module. + +* Try to maximize code reuse. On occasion, tests will vary by something as small + as what type of input is used. Minimize code duplication by subclassing a basic + test class with a class that specifies the input:: + + class TestFuncAcceptsSequences(unittest.TestCase): + + func = mySuperWhammyFunction + + def test_func(self): + self.func(self.arg) + + class AcceptLists(TestFuncAcceptsSequences): + arg = [1,2,3] + + class AcceptStrings(TestFuncAcceptsSequences): + arg = 'abc' + + class AcceptTuples(TestFuncAcceptsSequences): + arg = (1,2,3) + + +.. seealso:: + + Test Driven Development + A book by Kent Beck on writing tests before code. + + +.. 
_regrtest: + +Running tests using :mod:`test.regrtest` +---------------------------------------- + +:mod:`test.regrtest` can be used as a script to drive Python's regression test +suite. Running the script by itself automatically starts running all regression +tests in the :mod:`test` package. It does this by finding all modules in the +package whose name starts with ``test_``, importing them, and executing the +function :func:`test_main` if present. The names of tests to execute may also be +passed to the script. Specifying a single regression test (:program:`python +regrtest.py` :option:`test_spam.py`) will minimize output and only print whether +the test passed or failed. + +Running :mod:`test.regrtest` directly lets you set which resources are +available for tests to use. You do this by using the :option:`-u` command-line +option. Run :program:`python regrtest.py` :option:`-uall` to turn on all +resources; specifying :option:`all` as an option for :option:`-u` enables all +possible resources. If all but one resource is desired (a more common case), a +comma-separated list of resources that are not desired may be listed after +:option:`all`. The command :program:`python regrtest.py` +:option:`-uall,-audio,-largefile` will run :mod:`test.regrtest` with all +resources except the :option:`audio` and :option:`largefile` resources. For a +list of all resources and more command-line options, run :program:`python +regrtest.py` :option:`-h`. + +Some other ways to execute the regression tests depend on what platform the +tests are being executed on. On Unix, you can run :program:`make` :option:`test` +at the top-level directory where Python was built. On Windows, executing +:program:`rt.bat` from your :file:`PCBuild` directory will run all regression +tests. + + +:mod:`test.test_support` --- Utility functions for tests +======================================================== + +.. 
module:: test.test_support + :synopsis: Support for Python regression tests. + + +The :mod:`test.test_support` module provides support for Python's regression +tests. + +This module defines the following exceptions: + + +.. exception:: TestFailed + + Exception to be raised when a test fails. This is deprecated in favor of + :mod:`unittest`\ -based tests and :class:`unittest.TestCase`'s assertion + methods. + + +.. exception:: TestSkipped + + Subclass of :exc:`TestFailed`. Raised when a test is skipped. This occurs when a + needed resource (such as a network connection) is not available at the time of + testing. + + +.. exception:: ResourceDenied + + Subclass of :exc:`TestSkipped`. Raised when a resource (such as a network + connection) is not available. Raised by the :func:`requires` function. + +The :mod:`test.test_support` module defines the following constants: + + +.. data:: verbose + + :const:`True` when verbose output is enabled. Should be checked when more + detailed information is desired about a running test. *verbose* is set by + :mod:`test.regrtest`. + + +.. data:: have_unicode + + :const:`True` when Unicode support is available. + + +.. data:: is_jython + + :const:`True` if the running interpreter is Jython. + + +.. data:: TESTFN + + Set to the path that a temporary file may be created at. Any temporary file + that is created should be closed and unlinked (removed). + +The :mod:`test.test_support` module defines the following functions: + + +.. function:: forget(module_name) + + Removes the module named *module_name* from ``sys.modules`` and deletes any + byte-compiled files of the module. + + +.. function:: is_resource_enabled(resource) + + Returns :const:`True` if *resource* is enabled and available. The list of + available resources is only set when :mod:`test.regrtest` is executing the + tests. + + +.. function:: requires(resource[, msg]) + + Raises :exc:`ResourceDenied` if *resource* is not available. 
*msg* is the + argument to :exc:`ResourceDenied` if it is raised. Always returns true if called + by a function whose ``__name__`` is ``'__main__'``. Used when tests are executed + by :mod:`test.regrtest`. + + +.. function:: findfile(filename) + + Return the path to the file named *filename*. If no match is found *filename* is + returned. This does not equal a failure since it could be the path to the file. + + +.. function:: run_unittest(*classes) + + Execute :class:`unittest.TestCase` subclasses passed to the function. The + function scans the classes for methods starting with the prefix ``test_`` and + executes the tests individually. + + It is also legal to pass strings as parameters; these should be keys in + ``sys.modules``. Each associated module will be scanned by + ``unittest.TestLoader.loadTestsFromModule()``. This is usually seen in the + following :func:`test_main` function:: + + def test_main(): + test_support.run_unittest(__name__) + + This will run all tests defined in the named module. + +The :mod:`test.test_support` module defines the following classes: + + +.. class:: TransientResource(exc[, **kwargs]) + + Instances are a context manager that raises :exc:`ResourceDenied` if the + specified exception type is raised. Any keyword arguments are treated as + attribute/value pairs to be compared against any exception raised within the + :keyword:`with` statement. Only if all pairs match properly against + attributes on the exception is :exc:`ResourceDenied` raised. + + .. versionadded:: 2.6 + + +.. class:: EnvironmentVarGuard() + + Class used to temporarily set or unset environment variables. Instances can be + used as a context manager. + + .. versionadded:: 2.6 + + +.. method:: EnvironmentVarGuard.set(envvar, value) + + Temporarily set the environment variable ``envvar`` to the value of ``value``. + + +.. method:: EnvironmentVarGuard.unset(envvar) + + Temporarily unset the environment variable ``envvar``. 
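The idea behind :class:`EnvironmentVarGuard` can be sketched with a small context manager. The class below is a hypothetical stand-in written for a current Python 3 interpreter, not the implementation in :mod:`test.test_support`; the name ``EnvGuardSketch`` and the variable ``SKETCH_VAR`` are assumptions made for illustration.

```python
import os


class EnvGuardSketch:
    """Hypothetical stand-in: remember changed variables, restore them on exit."""

    def __init__(self):
        self._saved = {}

    def _remember(self, envvar):
        # Record the original value (None if unset) only once per variable.
        if envvar not in self._saved:
            self._saved[envvar] = os.environ.get(envvar)

    def set(self, envvar, value):
        self._remember(envvar)
        os.environ[envvar] = value

    def unset(self, envvar):
        self._remember(envvar)
        os.environ.pop(envvar, None)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Put every touched variable back to its original state.
        for envvar, value in self._saved.items():
            if value is None:
                os.environ.pop(envvar, None)
            else:
                os.environ[envvar] = value
        self._saved.clear()
        return False


os.environ["SKETCH_VAR"] = "original"
with EnvGuardSketch() as env:
    env.set("SKETCH_VAR", "temporary")
    inside = os.environ["SKETCH_VAR"]
    env.unset("SKETCH_VAR")
    unset_inside = "SKETCH_VAR" not in os.environ
after = os.environ["SKETCH_VAR"]
del os.environ["SKETCH_VAR"]
```

Restoring in ``__exit__`` means the environment is put back even when the body of the :keyword:`with` statement raises, which is the property a test needs.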
+ diff --git a/Doc/library/textwrap.rst b/Doc/library/textwrap.rst new file mode 100644 index 0000000..f729a64 --- /dev/null +++ b/Doc/library/textwrap.rst @@ -0,0 +1,192 @@ + +:mod:`textwrap` --- Text wrapping and filling +============================================= + +.. module:: textwrap + :synopsis: Text wrapping and filling +.. moduleauthor:: Greg Ward <gward@python.net> +.. sectionauthor:: Greg Ward <gward@python.net> + + +.. versionadded:: 2.3 + +The :mod:`textwrap` module provides two convenience functions, :func:`wrap` and +:func:`fill`, as well as :class:`TextWrapper`, the class that does all the work, +and a utility function :func:`dedent`. If you're just wrapping or filling one +or two text strings, the convenience functions should be good enough; +otherwise, you should use an instance of :class:`TextWrapper` for efficiency. + + +.. function:: wrap(text[, width[, ...]]) + + Wraps the single paragraph in *text* (a string) so every line is at most *width* + characters long. Returns a list of output lines, without final newlines. + + Optional keyword arguments correspond to the instance attributes of + :class:`TextWrapper`, documented below. *width* defaults to ``70``. + + +.. function:: fill(text[, width[, ...]]) + + Wraps the single paragraph in *text*, and returns a single string containing the + wrapped paragraph. :func:`fill` is shorthand for :: + + "\n".join(wrap(text, ...)) + + In particular, :func:`fill` accepts exactly the same keyword arguments as + :func:`wrap`. + +Both :func:`wrap` and :func:`fill` work by creating a :class:`TextWrapper` +instance and calling a single method on it. That instance is not reused, so for +applications that wrap/fill many text strings, it will be more efficient for you +to create your own :class:`TextWrapper` object. + +An additional utility function, :func:`dedent`, is provided to remove +indentation from strings that have unwanted whitespace to the left of the text. + + +.. 
function:: dedent(text) + + Remove any common leading whitespace from every line in *text*. + + This can be used to make triple-quoted strings line up with the left edge of the + display, while still presenting them in the source code in indented form. + + Note that tabs and spaces are both treated as whitespace, but they are not + equal: the lines ``" hello"`` and ``"\thello"`` are considered to have no + common leading whitespace. (This behaviour is new in Python 2.5; older versions + of this module incorrectly expanded tabs before searching for common leading + whitespace.) + + For example:: + + def test(): + # end first line with \ to avoid the empty line! + s = '''\ + hello + world + ''' + print repr(s) # prints ' hello\n world\n ' + print repr(dedent(s)) # prints 'hello\n world\n' + + +.. class:: TextWrapper(...) + + The :class:`TextWrapper` constructor accepts a number of optional keyword + arguments. Each argument corresponds to one instance attribute, so for example + :: + + wrapper = TextWrapper(initial_indent="* ") + + is the same as :: + + wrapper = TextWrapper() + wrapper.initial_indent = "* " + + You can re-use the same :class:`TextWrapper` object many times, and you can + change any of its options through direct assignment to instance attributes + between uses. + +The :class:`TextWrapper` instance attributes (and keyword arguments to the +constructor) are as follows: + + +.. attribute:: TextWrapper.width + + (default: ``70``) The maximum length of wrapped lines. As long as there are no + individual words in the input text longer than :attr:`width`, + :class:`TextWrapper` guarantees that no output line will be longer than + :attr:`width` characters. + + +.. attribute:: TextWrapper.expand_tabs + + (default: ``True``) If true, then all tab characters in *text* will be expanded + to spaces using the :meth:`expandtabs` method of *text*. + + +.. 
attribute:: TextWrapper.replace_whitespace + + (default: ``True``) If true, each whitespace character (as defined by + ``string.whitespace``) remaining after tab expansion will be replaced by a + single space. + + .. note:: + + If :attr:`expand_tabs` is false and :attr:`replace_whitespace` is true, each tab + character will be replaced by a single space, which is *not* the same as tab + expansion. + + +.. attribute:: TextWrapper.drop_whitespace + + (default: ``True``) If true, whitespace that, after wrapping, happens to end up + at the beginning or end of a line is dropped (leading whitespace in the first + line is always preserved, though). + + .. versionadded:: 2.6 + Whitespace was always dropped in earlier versions. + + +.. attribute:: TextWrapper.initial_indent + + (default: ``''``) String that will be prepended to the first line of wrapped + output. Counts towards the length of the first line. + + +.. attribute:: TextWrapper.subsequent_indent + + (default: ``''``) String that will be prepended to all lines of wrapped output + except the first. Counts towards the length of each line except the first. + + +.. attribute:: TextWrapper.fix_sentence_endings + + (default: ``False``) If true, :class:`TextWrapper` attempts to detect sentence + endings and ensure that sentences are always separated by exactly two spaces. + This is generally desired for text in a monospaced font. However, the sentence + detection algorithm is imperfect: it assumes that a sentence ending consists of + a lowercase letter followed by one of ``'.'``, ``'!'``, or ``'?'``, possibly + followed by one of ``'"'`` or ``"'"``, followed by a space. One problem with + this algorithm is that it is unable to detect the difference between "Dr." in + :: + + [...] Dr. Frankenstein's monster [...] + + and "Spot." in :: + + [...] See Spot. See Spot run [...] + + :attr:`fix_sentence_endings` is false by default. 
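The limitation described above is easy to reproduce. This is a minimal sketch for a current Python 3 interpreter, using the two sample phrases from the text; the variable names are illustrative.

```python
import textwrap

wrapper = textwrap.TextWrapper(width=70, fix_sentence_endings=True)

# A real sentence boundary: the single space after "Spot." becomes two.
sentence = wrapper.fill("See Spot. See Spot run")

# An abbreviation: "Dr." is also a lowercase letter followed by a period,
# so it is treated as a sentence ending too.
abbreviation = wrapper.fill("Dr. Frankenstein's monster")

print(repr(sentence))
print(repr(abbreviation))
```

Both results contain two spaces after the period, which is correct for "Spot." and unwanted for "Dr.".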
+ + Since the sentence detection algorithm relies on ``string.lowercase`` for the + definition of "lowercase letter," and a convention of using two spaces after + a period to separate sentences on the same line, it is specific to + English-language texts. + + +.. attribute:: TextWrapper.break_long_words + + (default: ``True``) If true, then words longer than :attr:`width` will be broken + in order to ensure that no lines are longer than :attr:`width`. If it is false, + long words will not be broken, and some lines may be longer than :attr:`width`. + (Long words will be put on a line by themselves, in order to minimize the amount + by which :attr:`width` is exceeded.) + +:class:`TextWrapper` also provides two public methods, analogous to the +module-level convenience functions: + + +.. method:: TextWrapper.wrap(text) + + Wraps the single paragraph in *text* (a string) so every line is at most + :attr:`width` characters long. All wrapping options are taken from instance + attributes of the :class:`TextWrapper` instance. Returns a list of output lines, + without final newlines. + + +.. method:: TextWrapper.fill(text) + + Wraps the single paragraph in *text*, and returns a single string containing the + wrapped paragraph. + diff --git a/Doc/library/thread.rst b/Doc/library/thread.rst new file mode 100644 index 0000000..c9be598 --- /dev/null +++ b/Doc/library/thread.rst @@ -0,0 +1,171 @@ + +:mod:`thread` --- Multiple threads of control +============================================= + +.. module:: thread + :synopsis: Create multiple threads of control within one interpreter. + + +.. index:: + single: light-weight processes + single: processes, light-weight + single: binary semaphores + single: semaphores, binary + +This module provides low-level primitives for working with multiple threads +(a.k.a. :dfn:`light-weight processes` or :dfn:`tasks`) --- multiple threads of +control sharing their global data space. For synchronization, simple locks +(a.k.a. 
:dfn:`mutexes` or :dfn:`binary semaphores`) are provided. + +.. index:: + single: pthreads + pair: threads; POSIX + +The module is optional. It is supported on Windows, Linux, SGI IRIX, Solaris +2.x, as well as on systems that have a POSIX thread (a.k.a. "pthread") +implementation. For systems lacking the :mod:`thread` module, the +:mod:`dummy_thread` module is available. It duplicates this module's interface +and can be used as a drop-in replacement. + +It defines the following constant and functions: + + +.. exception:: error + + Raised on thread-specific errors. + + +.. data:: LockType + + This is the type of lock objects. + + +.. function:: start_new_thread(function, args[, kwargs]) + + Start a new thread and return its identifier. The thread executes the function + *function* with the argument list *args* (which must be a tuple). The optional + *kwargs* argument specifies a dictionary of keyword arguments. When the function + returns, the thread silently exits. When the function terminates with an + unhandled exception, a stack trace is printed and then the thread exits (but + other threads continue to run). + + +.. function:: interrupt_main() + + Raise a :exc:`KeyboardInterrupt` exception in the main thread. A subthread can + use this function to interrupt the main thread. + + .. versionadded:: 2.3 + + +.. function:: exit() + + Raise the :exc:`SystemExit` exception. When not caught, this will cause the + thread to exit silently. + +.. % \begin{funcdesc}{exit_prog}{status} +.. % Exit all threads and report the value of the integer argument +.. % \var{status} as the exit status of the entire program. +.. % \strong{Caveat:} code in pending \keyword{finally} clauses, in this thread +.. % or in other threads, is not executed. +.. % \end{funcdesc} + + +.. function:: allocate_lock() + + Return a new lock object. Methods of locks are described below. The lock is + initially unlocked. + + +.. 
function:: get_ident() + + Return the 'thread identifier' of the current thread. This is a nonzero + integer. Its value has no direct meaning; it is intended as a magic cookie to + be used e.g. to index a dictionary of thread-specific data. Thread identifiers + may be recycled when a thread exits and another thread is created. + + +.. function:: stack_size([size]) + + Return the thread stack size used when creating new threads. The optional + *size* argument specifies the stack size to be used for subsequently created + threads, and must be 0 (use platform or configured default) or a positive + integer value of at least 32,768 (32kB). If changing the thread stack size is + unsupported, the :exc:`error` exception is raised. If the specified stack size is + invalid, a :exc:`ValueError` is raised and the stack size is unmodified. 32kB + is currently the minimum supported stack size value to guarantee sufficient + stack space for the interpreter itself. Note that some platforms may have + particular restrictions on values for the stack size, such as requiring a + minimum stack size > 32kB or requiring allocation in multiples of the system + memory page size; consult the platform documentation for more + information (4kB pages are common; using multiples of 4096 for the stack size is + the suggested approach in the absence of more specific information). + Availability: Windows, systems with POSIX threads. + + .. versionadded:: 2.5 + +Lock objects have the following methods: + + +.. method:: lock.acquire([waitflag]) + + Without the optional argument, this method acquires the lock unconditionally, if + necessary waiting until it is released by another thread (only one thread at a + time can acquire a lock --- that's their reason for existence).
If the integer + *waitflag* argument is present, the action depends on its value: if it is zero, + the lock is only acquired if it can be acquired immediately without waiting, + while if it is nonzero, the lock is acquired unconditionally as before. The + return value is ``True`` if the lock is acquired successfully, ``False`` if not. + + +.. method:: lock.release() + + Releases the lock. The lock must have been acquired earlier, but not + necessarily by the same thread. + + +.. method:: lock.locked() + + Return the status of the lock: ``True`` if it has been acquired by some thread, + ``False`` if not. + +In addition to these methods, lock objects can also be used via the +:keyword:`with` statement, e.g.:: + + from __future__ import with_statement + import thread + + a_lock = thread.allocate_lock() + + with a_lock: + print "a_lock is locked while this executes" + +**Caveats:** + + .. index:: module: signal + +* Threads interact strangely with interrupts: the :exc:`KeyboardInterrupt` + exception will be received by an arbitrary thread. (When the :mod:`signal` + module is available, interrupts always go to the main thread.) + +* Calling :func:`sys.exit` or raising the :exc:`SystemExit` exception is + equivalent to calling :func:`exit`. + +* Not all built-in functions that may block waiting for I/O allow other threads + to run. (The most popular ones (:func:`time.sleep`, :meth:`file.read`, + :func:`select.select`) work as expected.) + +* It is not possible to interrupt the :meth:`acquire` method on a lock --- the + :exc:`KeyboardInterrupt` exception will happen after the lock has been acquired. + + .. index:: pair: threads; IRIX + +* When the main thread exits, it is system defined whether the other threads + survive. On SGI IRIX using the native thread implementation, they survive. On + most other systems, they are killed without executing :keyword:`try` ... + :keyword:`finally` clauses or executing object destructors. 
+ +* When the main thread exits, it does not do any of its usual cleanup (except + that :keyword:`try` ... :keyword:`finally` clauses are honored), and the + standard I/O files are not flushed. + diff --git a/Doc/library/threading.rst b/Doc/library/threading.rst new file mode 100644 index 0000000..92ce02a --- /dev/null +++ b/Doc/library/threading.rst @@ -0,0 +1,732 @@ + +:mod:`threading` --- Higher-level threading interface +===================================================== + +.. module:: threading + :synopsis: Higher-level threading interface. + + +This module constructs higher-level threading interfaces on top of the lower +level :mod:`thread` module. + +The :mod:`dummy_threading` module is provided for situations where +:mod:`threading` cannot be used because :mod:`thread` is missing. + +This module defines the following functions and objects: + + +.. function:: activeCount() + + Return the number of :class:`Thread` objects currently alive. The returned + count is equal to the length of the list returned by :func:`enumerate`. + + +.. function:: Condition() + :noindex: + + A factory function that returns a new condition variable object. A condition + variable allows one or more threads to wait until they are notified by another + thread. + + +.. function:: currentThread() + + Return the current :class:`Thread` object, corresponding to the caller's thread + of control. If the caller's thread of control was not created through the + :mod:`threading` module, a dummy thread object with limited functionality is + returned. + + +.. function:: enumerate() + + Return a list of all :class:`Thread` objects currently alive. The list includes + daemonic threads, dummy thread objects created by :func:`currentThread`, and the + main thread. It excludes terminated threads and threads that have not yet been + started. + + +.. function:: Event() + :noindex: + + A factory function that returns a new event object. 
An event manages a flag + that can be set to true with the :meth:`set` method and reset to false with the + :meth:`clear` method. The :meth:`wait` method blocks until the flag is true. + + +.. class:: local + + A class that represents thread-local data. Thread-local data are data whose + values are thread specific. To manage thread-local data, just create an + instance of :class:`local` (or a subclass) and store attributes on it:: + + mydata = threading.local() + mydata.x = 1 + + The instance's values will be different for separate threads. + + For more details and extensive examples, see the documentation string of the + :mod:`_threading_local` module. + + .. versionadded:: 2.4 + + +.. function:: Lock() + + A factory function that returns a new primitive lock object. Once a thread has + acquired it, subsequent attempts to acquire it block, until it is released; any + thread may release it. + + +.. function:: RLock() + + A factory function that returns a new reentrant lock object. A reentrant lock + must be released by the thread that acquired it. Once a thread has acquired a + reentrant lock, the same thread may acquire it again without blocking; the + thread must release it once for each time it has acquired it. + + +.. function:: Semaphore([value]) + :noindex: + + A factory function that returns a new semaphore object. A semaphore manages a + counter representing the number of :meth:`release` calls minus the number of + :meth:`acquire` calls, plus an initial value. The :meth:`acquire` method blocks + if necessary until it can return without making the counter negative. If not + given, *value* defaults to 1. + + +.. function:: BoundedSemaphore([value]) + + A factory function that returns a new bounded semaphore object. A bounded + semaphore checks to make sure its current value doesn't exceed its initial + value. If it does, :exc:`ValueError` is raised. In most situations semaphores + are used to guard resources with limited capacity. 
If the semaphore is released + too many times it's a sign of a bug. If not given, *value* defaults to 1. + + +.. class:: Thread + + A class that represents a thread of control. This class can be safely + subclassed in a limited fashion. + + +.. class:: Timer + + A thread that executes a function after a specified interval has passed. + + +.. function:: settrace(func) + + .. index:: single: trace function + + Set a trace function for all threads started from the :mod:`threading` module. + The *func* will be passed to :func:`sys.settrace` for each thread, before its + :meth:`run` method is called. + + .. versionadded:: 2.3 + + +.. function:: setprofile(func) + + .. index:: single: profile function + + Set a profile function for all threads started from the :mod:`threading` module. + The *func* will be passed to :func:`sys.setprofile` for each thread, before its + :meth:`run` method is called. + + .. versionadded:: 2.3 + + +.. function:: stack_size([size]) + + Return the thread stack size used when creating new threads. The optional + *size* argument specifies the stack size to be used for subsequently created + threads, and must be 0 (use platform or configured default) or a positive + integer value of at least 32,768 (32kB). If changing the thread stack size is + unsupported, a :exc:`ThreadError` is raised. If the specified stack size is + invalid, a :exc:`ValueError` is raised and the stack size is unmodified. 32kB + is currently the minimum supported stack size value to guarantee sufficient + stack space for the interpreter itself. Note that some platforms may have + particular restrictions on values for the stack size, such as requiring a + minimum stack size > 32kB or requiring allocation in multiples of the system + memory page size - platform documentation should be referred to for more + information (4kB pages are common; using multiples of 4096 for the stack size is + the suggested approach in the absence of more specific information). 
+ Availability: Windows, systems with POSIX threads. + + .. versionadded:: 2.5 + +Detailed interfaces for the objects are documented below. + +The design of this module is loosely based on Java's threading model. However, +where Java makes locks and condition variables basic behavior of every object, +they are separate objects in Python. Python's :class:`Thread` class supports a +subset of the behavior of Java's Thread class; currently, there are no +priorities, no thread groups, and threads cannot be destroyed, stopped, +suspended, resumed, or interrupted. The static methods of Java's Thread class, +when implemented, are mapped to module-level functions. + +All of the methods described below are executed atomically. + + +.. _lock-objects: + +Lock Objects +------------ + +A primitive lock is a synchronization primitive that is not owned by a +particular thread when locked. In Python, it is currently the lowest level +synchronization primitive available, implemented directly by the :mod:`thread` +extension module. + +A primitive lock is in one of two states, "locked" or "unlocked". It is created +in the unlocked state. It has two basic methods, :meth:`acquire` and +:meth:`release`. When the state is unlocked, :meth:`acquire` changes the state +to locked and returns immediately. When the state is locked, :meth:`acquire` +blocks until a call to :meth:`release` in another thread changes it to unlocked, +then the :meth:`acquire` call resets it to locked and returns. The +:meth:`release` method should only be called in the locked state; it changes the +state to unlocked and returns immediately. If an attempt is made to release an +unlocked lock, a :exc:`RuntimeError` will be raised. + +When more than one thread is blocked in :meth:`acquire` waiting for the state to +turn to unlocked, only one thread proceeds when a :meth:`release` call resets +the state to unlocked; which one of the waiting threads proceeds is not defined, +and may vary across implementations. 
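The state transitions described above can be sketched as follows. This is a minimal illustration rather than part of the original text: the names ``counter`` and ``worker`` are hypothetical, and only ``Lock()``, ``acquire()`` and ``release()`` are taken from this section.

```python
import threading

lock = threading.Lock()      # created in the unlocked state
counter = [0]                # hypothetical shared state guarded by the lock


def worker(iterations):
    for _ in range(iterations):
        lock.acquire()       # blocks while another thread holds the lock
        try:
            counter[0] += 1
        finally:
            lock.release()   # back to unlocked; one waiting thread may proceed


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Non-blocking acquire: true if the lock was free, false if a call
# without arguments would have blocked.
first_try = lock.acquire(False)
second_try = lock.acquire(False)
lock.release()
```

Releasing in a ``finally`` clause keeps the lock from staying locked if the guarded code raises an exception.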
+ +All methods are executed atomically. + + +.. method:: Lock.acquire([blocking=1]) + + Acquire a lock, blocking or non-blocking. + + When invoked without arguments, block until the lock is unlocked, then set it to + locked, and return true. + + When invoked with the *blocking* argument set to true, do the same thing as when + called without arguments, and return true. + + When invoked with the *blocking* argument set to false, do not block. If a call + without an argument would block, return false immediately; otherwise, do the + same thing as when called without arguments, and return true. + + +.. method:: Lock.release() + + Release a lock. + + When the lock is locked, reset it to unlocked, and return. If any other threads + are blocked waiting for the lock to become unlocked, allow exactly one of them + to proceed. + + Do not call this method when the lock is unlocked. + + There is no return value. + + +.. _rlock-objects: + +RLock Objects +------------- + +A reentrant lock is a synchronization primitive that may be acquired multiple +times by the same thread. Internally, it uses the concepts of "owning thread" +and "recursion level" in addition to the locked/unlocked state used by primitive +locks. In the locked state, some thread owns the lock; in the unlocked state, +no thread owns it. + +To lock the lock, a thread calls its :meth:`acquire` method; this returns once +the thread owns the lock. To unlock the lock, a thread calls its +:meth:`release` method. :meth:`acquire`/:meth:`release` call pairs may be +nested; only the final :meth:`release` (the :meth:`release` of the outermost +pair) resets the lock to unlocked and allows another thread blocked in +:meth:`acquire` to proceed. + + +.. method:: RLock.acquire([blocking=1]) + + Acquire a lock, blocking or non-blocking. + + When invoked without arguments: if this thread already owns the lock, increment + the recursion level by one, and return immediately. 
Otherwise, if another + thread owns the lock, block until the lock is unlocked. Once the lock is + unlocked (not owned by any thread), then grab ownership, set the recursion level + to one, and return. If more than one thread is blocked waiting until the lock + is unlocked, only one at a time will be able to grab ownership of the lock. + There is no return value in this case. + + When invoked with the *blocking* argument set to true, do the same thing as when + called without arguments, and return true. + + When invoked with the *blocking* argument set to false, do not block. If a call + without an argument would block, return false immediately; otherwise, do the + same thing as when called without arguments, and return true. + + +.. method:: RLock.release() + + Release a lock, decrementing the recursion level. If after the decrement it is + zero, reset the lock to unlocked (not owned by any thread), and if any other + threads are blocked waiting for the lock to become unlocked, allow exactly one + of them to proceed. If after the decrement the recursion level is still + nonzero, the lock remains locked and owned by the calling thread. + + Only call this method when the calling thread owns the lock. A + :exc:`RuntimeError` is raised if this method is called when the lock is + unlocked. + + There is no return value. + + +.. _condition-objects: + +Condition Objects +----------------- + +A condition variable is always associated with some kind of lock; this can be +passed in or one will be created by default. (Passing one in is useful when +several condition variables must share the same lock.) + +A condition variable has :meth:`acquire` and :meth:`release` methods that call +the corresponding methods of the associated lock. It also has a :meth:`wait` +method, and :meth:`notify` and :meth:`notifyAll` methods. These three must only +be called when the calling thread has acquired the lock, otherwise a +:exc:`RuntimeError` is raised. 
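The locking rule above can be checked directly. The following is a small sketch, assuming the documented behavior that ``notify()`` raises ``RuntimeError`` when the caller does not hold the lock; the variable names are hypothetical.

```python
import threading

cv = threading.Condition()   # no lock passed in, so one is created by default

# Calling notify() without holding the associated lock is an error.
try:
    cv.notify()
    raised = False
except RuntimeError:
    raised = True

# With the lock held, the same call is allowed; it is a no-op here
# because no thread is waiting.
cv.acquire()
try:
    cv.notify()
    notified_with_lock = True
finally:
    cv.release()
```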
+ +The :meth:`wait` method releases the lock, and then blocks until it is awakened +by a :meth:`notify` or :meth:`notifyAll` call for the same condition variable in +another thread. Once awakened, it re-acquires the lock and returns. It is also +possible to specify a timeout. + +The :meth:`notify` method wakes up one of the threads waiting for the condition +variable, if any are waiting. The :meth:`notifyAll` method wakes up all threads +waiting for the condition variable. + +Note: the :meth:`notify` and :meth:`notifyAll` methods don't release the lock; +this means that the thread or threads awakened will not return from their +:meth:`wait` call immediately, but only when the thread that called +:meth:`notify` or :meth:`notifyAll` finally relinquishes ownership of the lock. + +Tip: the typical programming style using condition variables uses the lock to +synchronize access to some shared state; threads that are interested in a +particular change of state call :meth:`wait` repeatedly until they see the +desired state, while threads that modify the state call :meth:`notify` or +:meth:`notifyAll` when they change the state in such a way that it could +possibly be a desired state for one of the waiters. For example, the following +code is a generic producer-consumer situation with unlimited buffer capacity:: + + # Consume one item + cv.acquire() + while not an_item_is_available(): + cv.wait() + get_an_available_item() + cv.release() + + # Produce one item + cv.acquire() + make_an_item_available() + cv.notify() + cv.release() + +To choose between :meth:`notify` and :meth:`notifyAll`, consider whether one +state change can be interesting for only one or several waiting threads. E.g. +in a typical producer-consumer situation, adding one item to the buffer only +needs to wake up one consumer thread. + + +.. 
class:: Condition([lock]) + + If the *lock* argument is given and not ``None``, it must be a :class:`Lock` or + :class:`RLock` object, and it is used as the underlying lock. Otherwise, a new + :class:`RLock` object is created and used as the underlying lock. + + +.. method:: Condition.acquire(*args) + + Acquire the underlying lock. This method calls the corresponding method on the + underlying lock; the return value is whatever that method returns. + + +.. method:: Condition.release() + + Release the underlying lock. This method calls the corresponding method on the + underlying lock; there is no return value. + + +.. method:: Condition.wait([timeout]) + + Wait until notified or until a timeout occurs. If the calling thread has not + acquired the lock when this method is called, a :exc:`RuntimeError` is raised. + + This method releases the underlying lock, and then blocks until it is awakened + by a :meth:`notify` or :meth:`notifyAll` call for the same condition variable in + another thread, or until the optional timeout occurs. Once awakened or timed + out, it re-acquires the lock and returns. + + When the *timeout* argument is present and not ``None``, it should be a floating + point number specifying a timeout for the operation in seconds (or fractions + thereof). + + When the underlying lock is an :class:`RLock`, it is not released using its + :meth:`release` method, since this may not actually unlock the lock when it was + acquired multiple times recursively. Instead, an internal interface of the + :class:`RLock` class is used, which really unlocks it even when it has been + recursively acquired several times. Another internal interface is then used to + restore the recursion level when the lock is reacquired. + + +.. method:: Condition.notify() + + Wake up a thread waiting on this condition, if any. If the calling thread has + not acquired the lock when this method is called, a :exc:`RuntimeError` is + raised.
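As an illustration of how ``wait()`` and ``notify()`` pair up under the shared lock, here is a runnable sketch of a single producer and a single consumer. The names ``items``, ``received``, ``producer`` and ``consumer`` are hypothetical, and the 5-second timeout is an arbitrary choice.

```python
import threading

cv = threading.Condition()
items = []        # hypothetical shared buffer, guarded by cv's lock
received = []


def consumer():
    cv.acquire()
    try:
        while not items:      # re-check the state after every wake-up
            cv.wait(5.0)      # releases the lock while blocked; timeout in seconds
        received.append(items.pop(0))
    finally:
        cv.release()


def producer():
    cv.acquire()
    try:
        items.append("job")
        cv.notify()           # wakes one waiter; the lock is still held here
    finally:
        cv.release()          # only now can the consumer return from wait()


c = threading.Thread(target=consumer)
c.start()
producer()
c.join()
```

The ``while`` loop matters: a waiter may wake up on a timeout, or find that another thread has already consumed the item, so the state is tested again before it is used.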
+ + This method wakes up one of the threads waiting for the condition variable, if + any are waiting; it is a no-op if no threads are waiting. + + The current implementation wakes up exactly one thread, if any are waiting. + However, it's not safe to rely on this behavior. A future, optimized + implementation may occasionally wake up more than one thread. + + Note: the awakened thread does not actually return from its :meth:`wait` call + until it can reacquire the lock. Since :meth:`notify` does not release the + lock, its caller should. + + +.. method:: Condition.notifyAll() + + Wake up all threads waiting on this condition. This method acts like + :meth:`notify`, but wakes up all waiting threads instead of one. If the calling + thread has not acquired the lock when this method is called, a + :exc:`RuntimeError` is raised. + + +.. _semaphore-objects: + +Semaphore Objects +----------------- + +This is one of the oldest synchronization primitives in the history of computer +science, invented by the early Dutch computer scientist Edsger W. Dijkstra (he +used :meth:`P` and :meth:`V` instead of :meth:`acquire` and :meth:`release`). + +A semaphore manages an internal counter which is decremented by each +:meth:`acquire` call and incremented by each :meth:`release` call. The counter +can never go below zero; when :meth:`acquire` finds that it is zero, it blocks, +waiting until some other thread calls :meth:`release`. + + +.. class:: Semaphore([value]) + + The optional argument gives the initial *value* for the internal counter; it + defaults to ``1``. If the *value* given is less than 0, :exc:`ValueError` is + raised. + + +.. method:: Semaphore.acquire([blocking]) + + Acquire a semaphore. + + When invoked without arguments: if the internal counter is larger than zero on + entry, decrement it by one and return immediately. If it is zero on entry, + block, waiting until some other thread has called :meth:`release` to make it + larger than zero. 
This is done with proper interlocking so that if multiple + :meth:`acquire` calls are blocked, :meth:`release` will wake exactly one of them + up. The implementation may pick one at random, so the order in which blocked + threads are awakened should not be relied on. There is no return value in this + case. + + When invoked with *blocking* set to true, do the same thing as when called + without arguments, and return true. + + When invoked with *blocking* set to false, do not block. If a call without an + argument would block, return false immediately; otherwise, do the same thing as + when called without arguments, and return true. + + +.. method:: Semaphore.release() + + Release a semaphore, incrementing the internal counter by one. When it was zero + on entry and another thread is waiting for it to become larger than zero again, + wake up that thread. + + +.. _semaphore-examples: + +:class:`Semaphore` Example +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Semaphores are often used to guard resources with limited capacity, for example, +a database server. In any situation where the size of the resource is +fixed, you should use a bounded semaphore. Before spawning any worker threads, +your main thread would initialize the semaphore:: + + maxconnections = 5 + ... + pool_sema = BoundedSemaphore(value=maxconnections) + +Once spawned, worker threads call the semaphore's acquire and release methods +when they need to connect to the server:: + + pool_sema.acquire() + conn = connectdb() + ... use connection ... + conn.close() + pool_sema.release() + +The use of a bounded semaphore reduces the chance that a programming error which +causes the semaphore to be released more than it's acquired will go undetected. + + +.. _event-objects: + +Event Objects +------------- + +This is one of the simplest mechanisms for communication between threads: one +thread signals an event and other threads wait for it.
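The signal-and-wait pattern can be sketched as below. This is a minimal example, not part of the original text; ``ready``, ``log`` and ``waiter`` are hypothetical names, and only ``set()``, ``clear()`` and ``wait()`` are used.

```python
import threading

ready = threading.Event()    # the internal flag starts out false
log = []


def waiter():
    ready.wait()             # blocks until another thread calls set()
    log.append("go")


t = threading.Thread(target=waiter)
t.start()
ready.set()                  # flag becomes true; all waiting threads wake up
t.join()

ready.wait()                 # flag is already true, so this returns at once
ready.clear()                # flag is false again; later wait() calls block
ready.wait(0.01)             # a timeout (in seconds) bounds the wait
```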
+ +An event object manages an internal flag that can be set to true with the +:meth:`set` method and reset to false with the :meth:`clear` method. The +:meth:`wait` method blocks until the flag is true. + + +.. class:: Event() + + The internal flag is initially false. + + +.. method:: Event.isSet() + + Return true if and only if the internal flag is true. + + +.. method:: Event.set() + + Set the internal flag to true. All threads waiting for it to become true are + awakened. Threads that call :meth:`wait` once the flag is true will not block at + all. + + +.. method:: Event.clear() + + Reset the internal flag to false. Subsequently, threads calling :meth:`wait` + will block until :meth:`set` is called to set the internal flag to true again. + + +.. method:: Event.wait([timeout]) + + Block until the internal flag is true. If the internal flag is true on entry, + return immediately. Otherwise, block until another thread calls :meth:`set` to + set the flag to true, or until the optional timeout occurs. + + When the timeout argument is present and not ``None``, it should be a floating + point number specifying a timeout for the operation in seconds (or fractions + thereof). + + +.. _thread-objects: + +Thread Objects +-------------- + +This class represents an activity that is run in a separate thread of control. +There are two ways to specify the activity: by passing a callable object to the +constructor, or by overriding the :meth:`run` method in a subclass. No other +methods (except for the constructor) should be overridden in a subclass. In +other words, *only* override the :meth:`__init__` and :meth:`run` methods of +this class. + +Once a thread object is created, its activity must be started by calling the +thread's :meth:`start` method. This invokes the :meth:`run` method in a +separate thread of control. + +Once the thread's activity is started, the thread is considered 'alive'. 
It +stops being alive when its :meth:`run` method terminates -- either normally, or +by raising an unhandled exception. The :meth:`isAlive` method tests whether the +thread is alive. + +Other threads can call a thread's :meth:`join` method. This blocks the calling +thread until the thread whose :meth:`join` method is called is terminated. + +A thread has a name. The name can be passed to the constructor, set with the +:meth:`setName` method, and retrieved with the :meth:`getName` method. + +A thread can be flagged as a "daemon thread". The significance of this flag is +that the entire Python program exits when only daemon threads are left. The +initial value is inherited from the creating thread. The flag can be set with +the :meth:`setDaemon` method and retrieved with the :meth:`isDaemon` method. + +There is a "main thread" object; this corresponds to the initial thread of +control in the Python program. It is not a daemon thread. + +There is the possibility that "dummy thread objects" are created. These are +thread objects corresponding to "alien threads", which are threads of control +started outside the threading module, such as directly from C code. Dummy +thread objects have limited functionality; they are always considered alive and +daemonic, and cannot be :meth:`join`\ ed. They are never deleted, since it is +impossible to detect the termination of alien threads. + + +.. class:: Thread(group=None, target=None, name=None, args=(), kwargs={}) + + This constructor should always be called with keyword arguments. Arguments are: + + *group* should be ``None``; reserved for future extension when a + :class:`ThreadGroup` class is implemented. + + *target* is the callable object to be invoked by the :meth:`run` method. + Defaults to ``None``, meaning nothing is called. + + *name* is the thread name. By default, a unique name is constructed of the form + "Thread-*N*" where *N* is a small decimal number. + + *args* is the argument tuple for the target invocation. 
Defaults to ``()``. + + *kwargs* is a dictionary of keyword arguments for the target invocation. + Defaults to ``{}``. + + If the subclass overrides the constructor, it must make sure to invoke the base + class constructor (``Thread.__init__()``) before doing anything else to the + thread. + + +.. method:: Thread.start() + + Start the thread's activity. + + It must be called at most once per thread object. It arranges for the object's + :meth:`run` method to be invoked in a separate thread of control. + + This method will raise a :exc:`RuntimeError` if called more than once on the + same thread object. + + +.. method:: Thread.run() + + Method representing the thread's activity. + + You may override this method in a subclass. The standard :meth:`run` method + invokes the callable object passed to the object's constructor as the *target* + argument, if any, with sequential and keyword arguments taken from the *args* + and *kwargs* arguments, respectively. + + +.. method:: Thread.join([timeout]) + + Wait until the thread terminates. This blocks the calling thread until the + thread whose :meth:`join` method is called terminates -- either normally or + through an unhandled exception -- or until the optional timeout occurs. + + When the *timeout* argument is present and not ``None``, it should be a floating + point number specifying a timeout for the operation in seconds (or fractions + thereof). As :meth:`join` always returns ``None``, you must call + :meth:`isAlive` to decide whether a timeout happened. + + When the *timeout* argument is not present or ``None``, the operation will block + until the thread terminates. + + A thread can be :meth:`join`\ ed many times. + + :meth:`join` may raise a :exc:`RuntimeError` if an attempt is made to join the + current thread, as that would cause a deadlock. It is also an error to + :meth:`join` a thread before it has been started, and attempts to do so raise + the same exception. + + +..
method:: Thread.getName() + + Return the thread's name. + + +.. method:: Thread.setName(name) + + Set the thread's name. + + The name is a string used for identification purposes only. It has no semantics. + Multiple threads may be given the same name. The initial name is set by the + constructor. + + +.. method:: Thread.isAlive() + + Return whether the thread is alive. + + Roughly, a thread is alive from the moment the :meth:`start` method returns + until its :meth:`run` method terminates. The module function :func:`enumerate` + returns a list of all alive threads. + + +.. method:: Thread.isDaemon() + + Return the thread's daemon flag. + + +.. method:: Thread.setDaemon(daemonic) + + Set the thread's daemon flag to the Boolean value *daemonic*. This must be + called before :meth:`start` is called, otherwise :exc:`RuntimeError` is raised. + + The initial value is inherited from the creating thread. + + The entire Python program exits when no alive non-daemon threads are left. + + +.. _timer-objects: + +Timer Objects +------------- + +This class represents an action that should be run only after a certain amount +of time has passed --- a timer. :class:`Timer` is a subclass of :class:`Thread` +and as such also functions as an example of creating custom threads. + +Timers are started, as with threads, by calling their :meth:`start` method. The +timer can be stopped (before its action has begun) by calling the :meth:`cancel` +method. The interval the timer will wait before executing its action may not be +exactly the same as the interval specified by the user. + +For example:: + + def hello(): + print "hello, world" + + t = Timer(30.0, hello) + t.start() # after 30 seconds, "hello, world" will be printed + + +.. class:: Timer(interval, function, args=[], kwargs={}) + + Create a timer that will run *function* with arguments *args* and keyword + arguments *kwargs*, after *interval* seconds have passed. + + +.. 
method:: Timer.cancel() + + Stop the timer, and cancel the execution of the timer's action. This will only + work if the timer is still in its waiting stage. + + +.. _with-locks: + +Using locks, conditions, and semaphores in the :keyword:`with` statement +------------------------------------------------------------------------ + +All of the objects provided by this module that have :meth:`acquire` and +:meth:`release` methods can be used as context managers for a :keyword:`with` +statement. The :meth:`acquire` method will be called when the block is entered, +and :meth:`release` will be called when the block is exited. + +Currently, :class:`Lock`, :class:`RLock`, :class:`Condition`, +:class:`Semaphore`, and :class:`BoundedSemaphore` objects may be used as +:keyword:`with` statement context managers. For example:: + + from __future__ import with_statement + import threading + + some_rlock = threading.RLock() + + with some_rlock: + print "some_rlock is locked while this executes" + diff --git a/Doc/library/time.rst b/Doc/library/time.rst new file mode 100644 index 0000000..04c8f66 --- /dev/null +++ b/Doc/library/time.rst @@ -0,0 +1,540 @@ + +:mod:`time` --- Time access and conversions +=========================================== + +.. module:: time + :synopsis: Time access and conversions. + + +This module provides various time-related functions. For related +functionality, see also the :mod:`datetime` and :mod:`calendar` modules. + +Although this module is always available, +not all functions are available on all platforms. Most of the functions +defined in this module call platform C library functions with the same name. It +may sometimes be helpful to consult the platform documentation, because the +semantics of these functions varies among platforms. + +An explanation of some terminology and conventions is in order. + + .. index:: single: epoch + +* The :dfn:`epoch` is the point where the time starts. 
On January 1st of that + year, at 0 hours, the "time since the epoch" is zero. For Unix, the epoch is + 1970. To find out what the epoch is, look at ``gmtime(0)``. + + .. index:: single: Year 2038 + +* The functions in this module do not handle dates and times before the epoch or + far in the future. The cut-off point in the future is determined by the C + library; for Unix, it is typically in 2038. + + .. index:: + single: Year 2000 + single: Y2K + +* **Year 2000 (Y2K) issues**: Python depends on the platform's C library, which + generally doesn't have year 2000 issues, since all dates and times are + represented internally as seconds since the epoch. Functions accepting a + :class:`struct_time` (see below) generally require a 4-digit year. For backward + compatibility, 2-digit years are supported if the module variable + ``accept2dyear`` is a non-zero integer; this variable is initialized to ``1`` + unless the environment variable :envvar:`PYTHONY2K` is set to a non-empty + string, in which case it is initialized to ``0``. Thus, you can set + :envvar:`PYTHONY2K` to a non-empty string in the environment to require 4-digit + years for all year input. When 2-digit years are accepted, they are converted + according to the POSIX or X/Open standard: values 69-99 are mapped to 1969-1999, + and values 0--68 are mapped to 2000--2068. Values 100--1899 are always illegal. + Note that this is new as of Python 1.5.2(a2); earlier versions, up to Python + 1.5.1 and 1.5.2a1, would add 1900 to year values below 1900. + + .. index:: + single: UTC + single: Coordinated Universal Time + single: Greenwich Mean Time + +* UTC is Coordinated Universal Time (formerly known as Greenwich Mean Time, or + GMT). The acronym UTC is not a mistake but a compromise between English and + French. + + .. index:: single: Daylight Saving Time + +* DST is Daylight Saving Time, an adjustment of the timezone by (usually) one + hour during part of the year. 
DST rules are magic (determined by local law) and + can change from year to year. The C library has a table containing the local + rules (often it is read from a system file for flexibility) and is the only + source of True Wisdom in this respect. + +* The precision of the various real-time functions may be less than suggested by + the units in which their value or argument is expressed. E.g. on most Unix + systems, the clock "ticks" only 50 or 100 times a second, and on the Mac, times + are only accurate to whole seconds. + +* On the other hand, the precision of :func:`time` and :func:`sleep` is better + than their Unix equivalents: times are expressed as floating point numbers, + :func:`time` returns the most accurate time available (using Unix + :cfunc:`gettimeofday` where available), and :func:`sleep` will accept a time + with a nonzero fraction (Unix :cfunc:`select` is used to implement this, where + available). + +* The time value as returned by :func:`gmtime`, :func:`localtime`, and + :func:`strptime`, and accepted by :func:`asctime`, :func:`mktime` and + :func:`strftime`, is a sequence of 9 integers. The return values of + :func:`gmtime`, :func:`localtime`, and :func:`strptime` also offer attribute + names for individual fields. 
+ + +-------+------------------+------------------------------+ + | Index | Attribute | Values | + +=======+==================+==============================+ + | 0 | :attr:`tm_year` | (for example, 1993) | + +-------+------------------+------------------------------+ + | 1 | :attr:`tm_mon` | range [1,12] | + +-------+------------------+------------------------------+ + | 2 | :attr:`tm_mday` | range [1,31] | + +-------+------------------+------------------------------+ + | 3 | :attr:`tm_hour` | range [0,23] | + +-------+------------------+------------------------------+ + | 4 | :attr:`tm_min` | range [0,59] | + +-------+------------------+------------------------------+ + | 5 | :attr:`tm_sec` | range [0,61]; see **(1)** in | + | | | :func:`strftime` description | + +-------+------------------+------------------------------+ + | 6 | :attr:`tm_wday` | range [0,6], Monday is 0 | + +-------+------------------+------------------------------+ + | 7 | :attr:`tm_yday` | range [1,366] | + +-------+------------------+------------------------------+ + | 8 | :attr:`tm_isdst` | 0, 1 or -1; see below | + +-------+------------------+------------------------------+ + + Note that unlike the C structure, the month value is a range of 1-12, not 0-11. + A year value will be handled as described under "Year 2000 (Y2K) issues" above. + A ``-1`` argument as the daylight savings flag, passed to :func:`mktime` will + usually result in the correct daylight savings state to be filled in. + + When a tuple with an incorrect length is passed to a function expecting a + :class:`struct_time`, or having elements of the wrong type, a :exc:`TypeError` + is raised. + + .. versionchanged:: 2.2 + The time value sequence was changed from a tuple to a :class:`struct_time`, with + the addition of attribute names for the fields. + +The module defines the following functions and data items: + + +.. data:: accept2dyear + + Boolean value indicating whether two-digit year values will be accepted. 
This + is true by default, but will be set to false if the environment variable + :envvar:`PYTHONY2K` has been set to a non-empty string. It may also be modified + at run time. + + +.. data:: altzone + + The offset of the local DST timezone, in seconds west of UTC, if one is defined. + This is negative if the local DST timezone is east of UTC (as in Western Europe, + including the UK). Only use this if ``daylight`` is nonzero. + + +.. function:: asctime([t]) + + Convert a tuple or :class:`struct_time` representing a time as returned by + :func:`gmtime` or :func:`localtime` to a 24-character string of the following + form: ``'Sun Jun 20 23:21:05 1993'``. If *t* is not provided, the current time + as returned by :func:`localtime` is used. Locale information is not used by + :func:`asctime`. + + .. note:: + + Unlike the C function of the same name, there is no trailing newline. + + .. versionchanged:: 2.1 + Allowed *t* to be omitted. + + +.. function:: clock() + + .. index:: + single: CPU time + single: processor time + single: benchmarking + + On Unix, return the current processor time as a floating point number expressed + in seconds. The precision, and in fact the very definition of the meaning of + "processor time", depends on that of the C function of the same name, but in any + case, this is the function to use for benchmarking Python or timing algorithms. + + On Windows, this function returns wall-clock seconds elapsed since the first + call to this function, as a floating point number, based on the Win32 function + :cfunc:`QueryPerformanceCounter`. The resolution is typically better than one + microsecond. + + +.. function:: ctime([secs]) + + Convert a time expressed in seconds since the epoch to a string representing + local time. If *secs* is not provided or :const:`None`, the current time as + returned by :func:`time` is used. ``ctime(secs)`` is equivalent to + ``asctime(localtime(secs))``. Locale information is not used by :func:`ctime`. + + .. 
versionchanged:: 2.1 + Allowed *secs* to be omitted. + + .. versionchanged:: 2.4 + If *secs* is :const:`None`, the current time is used. + + +.. data:: daylight + + Nonzero if a DST timezone is defined. + + +.. function:: gmtime([secs]) + + Convert a time expressed in seconds since the epoch to a :class:`struct_time` in + UTC in which the dst flag is always zero. If *secs* is not provided or + :const:`None`, the current time as returned by :func:`time` is used. Fractions + of a second are ignored. See above for a description of the + :class:`struct_time` object. See :func:`calendar.timegm` for the inverse of this + function. + + .. versionchanged:: 2.1 + Allowed *secs* to be omitted. + + .. versionchanged:: 2.4 + If *secs* is :const:`None`, the current time is used. + + +.. function:: localtime([secs]) + + Like :func:`gmtime` but converts to local time. If *secs* is not provided or + :const:`None`, the current time as returned by :func:`time` is used. The dst + flag is set to ``1`` when DST applies to the given time. + + .. versionchanged:: 2.1 + Allowed *secs* to be omitted. + + .. versionchanged:: 2.4 + If *secs* is :const:`None`, the current time is used. + + +.. function:: mktime(t) + + This is the inverse function of :func:`localtime`. Its argument is the + :class:`struct_time` or full 9-tuple (since the dst flag is needed; use ``-1`` + as the dst flag if it is unknown) which expresses the time in *local* time, not + UTC. It returns a floating point number, for compatibility with :func:`time`. + If the input value cannot be represented as a valid time, either + :exc:`OverflowError` or :exc:`ValueError` will be raised (which depends on + whether the invalid value is caught by Python or the underlying C libraries). + The earliest date for which it can generate a time is platform-dependent. + + +.. function:: sleep(secs) + + Suspend execution for the given number of seconds. The argument may be a + floating point number to indicate a more precise sleep time. 
The actual + suspension time may be less than that requested because any caught signal will + terminate the :func:`sleep` following execution of that signal's catching + routine. Also, the suspension time may be longer than requested by an arbitrary + amount because of the scheduling of other activity in the system. + + +.. function:: strftime(format[, t]) + + Convert a tuple or :class:`struct_time` representing a time as returned by + :func:`gmtime` or :func:`localtime` to a string as specified by the *format* + argument. If *t* is not provided, the current time as returned by + :func:`localtime` is used. *format* must be a string. :exc:`ValueError` is + raised if any field in *t* is outside of the allowed range. + + .. versionchanged:: 2.1 + Allowed *t* to be omitted. + + .. versionchanged:: 2.4 + :exc:`ValueError` raised if a field in *t* is out of range. + + .. versionchanged:: 2.5 + 0 is now a legal argument for any position in the time tuple; if it is normally + illegal the value is forced to a correct one. + + The following directives can be embedded in the *format* string. They are shown + without the optional field width and precision specification, and are replaced + by the indicated characters in the :func:`strftime` result: + + +-----------+--------------------------------+-------+ + | Directive | Meaning | Notes | + +===========+================================+=======+ + | ``%a`` | Locale's abbreviated weekday | | + | | name. | | + +-----------+--------------------------------+-------+ + | ``%A`` | Locale's full weekday name. | | + +-----------+--------------------------------+-------+ + | ``%b`` | Locale's abbreviated month | | + | | name. | | + +-----------+--------------------------------+-------+ + | ``%B`` | Locale's full month name. | | + +-----------+--------------------------------+-------+ + | ``%c`` | Locale's appropriate date and | | + | | time representation. 
| | + +-----------+--------------------------------+-------+ + | ``%d`` | Day of the month as a decimal | | + | | number [01,31]. | | + +-----------+--------------------------------+-------+ + | ``%H`` | Hour (24-hour clock) as a | | + | | decimal number [00,23]. | | + +-----------+--------------------------------+-------+ + | ``%I`` | Hour (12-hour clock) as a | | + | | decimal number [01,12]. | | + +-----------+--------------------------------+-------+ + | ``%j`` | Day of the year as a decimal | | + | | number [001,366]. | | + +-----------+--------------------------------+-------+ + | ``%m`` | Month as a decimal number | | + | | [01,12]. | | + +-----------+--------------------------------+-------+ + | ``%M`` | Minute as a decimal number | | + | | [00,59]. | | + +-----------+--------------------------------+-------+ + | ``%p`` | Locale's equivalent of either | \(1) | + | | AM or PM. | | + +-----------+--------------------------------+-------+ + | ``%S`` | Second as a decimal number | \(2) | + | | [00,61]. | | + +-----------+--------------------------------+-------+ + | ``%U`` | Week number of the year | \(3) | + | | (Sunday as the first day of | | + | | the week) as a decimal number | | + | | [00,53]. All days in a new | | + | | year preceding the first | | + | | Sunday are considered to be in | | + | | week 0. | | + +-----------+--------------------------------+-------+ + | ``%w`` | Weekday as a decimal number | | + | | [0(Sunday),6]. | | + +-----------+--------------------------------+-------+ + | ``%W`` | Week number of the year | \(3) | + | | (Monday as the first day of | | + | | the week) as a decimal number | | + | | [00,53]. All days in a new | | + | | year preceding the first | | + | | Monday are considered to be in | | + | | week 0. | | + +-----------+--------------------------------+-------+ + | ``%x`` | Locale's appropriate date | | + | | representation. 
| | + +-----------+--------------------------------+-------+ + | ``%X`` | Locale's appropriate time | | + | | representation. | | + +-----------+--------------------------------+-------+ + | ``%y`` | Year without century as a | | + | | decimal number [00,99]. | | + +-----------+--------------------------------+-------+ + | ``%Y`` | Year with century as a decimal | | + | | number. | | + +-----------+--------------------------------+-------+ + | ``%Z`` | Time zone name (no characters | | + | | if no time zone exists). | | + +-----------+--------------------------------+-------+ + | ``%%`` | A literal ``'%'`` character. | | + +-----------+--------------------------------+-------+ + + Notes: + + (1) + When used with the :func:`strptime` function, the ``%p`` directive only affects + the output hour field if the ``%I`` directive is used to parse the hour. + + (2) + The range really is ``0`` to ``61``; this accounts for leap seconds and the + (very rare) double leap seconds. + + (3) + When used with the :func:`strptime` function, ``%U`` and ``%W`` are only used in + calculations when the day of the week and the year are specified. + + Here is an example, a format for dates compatible with that specified in the + :rfc:`2822` Internet email standard. [#]_ :: + + >>> from time import gmtime, strftime + >>> strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime()) + 'Thu, 28 Jun 2001 14:17:15 +0000' + + Additional directives may be supported on certain platforms, but only the ones + listed here have a meaning standardized by ANSI C. + + On some platforms, an optional field width and precision specification can + immediately follow the initial ``'%'`` of a directive in the following order; + this is also not portable. The field width is normally 2 except for ``%j`` where + it is 3. + + +.. function:: strptime(string[, format]) + + Parse a string representing a time according to a format. The return value is + a :class:`struct_time` as returned by :func:`gmtime` or :func:`localtime`. 
+ + The *format* parameter uses the same directives as those used by + :func:`strftime`; it defaults to ``"%a %b %d %H:%M:%S %Y"`` which matches the + formatting returned by :func:`ctime`. If *string* cannot be parsed according to + *format*, or if it has excess data after parsing, :exc:`ValueError` is raised. + The default values used to fill in any missing data when more accurate values + cannot be inferred are ``(1900, 1, 1, 0, 0, 0, 0, 1, -1)``. + + For example:: + + >>> import time + >>> time.strptime("30 Nov 00", "%d %b %y") + (2000, 11, 30, 0, 0, 0, 3, 335, -1) + + Support for the ``%Z`` directive is based on the values contained in ``tzname`` + and whether ``daylight`` is true. Because of this, it is platform-specific + except for recognizing UTC and GMT which are always known (and are considered to + be non-daylight savings timezones). + + Only the directives specified in the documentation are supported. Because + ``strftime()`` is implemented per platform it can sometimes offer more + directives than those listed. But ``strptime()`` is independent of any platform + and thus does not necessarily support all directives available that are not + documented as supported. + + +.. data:: struct_time + + The type of the time value sequence returned by :func:`gmtime`, + :func:`localtime`, and :func:`strptime`. + + .. versionadded:: 2.2 + + +.. function:: time() + + Return the time as a floating point number expressed in seconds since the epoch, + in UTC. Note that even though the time is always returned as a floating point + number, not all systems provide time with a better precision than 1 second. + While this function normally returns non-decreasing values, it can return a + lower value than a previous call if the system clock has been set back between + the two calls. + + +.. data:: timezone + + The offset of the local (non-DST) timezone, in seconds west of UTC (negative in + most of Western Europe, positive in the US, zero in the UK). + + +.. 
data:: tzname + + A tuple of two strings: the first is the name of the local non-DST timezone, the + second is the name of the local DST timezone. If no DST timezone is defined, + the second string should not be used. + + +.. function:: tzset() + + Resets the time conversion rules used by the library routines. The environment + variable :envvar:`TZ` specifies how this is done. + + .. versionadded:: 2.3 + + Availability: Unix. + + .. note:: + + Although in many cases, changing the :envvar:`TZ` environment variable may + affect the output of functions like :func:`localtime` without calling + :func:`tzset`, this behavior should not be relied on. + + The :envvar:`TZ` environment variable should contain no whitespace. + + The standard format of the :envvar:`TZ` environment variable is (whitespace + added for clarity):: + + std offset [dst [offset [,start[/time], end[/time]]]] + + Where the components are: + + ``std`` and ``dst`` + Three or more alphanumerics giving the timezone abbreviations. These will be + propagated into ``time.tzname``. + + ``offset`` + The offset has the form: ``± hh[:mm[:ss]]``. This indicates the value + added to the local time to arrive at UTC. If preceded by a '-', the timezone + is east of the Prime Meridian; otherwise, it is west. If no offset follows + dst, summer time is assumed to be one hour ahead of standard time. + + ``start[/time], end[/time]`` + Indicates when to change to and back from DST. The format of the + start and end dates is one of the following: + + :samp:`J{n}` + The Julian day *n* (1 <= *n* <= 365). Leap days are not counted, so in + all years February 28 is day 59 and March 1 is day 60. + + :samp:`{n}` + The zero-based Julian day (0 <= *n* <= 365). Leap days are counted, and + it is possible to refer to February 29. 
+ + :samp:`M{m}.{n}.{d}` + The *d*'th day (0 <= *d* <= 6) of week *n* of month *m* of the year (1 + <= *n* <= 5, 1 <= *m* <= 12, where week 5 means "the last *d* day in + month *m*" which may occur in either the fourth or the fifth + week). Week 1 is the first week in which the *d*'th day occurs. Day + zero is Sunday. + + ``time`` has the same format as ``offset`` except that no leading sign + ('-' or '+') is allowed. The default, if time is not given, is 02:00:00. + + :: + + >>> os.environ['TZ'] = 'EST+05EDT,M4.1.0,M10.5.0' + >>> time.tzset() + >>> time.strftime('%X %x %Z') + '02:07:36 05/08/03 EDT' + >>> os.environ['TZ'] = 'AEST-10AEDT-11,M10.5.0,M3.5.0' + >>> time.tzset() + >>> time.strftime('%X %x %Z') + '16:08:12 05/08/03 AEST' + + On many Unix systems (including \*BSD, Linux, Solaris, and Darwin), it is more + convenient to use the system's zoneinfo (:manpage:`tzfile(5)`) database to + specify the timezone rules. To do this, set the :envvar:`TZ` environment + variable to the path of the required timezone datafile, relative to the root of + the system's 'zoneinfo' timezone database, usually located at + :file:`/usr/share/zoneinfo`. For example, ``'US/Eastern'``, + ``'Australia/Melbourne'``, ``'Egypt'`` or ``'Europe/Amsterdam'``. :: + + >>> os.environ['TZ'] = 'US/Eastern' + >>> time.tzset() + >>> time.tzname + ('EST', 'EDT') + >>> os.environ['TZ'] = 'Egypt' + >>> time.tzset() + >>> time.tzname + ('EET', 'EEST') + + +.. seealso:: + + Module :mod:`datetime` + More object-oriented interface to dates and times. + + Module :mod:`locale` + Internationalization services. The locale settings can affect the return values + for some of the functions in the :mod:`time` module. + + Module :mod:`calendar` + General calendar-related functions. :func:`timegm` is the inverse of + :func:`gmtime` from this module. + +.. rubric:: Footnotes + +.. 
[#] The use of ``%Z`` is now deprecated, but the ``%z`` escape that expands to the + preferred hour/minute offset is not supported by all ANSI C libraries. Also, a + strict reading of the original 1982 :rfc:`822` standard calls for a two-digit + year (%y rather than %Y), but practice moved to 4-digit years long before the + year 2000. The 4-digit year has been mandated by :rfc:`2822`, which obsoletes + :rfc:`822`. + diff --git a/Doc/library/timeit.rst b/Doc/library/timeit.rst new file mode 100644 index 0000000..8c0cda3 --- /dev/null +++ b/Doc/library/timeit.rst @@ -0,0 +1,243 @@ + +:mod:`timeit` --- Measure execution time of small code snippets +=============================================================== + +.. module:: timeit + :synopsis: Measure the execution time of small code snippets. + + +.. versionadded:: 2.3 + +.. index:: + single: Benchmarking + single: Performance + +This module provides a simple way to time small bits of Python code. It has both +command line as well as callable interfaces. It avoids a number of common traps +for measuring execution times. See also Tim Peters' introduction to the +"Algorithms" chapter in the Python Cookbook, published by O'Reilly. + +The module defines the following public class: + + +.. class:: Timer([stmt='pass' [, setup='pass' [, timer=<timer function>]]]) + + Class for timing execution speed of small code snippets. + + The constructor takes a statement to be timed, an additional statement used for + setup, and a timer function. Both statements default to ``'pass'``; the timer + function is platform-dependent (see the module doc string). The statements may + contain newlines, as long as they don't contain multi-line string literals. + + To measure the execution time of the first statement, use the :meth:`timeit` + method. The :meth:`repeat` method is a convenience to call :meth:`timeit` + multiple times and return a list of results. + + .. 
versionchanged:: 2.6 + The *stmt* and *setup* parameters can now also take objects that are callable + without arguments. This will embed calls to them in a timer function that will + then be executed by :meth:`timeit`. Note that the timing overhead is a little + larger in this case because of the extra function calls. + + +.. method:: Timer.print_exc([file=None]) + + Helper to print a traceback from the timed code. + + Typical use:: + + t = Timer(...) # outside the try/except + try: + t.timeit(...) # or t.repeat(...) + except: + t.print_exc() + + The advantage over the standard traceback is that source lines in the compiled + template will be displayed. The optional *file* argument directs where the + traceback is sent; it defaults to ``sys.stderr``. + + +.. method:: Timer.repeat([repeat=3 [, number=1000000]]) + + Call :meth:`timeit` a few times. + + This is a convenience function that calls :meth:`timeit` repeatedly, + returning a list of results. The first argument specifies how many times to + call :meth:`timeit`. The second argument specifies the *number* argument for + :meth:`timeit`. + + .. note:: + + It's tempting to calculate mean and standard deviation from the result vector + and report these. However, this is not very useful. In a typical case, the + lowest value gives a lower bound for how fast your machine can run the given + code snippet; higher values in the result vector are typically not caused by + variability in Python's speed, but by other processes interfering with your + timing accuracy. So the :func:`min` of the result is probably the only number + you should be interested in. After that, you should look at the entire vector + and apply common sense rather than statistics. + + +.. method:: Timer.timeit([number=1000000]) + + Time *number* executions of the main statement. This executes the setup + statement once, and then returns the time it takes to execute the main statement + a number of times, measured in seconds as a float. 
The argument is the number + of times through the loop, defaulting to one million. The main statement, the + setup statement and the timer function to be used are passed to the constructor. + + .. note:: + + By default, :meth:`timeit` temporarily turns off garbage collection during the + timing. The advantage of this approach is that it makes independent timings + more comparable. The disadvantage is that GC may be an important component of + the performance of the function being measured. If so, GC can be re-enabled as + the first statement in the *setup* string. For example:: + + timeit.Timer('for i in range(10): oct(i)', 'gc.enable()').timeit() + +Starting with version 2.6, the module also defines two convenience functions: + + +.. function:: repeat(stmt[, setup[, timer[, repeat=3 [, number=1000000]]]]) + + Create a :class:`Timer` instance with the given statement, setup code and timer + function and run its :meth:`repeat` method with the given repeat count and + *number* executions. + + .. versionadded:: 2.6 + + +.. function:: timeit(stmt[, setup[, timer[, number=1000000]]]) + + Create a :class:`Timer` instance with the given statement, setup code and timer + function and run its :meth:`timeit` method with *number* executions. + + .. versionadded:: 2.6 + + +Command Line Interface +---------------------- + +When called as a program from the command line, the following form is used:: + + python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-h] [statement ...] 
+ +where the following options are understood: + +-n N/:option:`--number=N` + how many times to execute 'statement' + +-r N/:option:`--repeat=N` + how many times to repeat the timer (default 3) + +-s S/:option:`--setup=S` + statement to be executed once initially (default ``'pass'``) + +-t/:option:`--time` + use :func:`time.time` (default on all platforms but Windows) + +-c/:option:`--clock` + use :func:`time.clock` (default on Windows) + +-v/:option:`--verbose` + print raw timing results; repeat for more digits of precision + +-h/:option:`--help` + print a short usage message and exit + +A multi-line statement may be given by specifying each line as a separate +statement argument; indented lines are possible by enclosing an argument in +quotes and using leading spaces. Multiple :option:`-s` options are treated +similarly. + +If :option:`-n` is not given, a suitable number of loops is calculated by trying +successive powers of 10 until the total time is at least 0.2 seconds. + +The default timer function is platform dependent. On Windows, +:func:`time.clock` has microsecond granularity but :func:`time.time`'s +granularity is 1/60th of a second; on Unix, :func:`time.clock` has 1/100th of a +second granularity and :func:`time.time` is much more precise. On either +platform, the default timer functions measure wall clock time, not the CPU time. +This means that other processes running on the same computer may interfere with +the timing. The best thing to do when accurate timing is necessary is to repeat +the timing a few times and use the best time. The :option:`-r` option is good +for this; the default of 3 repetitions is probably enough in most cases. On +Unix, you can use :func:`time.clock` to measure CPU time. + +.. note:: + + There is a certain baseline overhead associated with executing a pass statement. + The code here doesn't try to hide it, but you should be aware of it. The + baseline overhead can be measured by invoking the program without arguments. 
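The "repeat and take the best time" advice and the baseline measurement can also be tried from the module interface. The following is only a minimal sketch: ``best_time`` is a hypothetical helper, not part of :mod:`timeit`, and the statement ``x = abs(-1)`` and the loop count are arbitrary choices made for illustration.

```python
import timeit


def best_time(stmt, setup="pass", repeat=3, number=100000):
    """Return the best per-loop time for *stmt*, in microseconds.

    Hypothetical helper: it runs Timer.repeat() and keeps the minimum,
    since the lowest value is the one least disturbed by other processes.
    """
    timer = timeit.Timer(stmt, setup)
    results = timer.repeat(repeat=repeat, number=number)
    return min(results) / number * 1e6


# Timing a bare 'pass' gives the baseline overhead of the timing loop.
baseline = best_time("pass")
measured = best_time("x = abs(-1)")

print("baseline: %.3f usec per loop" % baseline)
print("abs(-1):  %.3f usec per loop" % measured)
```

The helper reports the raw figures and leaves interpretation to the reader; the baseline is printed next to the measurement so both can be judged together.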
+ +The baseline overhead differs between Python versions! Also, to fairly compare +older Python versions to Python 2.3, you may want to use Python's :option:`-O` +option for the older versions to avoid timing ``SET_LINENO`` instructions. + + +Examples +-------- + +Here are two example sessions (one using the command line, one using the module +interface) that compare the cost of using :func:`hasattr` vs. +:keyword:`try`/:keyword:`except` to test for missing and present object +attributes. :: + + % timeit.py 'try:' ' str.__bool__' 'except AttributeError:' ' pass' + 100000 loops, best of 3: 15.7 usec per loop + % timeit.py 'if hasattr(str, "__bool__"): pass' + 100000 loops, best of 3: 4.26 usec per loop + % timeit.py 'try:' ' int.__bool__' 'except AttributeError:' ' pass' + 1000000 loops, best of 3: 1.43 usec per loop + % timeit.py 'if hasattr(int, "__bool__"): pass' + 100000 loops, best of 3: 2.23 usec per loop + +:: + + >>> import timeit + >>> s = """\ + ... try: + ... str.__bool__ + ... except AttributeError: + ... pass + ... """ + >>> t = timeit.Timer(stmt=s) + >>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000) + 17.09 usec/pass + >>> s = """\ + ... if hasattr(str, '__bool__'): pass + ... """ + >>> t = timeit.Timer(stmt=s) + >>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000) + 4.85 usec/pass + >>> s = """\ + ... try: + ... int.__bool__ + ... except AttributeError: + ... pass + ... """ + >>> t = timeit.Timer(stmt=s) + >>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000) + 1.97 usec/pass + >>> s = """\ + ... if hasattr(int, '__bool__'): pass + ... 
""" + >>> t = timeit.Timer(stmt=s) + >>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000) + 3.15 usec/pass + +To give the :mod:`timeit` module access to functions you define, you can pass a +``setup`` parameter which contains an import statement:: + + def test(): + "Stupid test function" + L = [] + for i in range(100): + L.append(i) + + if __name__=='__main__': + from timeit import Timer + t = Timer("test()", "from __main__ import test") + print t.timeit() + diff --git a/Doc/library/tix.rst b/Doc/library/tix.rst new file mode 100644 index 0000000..4701c15 --- /dev/null +++ b/Doc/library/tix.rst @@ -0,0 +1,602 @@ +:mod:`Tix` --- Extension widgets for Tk +======================================= + +.. module:: Tix + :synopsis: Tk Extension Widgets for Tkinter +.. sectionauthor:: Mike Clarkson <mikeclarkson@users.sourceforge.net> + + +.. index:: single: Tix + +The :mod:`Tix` (Tk Interface Extension) module provides an additional rich set +of widgets. Although the standard Tk library has many useful widgets, they are +far from complete. The :mod:`Tix` library provides most of the commonly needed +widgets that are missing from standard Tk: :class:`HList`, :class:`ComboBox`, +:class:`Control` (a.k.a. SpinBox) and an assortment of scrollable widgets. +:mod:`Tix` also includes many more widgets that are generally useful in a wide +range of applications: :class:`NoteBook`, :class:`FileEntry`, +:class:`PanedWindow`, etc; there are more than 40 of them. + +With all these new widgets, you can introduce new interaction techniques into +applications, creating more useful and more intuitive user interfaces. You can +design your application by choosing the most appropriate widgets to match the +special needs of your application and users. + + +.. seealso:: + + `Tix Homepage <http://tix.sourceforge.net/>`_ + The home page for :mod:`Tix`. This includes links to additional documentation + and downloads. 
+ + `Tix Man Pages <http://tix.sourceforge.net/dist/current/man/>`_ + On-line version of the man pages and reference material. + + `Tix Programming Guide <http://tix.sourceforge.net/dist/current/docs/tix-book/tix.book.html>`_ + On-line version of the programmer's reference material. + + `Tix Development Applications <http://tix.sourceforge.net/Tide/>`_ + Tix applications for development of Tix and Tkinter programs. Tide applications + work under Tk or Tkinter, and include :program:`TixInspect`, an inspector to + remotely modify and debug Tix/Tk/Tkinter applications. + + +Using Tix +--------- + + +.. class:: Tix(screenName[, baseName[, className]]) + + Toplevel widget of Tix which usually represents the main window of an + application. It has an associated Tcl interpreter. + + Classes in the :mod:`Tix` module subclass the classes in the :mod:`Tkinter` + module. The former imports the latter, so to use :mod:`Tix` with Tkinter, all + you need to do is to import one module. In general, you can just import + :mod:`Tix`, and replace the toplevel call to :class:`Tkinter.Tk` with + :class:`Tix.Tk`:: + + import Tix + from Tkconstants import * + root = Tix.Tk() + +To use :mod:`Tix`, you must have the :mod:`Tix` widgets installed, usually +alongside your installation of the Tk widgets. To test your installation, try +the following:: + + import Tix + root = Tix.Tk() + root.tk.eval('package require Tix') + +If this fails, you have a Tk installation problem which must be resolved before +proceeding. Use the environment variable :envvar:`TIX_LIBRARY` to point to the +installed :mod:`Tix` library directory, and make sure you have the dynamic +object library (:file:`tix8183.dll` or :file:`libtix8183.so`) in the same +directory that contains your Tk dynamic object library (:file:`tk8183.dll` or +:file:`libtk8183.so`). 
The directory with the dynamic object library should also +have a file called :file:`pkgIndex.tcl` (case sensitive), which contains the +line:: + + package ifneeded Tix 8.1 [list load "[file join $dir tix8183.dll]" Tix] + +.. % $ <-- bow to font-lock + + +Tix Widgets +----------- + +`Tix <http://tix.sourceforge.net/dist/current/man/html/TixCmd/TixIntro.htm>`_ +introduces over 40 widget classes to the :mod:`Tkinter` repertoire. There is a +demo of all the :mod:`Tix` widgets in the :file:`Demo/tix` directory of the +standard distribution. + +.. % The Python sample code is still being added to Python, hence commented out + + +Basic Widgets +^^^^^^^^^^^^^ + + +.. class:: Balloon() + + A `Balloon + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixBalloon.htm>`_ that + pops up over a widget to provide help. When the user moves the cursor inside a + widget to which a Balloon widget has been bound, a small pop-up window with a + descriptive message will be shown on the screen. + +.. % Python Demo of: +.. % \ulink{Balloon}{http://tix.sourceforge.net/dist/current/demos/samples/Balloon.tcl} + + +.. class:: ButtonBox() + + The `ButtonBox + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixButtonBox.htm>`_ + widget creates a box of buttons, such as is commonly used for ``Ok Cancel``. + +.. % Python Demo of: +.. % \ulink{ButtonBox}{http://tix.sourceforge.net/dist/current/demos/samples/BtnBox.tcl} + + +.. class:: ComboBox() + + The `ComboBox + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixComboBox.htm>`_ + widget is similar to the combo box control in MS Windows. The user can select a + choice by either typing in the entry subwidget or selecting from the listbox + subwidget. + +.. % Python Demo of: +.. % \ulink{ComboBox}{http://tix.sourceforge.net/dist/current/demos/samples/ComboBox.tcl} + + +.. 
class:: Control() + + The `Control + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixControl.htm>`_ + widget is also known as the :class:`SpinBox` widget. The user can adjust the + value by pressing the two arrow buttons or by entering the value directly into + the entry. The new value will be checked against the user-defined upper and + lower limits. + +.. % Python Demo of: +.. % \ulink{Control}{http://tix.sourceforge.net/dist/current/demos/samples/Control.tcl} + + +.. class:: LabelEntry() + + The `LabelEntry + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixLabelEntry.htm>`_ + widget packages an entry widget and a label into one mega widget. It can be + used to simplify the creation of "entry-form" types of interface. + +.. % Python Demo of: +.. % \ulink{LabelEntry}{http://tix.sourceforge.net/dist/current/demos/samples/LabEntry.tcl} + + +.. class:: LabelFrame() + + The `LabelFrame + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixLabelFrame.htm>`_ + widget packages a frame widget and a label into one mega widget. To create + widgets inside a LabelFrame widget, one creates the new widgets relative to the + :attr:`frame` subwidget and manages them inside the :attr:`frame` subwidget. + +.. % Python Demo of: +.. % \ulink{LabelFrame}{http://tix.sourceforge.net/dist/current/demos/samples/LabFrame.tcl} + + +.. class:: Meter() + + The `Meter + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixMeter.htm>`_ widget + can be used to show the progress of a background job which may take a long time + to execute. + +.. % Python Demo of: +.. % \ulink{Meter}{http://tix.sourceforge.net/dist/current/demos/samples/Meter.tcl} + + +.. class:: OptionMenu() + + The `OptionMenu + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixOptionMenu.htm>`_ + creates a menu button of options. + +.. % Python Demo of: +.. % \ulink{OptionMenu}{http://tix.sourceforge.net/dist/current/demos/samples/OptMenu.tcl} + + +.. 
class:: PopupMenu() + + The `PopupMenu + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixPopupMenu.htm>`_ + widget can be used as a replacement of the ``tk_popup`` command. The advantage + of the :mod:`Tix` :class:`PopupMenu` widget is it requires less application code + to manipulate. + +.. % Python Demo of: +.. % \ulink{PopupMenu}{http://tix.sourceforge.net/dist/current/demos/samples/PopMenu.tcl} + + +.. class:: Select() + + The `Select + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixSelect.htm>`_ widget + is a container of button subwidgets. It can be used to provide radio-box or + check-box style of selection options for the user. + +.. % Python Demo of: +.. % \ulink{Select}{http://tix.sourceforge.net/dist/current/demos/samples/Select.tcl} + + +.. class:: StdButtonBox() + + The `StdButtonBox + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixStdButtonBox.htm>`_ + widget is a group of standard buttons for Motif-like dialog boxes. + +.. % Python Demo of: +.. % \ulink{StdButtonBox}{http://tix.sourceforge.net/dist/current/demos/samples/StdBBox.tcl} + + +File Selectors +^^^^^^^^^^^^^^ + + +.. class:: DirList() + + The `DirList + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixDirList.htm>`_ + widget displays a list view of a directory, its previous directories and its + sub-directories. The user can choose one of the directories displayed in the + list or change to another directory. + +.. % Python Demo of: +.. % \ulink{DirList}{http://tix.sourceforge.net/dist/current/demos/samples/DirList.tcl} + + +.. class:: DirTree() + + The `DirTree + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixDirTree.htm>`_ + widget displays a tree view of a directory, its previous directories and its + sub-directories. The user can choose one of the directories displayed in the + list or change to another directory. + +.. % Python Demo of: +.. 
% \ulink{DirTree}{http://tix.sourceforge.net/dist/current/demos/samples/DirTree.tcl} + + +.. class:: DirSelectDialog() + + The `DirSelectDialog + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixDirSelectDialog.htm>`_ + widget presents the directories in the file system in a dialog window. The user + can use this dialog window to navigate through the file system to select the + desired directory. + +.. % Python Demo of: +.. % \ulink{DirSelectDialog}{http://tix.sourceforge.net/dist/current/demos/samples/DirDlg.tcl} + + +.. class:: DirSelectBox() + + The :class:`DirSelectBox` is similar to the standard Motif(TM) + directory-selection box. It is generally used for the user to choose a + directory. DirSelectBox stores the directories most recently selected into + a ComboBox widget so that they can be quickly selected again. + + +.. class:: ExFileSelectBox() + + The `ExFileSelectBox + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixExFileSelectBox.htm>`_ + widget is usually embedded in a tixExFileSelectDialog widget. It provides a + convenient method for the user to select files. The style of the + :class:`ExFileSelectBox` widget is very similar to the standard file dialog on + MS Windows 3.1. + +.. % Python Demo of: +.. % \ulink{ExFileSelectDialog}{http://tix.sourceforge.net/dist/current/demos/samples/EFileDlg.tcl} + + +.. class:: FileSelectBox() + + The `FileSelectBox + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixFileSelectBox.htm>`_ + is similar to the standard Motif(TM) file-selection box. It is generally used + for the user to choose a file. FileSelectBox stores the files most recently + selected into a :class:`ComboBox` widget so that they can be quickly selected + again. + +.. % Python Demo of: +.. % \ulink{FileSelectDialog}{http://tix.sourceforge.net/dist/current/demos/samples/FileDlg.tcl} + + +.. 
class:: FileEntry() + + The `FileEntry + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixFileEntry.htm>`_ + widget can be used to input a filename. The user can type in the filename + manually. Alternatively, the user can press the button widget that sits next to + the entry, which will bring up a file selection dialog. + +.. % Python Demo of: +.. % \ulink{FileEntry}{http://tix.sourceforge.net/dist/current/demos/samples/FileEnt.tcl} + + +Hierarchical ListBox +^^^^^^^^^^^^^^^^^^^^ + + +.. class:: HList() + + The `HList + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixHList.htm>`_ widget + can be used to display any data that have a hierarchical structure, for example, + file system directory trees. The list entries are indented and connected by + branch lines according to their places in the hierarchy. + +.. % Python Demo of: +.. % \ulink{HList}{http://tix.sourceforge.net/dist/current/demos/samples/HList1.tcl} + + +.. class:: CheckList() + + The `CheckList + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixCheckList.htm>`_ + widget displays a list of items to be selected by the user. CheckList acts + similarly to the Tk checkbutton or radiobutton widgets, except it is capable of + handling many more items than checkbuttons or radiobuttons. + +.. % Python Demo of: +.. % \ulink{ CheckList}{http://tix.sourceforge.net/dist/current/demos/samples/ChkList.tcl} +.. % Python Demo of: +.. % \ulink{ScrolledHList (1)}{http://tix.sourceforge.net/dist/current/demos/samples/SHList.tcl} +.. % Python Demo of: +.. % \ulink{ScrolledHList (2)}{http://tix.sourceforge.net/dist/current/demos/samples/SHList2.tcl} + + +.. class:: Tree() + + The `Tree + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixTree.htm>`_ widget + can be used to display hierarchical data in a tree form. The user can adjust the + view of the tree by opening or closing parts of the tree. + +.. % Python Demo of: +.. 
% \ulink{Tree}{http://tix.sourceforge.net/dist/current/demos/samples/Tree.tcl} +.. % Python Demo of: +.. % \ulink{Tree (Dynamic)}{http://tix.sourceforge.net/dist/current/demos/samples/DynTree.tcl} + + +Tabular ListBox +^^^^^^^^^^^^^^^ + + +.. class:: TList() + + The `TList + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixTList.htm>`_ widget + can be used to display data in a tabular format. The list entries of a + :class:`TList` widget are similar to the entries in the Tk listbox widget. The + main differences are (1) the :class:`TList` widget can display the list entries + in a two dimensional format and (2) you can use graphical images as well as + multiple colors and fonts for the list entries. + +.. % Python Demo of: +.. % \ulink{ScrolledTList (1)}{http://tix.sourceforge.net/dist/current/demos/samples/STList1.tcl} +.. % Python Demo of: +.. % \ulink{ScrolledTList (2)}{http://tix.sourceforge.net/dist/current/demos/samples/STList2.tcl} +.. % Grid has yet to be added to Python +.. % \subsubsection{Grid Widget} +.. % Python Demo of: +.. % \ulink{Simple Grid}{http://tix.sourceforge.net/dist/current/demos/samples/SGrid0.tcl} +.. % Python Demo of: +.. % \ulink{ScrolledGrid}{http://tix.sourceforge.net/dist/current/demos/samples/SGrid1.tcl} +.. % Python Demo of: +.. % \ulink{Editable Grid}{http://tix.sourceforge.net/dist/current/demos/samples/EditGrid.tcl} + + +Manager Widgets +^^^^^^^^^^^^^^^ + + +.. class:: PanedWindow() + + The `PanedWindow + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixPanedWindow.htm>`_ + widget allows the user to interactively manipulate the sizes of several panes. + The panes can be arranged either vertically or horizontally. The user changes + the sizes of the panes by dragging the resize handle between two panes. + +.. % Python Demo of: +.. % \ulink{PanedWindow}{http://tix.sourceforge.net/dist/current/demos/samples/PanedWin.tcl} + + +.. 
class:: ListNoteBook() + + The `ListNoteBook + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixListNoteBook.htm>`_ + widget is very similar to the :class:`TixNoteBook` widget: it can be used to + display many windows in a limited space using a notebook metaphor. The notebook + is divided into a stack of pages (windows). At one time only one of these pages + can be shown. The user can navigate through these pages by choosing the name of + the desired page in the :attr:`hlist` subwidget. + +.. % Python Demo of: +.. % \ulink{ListNoteBook}{http://tix.sourceforge.net/dist/current/demos/samples/ListNBK.tcl} + + +.. class:: NoteBook() + + The `NoteBook + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixNoteBook.htm>`_ + widget can be used to display many windows in a limited space using a notebook + metaphor. The notebook is divided into a stack of pages. At one time only one of + these pages can be shown. The user can navigate through these pages by choosing + the visual "tabs" at the top of the NoteBook widget. + +.. % Python Demo of: +.. % \ulink{NoteBook}{http://tix.sourceforge.net/dist/current/demos/samples/NoteBook.tcl} + +.. % \subsubsection{Scrolled Widgets} +.. % Python Demo of: +.. % \ulink{ScrolledListBox}{http://tix.sourceforge.net/dist/current/demos/samples/SListBox.tcl} +.. % Python Demo of: +.. % \ulink{ScrolledText}{http://tix.sourceforge.net/dist/current/demos/samples/SText.tcl} +.. % Python Demo of: +.. % \ulink{ScrolledWindow}{http://tix.sourceforge.net/dist/current/demos/samples/SWindow.tcl} +.. % Python Demo of: +.. % \ulink{Canvas Object View}{http://tix.sourceforge.net/dist/current/demos/samples/CObjView.tcl} + + +Image Types +^^^^^^^^^^^ + +The :mod:`Tix` module adds: + +* `pixmap <http://tix.sourceforge.net/dist/current/man/html/TixCmd/pixmap.htm>`_ + capabilities to all :mod:`Tix` and :mod:`Tkinter` widgets to create color images + from XPM files. + + .. % Python Demo of: + .. 
% \ulink{XPM Image In Button}{http://tix.sourceforge.net/dist/current/demos/samples/Xpm.tcl} + .. % Python Demo of: + .. % \ulink{XPM Image In Menu}{http://tix.sourceforge.net/dist/current/demos/samples/Xpm1.tcl} + +* `Compound + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/compound.htm>`_ image + types can be used to create images that consist of multiple horizontal lines; + each line is composed of a series of items (texts, bitmaps, images or spaces) + arranged from left to right. For example, a compound image can be used to + display a bitmap and a text string simultaneously in a Tk :class:`Button` + widget. + + .. % Python Demo of: + .. % \ulink{Compound Image In Buttons}{http://tix.sourceforge.net/dist/current/demos/samples/CmpImg.tcl} + .. % Python Demo of: + .. % \ulink{Compound Image In NoteBook}{http://tix.sourceforge.net/dist/current/demos/samples/CmpImg2.tcl} + .. % Python Demo of: + .. % \ulink{Compound Image Notebook Color Tabs}{http://tix.sourceforge.net/dist/current/demos/samples/CmpImg4.tcl} + .. % Python Demo of: + .. % \ulink{Compound Image Icons}{http://tix.sourceforge.net/dist/current/demos/samples/CmpImg3.tcl} + + +Miscellaneous Widgets +^^^^^^^^^^^^^^^^^^^^^ + + +.. class:: InputOnly() + + The `InputOnly + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixInputOnly.htm>`_ + widgets are to accept inputs from the user, which can be done with the ``bind`` + command (Unix only). + + +Form Geometry Manager +^^^^^^^^^^^^^^^^^^^^^ + +In addition, :mod:`Tix` augments :mod:`Tkinter` by providing: + + +.. class:: Form() + + The `Form + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tixForm.htm>`_ geometry + manager based on attachment rules for all Tk widgets. + +.. % begin{latexonly} +.. % \subsection{Tix Class Structure} +.. % +.. % \begin{figure}[hbtp] +.. % \centerline{\epsfig{file=hierarchy.png,width=.9\textwidth}} +.. % \vspace{.5cm} +.. % \caption{The Class Hierarchy of Tix Widgets} +.. % \end{figure} +.. 
% end{latexonly} + + +Tix Commands +------------ + + +.. class:: tixCommand() + + The `tix commands + <http://tix.sourceforge.net/dist/current/man/html/TixCmd/tix.htm>`_ provide + access to miscellaneous elements of :mod:`Tix`'s internal state and the + :mod:`Tix` application context. Most of the information manipulated by these + methods pertains to the application as a whole, or to a screen or display, + rather than to a particular window. + + To view the current settings, the common usage is:: + + import Tix + root = Tix.Tk() + print root.tix_configure() + + +.. method:: tixCommand.tix_configure([cnf,] **kw) + + Query or modify the configuration options of the Tix application context. If no + option is specified, returns a dictionary of all the available options. If + option is specified with no value, then the method returns a list describing the + one named option (this list will be identical to the corresponding sublist of + the value returned if no option is specified). If one or more option-value + pairs are specified, then the method modifies the given option(s) to have the + given value(s); in this case the method returns an empty string. Option may be + any of the configuration options. + + +.. method:: tixCommand.tix_cget(option) + + Returns the current value of the configuration option given by *option*. Option + may be any of the configuration options. + + +.. method:: tixCommand.tix_getbitmap(name) + + Locates a bitmap file of the name ``name.xpm`` or ``name`` in one of the bitmap + directories (see the :meth:`tix_addbitmapdir` method). By using + :meth:`tix_getbitmap`, you can avoid hard coding the pathnames of the bitmap + files in your application. When successful, it returns the complete pathname of + the bitmap file, prefixed with the character ``@``. The returned value can be + used to configure the ``bitmap`` option of the Tk and Tix widgets. + + +.. 
method:: tixCommand.tix_addbitmapdir(directory) + + Tix maintains a list of directories under which the :meth:`tix_getimage` and + :meth:`tix_getbitmap` methods will search for image files. The standard bitmap + directory is :file:`$TIX_LIBRARY/bitmaps`. The :meth:`tix_addbitmapdir` method + adds *directory* into this list. By using this method, the image files of an + application can also be located using the :meth:`tix_getimage` or + :meth:`tix_getbitmap` method. + + +.. method:: tixCommand.tix_filedialog([dlgclass]) + + Returns the file selection dialog that may be shared among different calls from + this application. This method will create a file selection dialog widget when + it is called the first time. This dialog will be returned by all subsequent + calls to :meth:`tix_filedialog`. An optional dlgclass parameter can be passed + as a string to specify what type of file selection dialog widget is desired. + Possible options are ``tix``, ``FileSelectDialog`` or ``tixExFileSelectDialog``. + + +.. method:: tixCommand.tix_getimage(name) + + Locates an image file of the name :file:`name.xpm`, :file:`name.xbm` or + :file:`name.ppm` in one of the bitmap directories (see the + :meth:`tix_addbitmapdir` method above). If more than one file with the same name + (but different extensions) exist, then the image type is chosen according to the + depth of the X display: xbm images are chosen on monochrome displays and color + images are chosen on color displays. By using :meth:`tix_getimage`, you can + avoid hard coding the pathnames of the image files in your application. When + successful, this method returns the name of the newly created image, which can + be used to configure the ``image`` option of the Tk and Tix widgets. + + +.. method:: tixCommand.tix_option_get(name) + + Gets the options maintained by the Tix scheme mechanism. + + +.. 
method:: tixCommand.tix_resetoptions(newScheme, newFontSet[, newScmPrio]) + + Resets the scheme and fontset of the Tix application to *newScheme* and + *newFontSet*, respectively. This affects only those widgets created after this + call. Therefore, it is best to call the resetoptions method before the creation + of any widgets in a Tix application. + + The optional parameter *newScmPrio* can be given to reset the priority level of + the Tk options set by the Tix schemes. + + Because of the way Tk handles the X option database, after Tix has been + imported and initialized, it is not possible to reset the color schemes and font sets + using the :meth:`tix_config` method. Instead, the :meth:`tix_resetoptions` + method must be used. diff --git a/Doc/library/tk.rst b/Doc/library/tk.rst new file mode 100644 index 0000000..bb852d2 --- /dev/null +++ b/Doc/library/tk.rst @@ -0,0 +1,43 @@ +.. _tkinter: + +********************************* +Graphical User Interfaces with Tk +********************************* + +.. index:: + single: GUI + single: Graphical User Interface + single: Tkinter + single: Tk + +Tk/Tcl has long been an integral part of Python. It provides a robust and +platform-independent windowing toolkit that is available to Python programmers +using the :mod:`Tkinter` module, and its extension, the :mod:`Tix` module. + +The :mod:`Tkinter` module is a thin object-oriented layer on top of Tcl/Tk. To +use :mod:`Tkinter`, you don't need to write Tcl code, but you will need to +consult the Tk documentation, and occasionally the Tcl documentation. +:mod:`Tkinter` is a set of wrappers that implement the Tk widgets as Python +classes. In addition, the internal module :mod:`_tkinter` provides a threadsafe +mechanism which allows Python and Tcl to interact. + +:mod:`Tkinter`'s chief virtues are that it is fast, and that it usually comes +bundled with Python. 
Although it has been used to create some very good +applications, including IDLE, it has weak documentation and an outdated look and +feel. For more modern, better documented, and much more extensive GUI +libraries, see the :ref:`other-gui-packages` section. + +.. toctree:: + + tkinter.rst + tix.rst + scrolledtext.rst + turtle.rst + idle.rst + othergui.rst + +.. % Other sections I have in mind are +.. % Tkinter internals +.. % Freezing Tkinter applications + + diff --git a/Doc/library/tkinter.rst b/Doc/library/tkinter.rst new file mode 100644 index 0000000..d52c1e0 --- /dev/null +++ b/Doc/library/tkinter.rst @@ -0,0 +1,840 @@ +:mod:`Tkinter` --- Python interface to Tcl/Tk +============================================= + +.. module:: Tkinter + :synopsis: Interface to Tcl/Tk for graphical user interfaces +.. moduleauthor:: Guido van Rossum <guido@Python.org> + + +The :mod:`Tkinter` module ("Tk interface") is the standard Python interface to +the Tk GUI toolkit. Both Tk and :mod:`Tkinter` are available on most Unix +platforms, as well as on Windows and Macintosh systems. (Tk itself is not part +of Python; it is maintained at ActiveState.) + + +.. seealso:: + + `Python Tkinter Resources <http://www.python.org/topics/tkinter/>`_ + The Python Tkinter Topic Guide provides a great deal of information on using Tk + from Python and links to other sources of information on Tk. + + `An Introduction to Tkinter <http://www.pythonware.com/library/an-introduction-to-tkinter.htm>`_ + Fredrik Lundh's on-line reference material. + + `Tkinter reference: a GUI for Python <http://www.nmt.edu/tcc/help/pubs/lang.html>`_ + On-line reference material. + + `Tkinter for JPython <http://jtkinter.sourceforge.net>`_ + The Jython interface to Tkinter. + + `Python and Tkinter Programming <http://www.amazon.com/exec/obidos/ASIN/1884777813>`_ + The book by John Grayson (ISBN 1-884777-81-3). 
+ + +Tkinter Modules +--------------- + +Most of the time, the :mod:`Tkinter` module is all you really need, but a number +of additional modules are available as well. The Tk interface is located in a +binary module named :mod:`_tkinter`. This module contains the low-level +interface to Tk, and should never be used directly by application programmers. +It is usually a shared library (or DLL), but might in some cases be statically +linked with the Python interpreter. + +In addition to the Tk interface module, :mod:`Tkinter` includes a number of +Python modules. The two most important modules are the :mod:`Tkinter` module +itself, and a module called :mod:`Tkconstants`. The former automatically imports +the latter, so to use Tkinter, all you need to do is to import one module:: + + import Tkinter + +Or, more often:: + + from Tkinter import * + + +.. class:: Tk(screenName=None, baseName=None, className='Tk', useTk=1) + + The :class:`Tk` class is instantiated without arguments. This creates a toplevel + widget of Tk which usually is the main window of an application. Each instance + has its own associated Tcl interpreter. + + .. % FIXME: The following keyword arguments are currently recognized: + + .. versionchanged:: 2.4 + The *useTk* parameter was added. + + +.. function:: Tcl(screenName=None, baseName=None, className='Tk', useTk=0) + + The :func:`Tcl` function is a factory function which creates an object much like + that created by the :class:`Tk` class, except that it does not initialize the Tk + subsystem. This is most often useful when driving the Tcl interpreter in an + environment where one doesn't want to create extraneous toplevel windows, or + where one cannot (such as Unix/Linux systems without an X server). An object + created by the :func:`Tcl` function can have a Toplevel window created (and the Tk + subsystem initialized) by calling its :meth:`loadtk` method. + + .. 
versionadded:: 2.4 + +Other modules that provide Tk support include: + +:mod:`ScrolledText` + Text widget with a vertical scroll bar built in. + +:mod:`tkColorChooser` + Dialog to let the user choose a color. + +:mod:`tkCommonDialog` + Base class for the dialogs defined in the other modules listed here. + +:mod:`tkFileDialog` + Common dialogs to allow the user to specify a file to open or save. + +:mod:`tkFont` + Utilities to help work with fonts. + +:mod:`tkMessageBox` + Access to standard Tk dialog boxes. + +:mod:`tkSimpleDialog` + Basic dialogs and convenience functions. + +:mod:`Tkdnd` + Drag-and-drop support for :mod:`Tkinter`. This is experimental and should become + deprecated when it is replaced with the Tk DND. + +:mod:`turtle` + Turtle graphics in a Tk window. + + +Tkinter Life Preserver +---------------------- + +.. sectionauthor:: Matt Conway + + +This section is not designed to be an exhaustive tutorial on either Tk or +Tkinter. Rather, it is intended as a stop gap, providing some introductory +orientation on the system. + +.. % Converted to LaTeX by Mike Clarkson. + +Credits: + +* Tkinter was written by Steen Lumholt and Guido van Rossum. + +* Tk was written by John Ousterhout while at Berkeley. + +* This Life Preserver was written by Matt Conway at the University of Virginia. + +* The html rendering, and some liberal editing, was produced from a FrameMaker + version by Ken Manheimer. + +* Fredrik Lundh elaborated and revised the class interface descriptions, to get + them current with Tk 4.2. + +* Mike Clarkson converted the documentation to LaTeX, and compiled the User + Interface chapter of the reference manual. + + +How To Use This Section +^^^^^^^^^^^^^^^^^^^^^^^ + +This section is designed in two parts: the first half (roughly) covers +background material, while the second half can be taken to the keyboard as a +handy reference. 
+ +When trying to answer questions of the form "how do I do blah", it is often best +to find out how to do "blah" in straight Tk, and then convert this back into the +corresponding :mod:`Tkinter` call. Python programmers can often guess at the +correct Python command by looking at the Tk documentation. This means that in +order to use Tkinter, you will have to know a little bit about Tk. This document +can't fulfill that role, so the best we can do is point you to the best +documentation that exists. Here are some hints: + +* The authors strongly suggest getting a copy of the Tk man pages. Specifically, + the man pages in the ``mann`` directory are most useful. The ``man3`` man pages + describe the C interface to the Tk library and thus are not especially helpful + for script writers. + +* Addison-Wesley publishes a book called Tcl and the Tk Toolkit by John + Ousterhout (ISBN 0-201-63337-X) which is a good introduction to Tcl and Tk for + the novice. The book is not exhaustive, and for many details it defers to the + man pages. + +* :file:`Tkinter.py` is a last resort for most, but can be a good place to go + when nothing else makes sense. + + +.. seealso:: + + `ActiveState Tcl Home Page <http://tcl.activestate.com/>`_ + The Tk/Tcl development is largely taking place at ActiveState. + + `Tcl and the Tk Toolkit <http://www.amazon.com/exec/obidos/ASIN/020163337X>`_ + The book by John Ousterhout, the inventor of Tcl. + + `Practical Programming in Tcl and Tk <http://www.amazon.com/exec/obidos/ASIN/0130220280>`_ + Brent Welch's encyclopedic book. + + +A Simple Hello World Program +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. % HelloWorld.html +.. % begin{latexonly} +.. % \begin{figure}[hbtp] +.. % \centerline{\epsfig{file=HelloWorld.gif,width=.9\textwidth}} +.. % \vspace{.5cm} +.. % \caption{HelloWorld gadget image} +.. % \end{figure} +.. % See also the hello-world \ulink{notes}{classes/HelloWorld-notes.html} and +.. % \ulink{summary}{classes/HelloWorld-summary.html}. +.. 
% end{latexonly} + +:: + + from Tkinter import * + + class Application(Frame): + def say_hi(self): + print "hi there, everyone!" + + def createWidgets(self): + self.QUIT = Button(self) + self.QUIT["text"] = "QUIT" + self.QUIT["fg"] = "red" + self.QUIT["command"] = self.quit + + self.QUIT.pack({"side": "left"}) + + self.hi_there = Button(self) + self.hi_there["text"] = "Hello" + self.hi_there["command"] = self.say_hi + + self.hi_there.pack({"side": "left"}) + + def __init__(self, master=None): + Frame.__init__(self, master) + self.pack() + self.createWidgets() + + root = Tk() + app = Application(master=root) + app.mainloop() + root.destroy() + + +A (Very) Quick Look at Tcl/Tk +----------------------------- + +The class hierarchy looks complicated, but in actual practice, application +programmers almost always refer to the classes at the very bottom of the +hierarchy. + +.. % BriefTclTk.html + +Notes: + +* These classes are provided for the purposes of organizing certain functions + under one namespace. They aren't meant to be instantiated independently. + +* The :class:`Tk` class is meant to be instantiated only once in an application. + Application programmers need not instantiate one explicitly; the system creates + one whenever any of the other classes are instantiated. + +* The :class:`Widget` class is not meant to be instantiated; it is meant only + for subclassing to make "real" widgets (in C++, this is called an 'abstract + class'). + +To make use of this reference material, there will be times when you will need +to know how to read short passages of Tk and how to identify the various parts +of a Tk command. (See section :ref:`tkinter-basic-mapping` for the +:mod:`Tkinter` equivalents of what's below.) + +Tk scripts are Tcl programs. Like all Tcl programs, Tk scripts are just lists +of tokens separated by spaces. A Tk widget is just its *class*, the *options* +that help configure it, and the *actions* that make it do useful things. 
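As a rough illustration of the token-list idea above, the sketch below splits one Tk command line into its class command, its pathname, and its option pairs. It rests on two assumptions: Python's ``shlex`` module is used as a stand-in tokenizer (it follows POSIX shell quoting, not Tcl's rules, so brace quoting and ``$`` or bracket substitution are not handled), and ``split_tk_command`` is a hypothetical helper written for this example, not part of Tkinter.

```python
import shlex


def split_tk_command(command):
    """Split a simple Tk widget-creation command into its three parts.

    Hypothetical helper for illustration only: it assumes the command has
    the shape ``classCommand pathname -flag value -flag value ...`` and
    uses shell-style quoting rather than real Tcl parsing.
    """
    tokens = shlex.split(command)
    class_command, pathname = tokens[0], tokens[1]
    # Everything after the pathname is assumed to alternate flag, value.
    flags = tokens[2::2]
    values = tokens[3::2]
    options = {flag.lstrip("-"): value for flag, value in zip(flags, values)}
    return class_command, pathname, options


parts = split_tk_command('button .fred -fg red -text "hi there"')
```

Here ``parts`` is ``('button', '.fred', {'fg': 'red', 'text': 'hi there'})``: the quoted value stays one token, and each flag loses its leading hyphen.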
+ +To make a widget in Tk, the command is always of the form:: + + classCommand newPathname options + +*classCommand* + denotes which kind of widget to make (a button, a label, a menu...) + +*newPathname* + is the new name for this widget. All names in Tk must be unique. To help + enforce this, widgets in Tk are named with *pathnames*, just like files in a + file system. The top level widget, the *root*, is called ``.`` (period) and + children are delimited by more periods. For example, + ``.myApp.controlPanel.okButton`` might be the name of a widget. + +*options* + configure the widget's appearance and in some cases, its behavior. The options + come in the form of a list of flags and values. Flags are preceded by a '-', + like Unix shell command flags, and values are put in quotes if they are more + than one word. + +For example:: + + button .fred -fg red -text "hi there" + ^ ^ \_____________________/ + | | | + class new options + command widget (-opt val -opt val ...) + +Once created, the pathname to the widget becomes a new command. This new +*widget command* is the programmer's handle for getting the new widget to +perform some *action*. In C, you'd express this as someAction(fred, +someOptions), in C++, you would express this as fred.someAction(someOptions), +and in Tk, you say:: + + .fred someAction someOptions + +Note that the object name, ``.fred``, starts with a dot. + +As you'd expect, the legal values for *someAction* will depend on the widget's +class: ``.fred disable`` works if fred is a button (fred gets greyed out), but +does not work if fred is a label (disabling of labels is not supported in Tk). + +The legal values of *someOptions* are action dependent. Some actions, like +``disable``, require no arguments, others, like a text-entry box's ``delete`` +command, would need arguments to specify what range of text to delete. + + +.. 
_tkinter-basic-mapping: + +Mapping Basic Tk into Tkinter +----------------------------- + +Class commands in Tk correspond to class constructors in Tkinter. :: + + button .fred =====> fred = Button() + +The master of an object is implicit in the new name given to it at creation +time. In Tkinter, masters are specified explicitly. :: + + button .panel.fred =====> fred = Button(panel) + +The configuration options in Tk are given in lists of hyphenated tags followed by +values. In Tkinter, options are specified as keyword-arguments in the instance +constructor, and keyword-args for configure calls or as instance indices, in +dictionary style, for established instances. See section +:ref:`tkinter-setting-options` on setting options. :: + + button .fred -fg red =====> fred = Button(panel, fg = "red") + .fred configure -fg red =====> fred["fg"] = "red" + OR ==> fred.config(fg = "red") + +In Tk, to perform an action on a widget, use the widget name as a command, and +follow it with an action name, possibly with arguments (options). In Tkinter, +you call methods on the class instance to invoke actions on the widget. The +actions (methods) that a given widget can perform are listed in the Tkinter.py +module. :: + + .fred invoke =====> fred.invoke() + +To give a widget to the packer (geometry manager), you call pack with optional +arguments. In Tkinter, the Pack class holds all this functionality, and the +various forms of the pack command are implemented as methods. All widgets in +:mod:`Tkinter` are subclassed from the Packer, and so inherit all the packing +methods. See the :mod:`Tix` module documentation for additional information on +the Form geometry manager. :: + + pack .fred -side left =====> fred.pack(side = "left") + + +How Tk and Tkinter are Related +------------------------------ + +.. % Relationship.html + +.. note:: + + This was derived from a graphical image; the image will be used more directly in + a subsequent version of this document.
+ +From the top down: + +Your App Here (Python) + A Python application makes a :mod:`Tkinter` call. + +Tkinter (Python Module) + This call (say, for example, creating a button widget) is implemented in the + *Tkinter* module, which is written in Python. This Python function will parse + the commands and the arguments and convert them into a form that makes them look + as if they had come from a Tk script instead of a Python script. + +tkinter (C) + These commands and their arguments will be passed to a C function in the + *tkinter* - note the lowercase - extension module. + +Tk Widgets (C and Tcl) + This C function is able to make calls into other C modules, including the C + functions that make up the Tk library. Tk is implemented in C and some Tcl. + The Tcl part of the Tk widgets is used to bind certain default behaviors to + widgets, and is executed once at the point where the Python :mod:`Tkinter` + module is imported. (The user never sees this stage.) + +Tk (C) + The Tk part of the Tk Widgets implements the final mapping to ... + +Xlib (C) + the Xlib library to draw graphics on the screen. + + +Handy Reference +--------------- + + +.. _tkinter-setting-options: + +Setting Options +^^^^^^^^^^^^^^^ + +Options control things like the color and border width of a widget. Options can +be set in three ways: + +At object creation time, using keyword arguments + :: + + fred = Button(self, fg = "red", bg = "blue") + +After object creation, treating the option name like a dictionary index + :: + + fred["fg"] = "red" + fred["bg"] = "blue" + +After object creation, using the config() method to update several options at once + :: + + fred.config(fg = "red", bg = "blue") + +For a complete explanation of a given option and its behavior, see the Tk man +pages for the widget in question. + +Note that the man pages list "STANDARD OPTIONS" and "WIDGET SPECIFIC OPTIONS" +for each widget.
The former is a list of options that are common to many +widgets; the latter are the options that are idiosyncratic to that particular +widget. The Standard Options are documented on the :manpage:`options(3)` man +page. + +No distinction between standard and widget-specific options is made in this +document. Some options don't apply to some kinds of widgets. Whether a given +widget responds to a particular option depends on the class of the widget; +buttons have a ``command`` option, labels do not. + +The options supported by a given widget are listed in that widget's man page, or +can be queried at runtime by calling the :meth:`config` method without +arguments, or by calling the :meth:`keys` method on that widget. The +:meth:`config` call returns a dictionary whose keys are option names as +strings (for example, ``'relief'``) and whose values are 5-tuples; +:meth:`keys` returns just the list of option names. + +Some options, like ``bg``, are synonyms for common options with long names +(``bg`` is shorthand for "background"). Passing the ``config()`` method the name +of a shorthand option will return a 2-tuple, not a 5-tuple. The 2-tuple passed +back will contain the name of the synonym and the "real" option (such as +``('bg', 'background')``).
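The two shapes of entry described above can be told apart by length. The following is a minimal sketch, not Tkinter code: ``summarize_option`` is a hypothetical helper, and ``sample`` is hand-written data that mirrors this section's examples rather than the output of a real widget. It assumes the five fields are, in order, the option name, the database name, the database class, the default value, and the current value.

```python
def summarize_option(entry):
    """Describe one value from the dictionary returned by config().

    Assumption: a 2-tuple is a synonym entry and a 5-tuple is a full entry
    laid out as (name, database name, database class, default, current).
    """
    if len(entry) == 2:
        # Synonym entry: (short name, real option name).
        short_name, real_name = entry
        return "%s is a synonym for %s" % (short_name, real_name)
    # Full entry: unpack all five fields and report the interesting ones.
    name, db_name, db_class, default, current = entry
    return "%s: default=%r, current=%r" % (name, default, current)


# Hand-written sample data, standing in for the result of fred.config().
sample = {
    "relief": ("relief", "relief", "Relief", "raised", "groove"),
    "bg": ("bg", "background"),
}

for key in sorted(sample):
    print(summarize_option(sample[key]))
```

With a real widget, the same helper could be applied to each value of the dictionary that ``config()`` returns.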
+ ++-------+---------------------------------+--------------+ +| Index | Meaning | Example | ++=======+=================================+==============+ +| 0 | option name | ``'relief'`` | ++-------+---------------------------------+--------------+ +| 1 | option name for database lookup | ``'relief'`` | ++-------+---------------------------------+--------------+ +| 2 | option class for database | ``'Relief'`` | +| | lookup | | ++-------+---------------------------------+--------------+ +| 3 | default value | ``'raised'`` | ++-------+---------------------------------+--------------+ +| 4 | current value | ``'groove'`` | ++-------+---------------------------------+--------------+ + +Example:: + + >>> print fred.config() + {'relief' : ('relief', 'relief', 'Relief', 'raised', 'groove')} + +Of course, the dictionary printed will include all the options available and +their values. This is meant only as an example. + + +The Packer +^^^^^^^^^^ + +.. index:: single: packing (widgets) + +.. % Packer.html + +The packer is one of Tk's geometry-management mechanisms. Geometry managers +are used to specify the relative positioning of widgets +within their container - their mutual *master*. In contrast to the more +cumbersome *placer* (which is used less commonly and is not covered here), the +packer takes qualitative relationship specifications - *above*, *to the left of*, +*filling*, etc. - and works everything out to determine the exact placement +coordinates for you. + +.. % See also \citetitle[classes/ClassPacker.html]{the Packer class interface}. + +The size of any *master* widget is determined by the size of the "slave widgets" +inside. The packer is used to control where slave widgets appear inside the +master into which they are packed. You can pack widgets into frames, and frames +into other frames, in order to achieve the kind of layout you desire.
+Additionally, the arrangement is dynamically adjusted to accommodate incremental +changes to the configuration, once it is packed. + +Note that widgets do not appear until they have had their geometry specified +with a geometry manager. It's a common early mistake to leave out the geometry +specification, and then be surprised when the widget is created but nothing +appears. A widget will appear only after it has had, for example, the packer's +:meth:`pack` method applied to it. + +The pack() method can be called with keyword-option/value pairs that control +where the widget is to appear within its container, and how it is to behave when +the main application window is resized. Here are some examples:: + + fred.pack() # defaults to side = "top" + fred.pack(side = "left") + fred.pack(expand = 1) + + +Packer Options +^^^^^^^^^^^^^^ + +For more extensive information on the packer and the options that it can take, +see the man pages and page 183 of John Ousterhout's book. + +anchor + Anchor type. Denotes where the packer is to place each slave in its parcel. + +expand + Boolean, ``0`` or ``1``. + +fill + Legal values: ``'x'``, ``'y'``, ``'both'``, ``'none'``. + +ipadx and ipady + A distance - designating internal padding on each side of the slave widget. + +padx and pady + A distance - designating external padding on each side of the slave widget. + +side + Legal values are: ``'left'``, ``'right'``, ``'top'``, ``'bottom'``. + + +Coupling Widget Variables +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The current-value setting of some widgets (like text entry widgets) can be +connected directly to application variables by using special options. These +options are ``variable``, ``textvariable``, ``onvalue``, ``offvalue``, and +``value``. This connection works both ways: if the variable changes for any +reason, the widget it's connected to will be updated to reflect the new value. + +.. 
% VarCouplings.html + +Unfortunately, in the current implementation of :mod:`Tkinter` it is not +possible to hand over an arbitrary Python variable to a widget through a +``variable`` or ``textvariable`` option. The only kinds of variables for which +this works are variables that are subclassed from a class called Variable, +defined in the :mod:`Tkinter` module. + +There are many useful subclasses of Variable already defined: +:class:`StringVar`, :class:`IntVar`, :class:`DoubleVar`, and +:class:`BooleanVar`. To read the current value of such a variable, call the +:meth:`get` method on it, and to change its value you call the :meth:`set` +method. If you follow this protocol, the widget will always track the value of +the variable, with no further intervention on your part. + +For example:: + + class App(Frame): + def __init__(self, master=None): + Frame.__init__(self, master) + self.pack() + + self.entrythingy = Entry() + self.entrythingy.pack() + + # here is the application variable + self.contents = StringVar() + # set it to some value + self.contents.set("this is a variable") + # tell the entry widget to watch this variable + self.entrythingy["textvariable"] = self.contents + + # and here we get a callback when the user hits return. + # we will have the program print out the value of the + # application variable when the user hits return + self.entrythingy.bind('<Key-Return>', + self.print_contents) + + def print_contents(self, event): + print "hi. contents of entry is now ---->", \ + self.contents.get() + + +The Window Manager +^^^^^^^^^^^^^^^^^^ + +.. index:: single: window manager (widgets) + +.. % WindowMgr.html + +In Tk, there is a utility command, ``wm``, for interacting with the window +manager. Options to the ``wm`` command allow you to control things like titles, +placement, icon bitmaps, and the like. In :mod:`Tkinter`, these commands have +been implemented as methods on the :class:`Wm` class. 
Toplevel widgets are +subclassed from the :class:`Wm` class, and so can call the :class:`Wm` methods +directly. + +To get at the toplevel window that contains a given widget, you can often just +refer to the widget's master. Of course if the widget has been packed inside of +a frame, the master won't represent a toplevel window. To get at the toplevel +window that contains an arbitrary widget, you can call the :meth:`_root` method. +This method begins with an underscore to denote the fact that this function is +part of the implementation, and not an interface to Tk functionality. + +.. % See also \citetitle[classes/ClassWm.html]{the Wm class interface}. + +Here are some examples of typical usage:: + + from Tkinter import * + class App(Frame): + def __init__(self, master=None): + Frame.__init__(self, master) + self.pack() + + + # create the application + myapp = App() + + # + # here are method calls to the window manager class + # + myapp.master.title("My Do-Nothing Application") + myapp.master.maxsize(1000, 400) + + # start the program + myapp.mainloop() + + +Tk Option Data Types +^^^^^^^^^^^^^^^^^^^^ + +.. index:: single: Tk Option Data Types + +.. % OptionTypes.html + +anchor + Legal values are points of the compass: ``"n"``, ``"ne"``, ``"e"``, ``"se"``, + ``"s"``, ``"sw"``, ``"w"``, ``"nw"``, and also ``"center"``. + +bitmap + There are eight built-in, named bitmaps: ``'error'``, ``'gray25'``, + ``'gray50'``, ``'hourglass'``, ``'info'``, ``'questhead'``, ``'question'``, + ``'warning'``. To specify an X bitmap filename, give the full path to the file, + preceded with an ``@``, as in ``"@/usr/contrib/bitmap/gumby.bit"``. + +boolean + You can pass integers 0 or 1 or the strings ``"yes"`` or ``"no"`` . + +callback + This is any Python function that takes no arguments. 
For example:: + + def print_it(): + print "hi there" + fred["command"] = print_it + +color + Colors can be given as the names of X colors in the rgb.txt file, or as strings + representing RGB values in 4 bit: ``"#RGB"``, 8 bit: ``"#RRGGBB"``, 12 bit: + ``"#RRRGGGBBB"``, or 16 bit: ``"#RRRRGGGGBBBB"`` ranges, where R, G, B here + represent any legal hex digit. See page 160 of Ousterhout's book for details. + +cursor + The standard X cursor names from :file:`cursorfont.h` can be used, without the + ``XC_`` prefix. For example, to get a hand cursor (:const:`XC_hand2`), use the + string ``"hand2"``. You can also specify a bitmap and mask file of your own. + See page 179 of Ousterhout's book. + +distance + Screen distances can be specified in either pixels or absolute distances. + Pixels are given as numbers and absolute distances as strings, with the trailing + character denoting units: ``c`` for centimetres, ``i`` for inches, ``m`` for + millimetres, ``p`` for printer's points. For example, 3.5 inches is expressed + as ``"3.5i"``. + +font + Tk uses a list font name format, such as ``{courier 10 bold}``. Font sizes with + positive numbers are measured in points; sizes with negative numbers are + measured in pixels. + +geometry + This is a string of the form ``widthxheight``, where width and height are + measured in pixels for most widgets (in characters for widgets displaying text). + For example: ``fred["geometry"] = "200x100"``. + +justify + Legal values are the strings: ``"left"``, ``"center"``, ``"right"``, and + ``"fill"``. + +region + This is a string with four space-delimited elements, each of which is a legal + distance (see above). For example: ``"2 3 4 5"`` and ``"3i 2i 4.5i 2i"`` and + ``"3c 2c 4c 10.43c"`` are all legal regions. + +relief + Determines what the border style of a widget will be. Legal values are: + ``"raised"``, ``"sunken"``, ``"flat"``, ``"groove"``, and ``"ridge"``.
+ +scrollcommand + This is almost always the :meth:`set` method of some scrollbar widget, but can + be any widget method that takes a single argument. Refer to the file + :file:`Demo/tkinter/matt/canvas-with-scrollbars.py` in the Python source + distribution for an example. + +wrap + Must be one of: ``"none"``, ``"char"``, or ``"word"``. + + +Bindings and Events +^^^^^^^^^^^^^^^^^^^ + +.. index:: + single: bind (widgets) + single: events (widgets) + +.. % Bindings.html + +The bind method from the widget command allows you to watch for certain events +and to have a callback function trigger when that event type occurs. The form +of the bind method is:: + + def bind(self, sequence, func, add=''): + +where: + +sequence + is a string that denotes the target kind of event. (See the bind man page and + page 201 of John Ousterhout's book for details.) + +func + is a Python function, taking one argument, to be invoked when the event occurs. + An Event instance will be passed as the argument. (Functions deployed this way + are commonly known as *callbacks*.) + +add + is optional, either ``''`` or ``'+'``. Passing an empty string denotes that + this binding is to replace any other bindings that this event is associated + with. Passing a ``'+'`` means that this function is to be added to the list + of functions bound to this event type. + +For example:: + + def turnRed(self, event): + event.widget["activeforeground"] = "red" + + self.button.bind("<Enter>", self.turnRed) + +Notice how the widget field of the event is being accessed in the +:meth:`turnRed` callback. This field contains the widget that caught the X +event. The following table lists the other event fields you can access, and how +they are denoted in Tk, which can be useful when referring to the Tk man pages.
+:: + + Tk Tkinter Event Field Tk Tkinter Event Field + -- ------------------- -- ------------------- + %f focus %A char + %h height %E send_event + %k keycode %K keysym + %s state %N keysym_num + %t time %T type + %w width %W widget + %x x %X x_root + %y y %Y y_root + + +The index Parameter +^^^^^^^^^^^^^^^^^^^ + +A number of widgets require "index" parameters to be passed. These are used to +point at a specific place in a Text widget, or to particular characters in an +Entry widget, or to particular menu items in a Menu widget. + +.. % Index.html + +Entry widget indexes (index, view index, etc.) + Entry widgets have options that refer to character positions in the text being + displayed. You can use these :mod:`Tkinter` functions to access these special + points in text widgets: + + AtEnd() + refers to the last position in the text + + AtInsert() + refers to the point where the text cursor is + + AtSelFirst() + indicates the beginning point of the selected text + + AtSelLast() + denotes the last point of the selected text and finally + + At(x[, y]) + refers to the character at pixel location *x*, *y* (with *y* not used in the + case of a text entry widget, which contains a single line of text). + +Text widget indexes + The index notation for Text widgets is very rich and is best described in the Tk + man pages. + +Menu indexes (menu.invoke(), menu.entryconfig(), etc.) + Some options and methods for menus manipulate specific menu entries.
Anytime a + menu index is needed for an option or a parameter, you may pass in: + + * an integer which refers to the numeric position of the entry in the widget, + counted from the top, starting with 0; + + * the string ``'active'``, which refers to the menu position that is currently + under the cursor; + + * the string ``"last"`` which refers to the last menu item; + + * An integer preceded by ``@``, as in ``@6``, where the integer is interpreted + as a y pixel coordinate in the menu's coordinate system; + + * the string ``"none"``, which indicates no menu entry at all, most often used + with menu.activate() to deactivate all entries, and finally, + + * a text string that is pattern matched against the label of the menu entry, as + scanned from the top of the menu to the bottom. Note that this index type is + considered after all the others, which means that matches for menu items + labelled ``last``, ``active``, or ``none`` may be interpreted as the above + literals, instead. + + +Images +^^^^^^ + +Bitmap/Pixelmap images can be created through the subclasses of +:class:`Tkinter.Image`: + +* :class:`BitmapImage` can be used for X11 bitmap data. + +* :class:`PhotoImage` can be used for GIF and PPM/PGM color bitmaps. + +Either type of image is created through either the ``file`` or the ``data`` +option (other options are available as well). + +The image object can then be used wherever an ``image`` option is supported by +some widget (e.g. labels, buttons, menus). In these cases, Tk will not keep a +reference to the image. When the last Python reference to the image object is +deleted, the image data is deleted as well, and Tk will display an empty box +wherever the image was used. + diff --git a/Doc/library/token.rst b/Doc/library/token.rst new file mode 100644 index 0000000..5bf0ea8 --- /dev/null +++ b/Doc/library/token.rst @@ -0,0 +1,47 @@ + +:mod:`token` --- Constants used with Python parse trees +======================================================= + +.. 
module:: token + :synopsis: Constants representing terminal nodes of the parse tree. +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +This module provides constants which represent the numeric values of leaf nodes +of the parse tree (terminal tokens). Refer to the file :file:`Grammar/Grammar` +in the Python distribution for the definitions of the names in the context of +the language grammar. The specific numeric values which the names map to may +change between Python versions. + +This module also provides one data object and some functions. The functions +mirror definitions in the Python C header files. + + +.. data:: tok_name + + Dictionary mapping the numeric values of the constants defined in this module + back to name strings, allowing more human-readable representation of parse trees + to be generated. + + +.. function:: ISTERMINAL(x) + + Return true for terminal token values. + + +.. function:: ISNONTERMINAL(x) + + Return true for non-terminal token values. + + +.. function:: ISEOF(x) + + Return true if *x* is the marker indicating the end of input. + + +.. seealso:: + + Module :mod:`parser` + The second example for the :mod:`parser` module shows how to use the + :mod:`symbol` module. + diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst new file mode 100644 index 0000000..61f2c4d --- /dev/null +++ b/Doc/library/tokenize.rst @@ -0,0 +1,122 @@ + +:mod:`tokenize` --- Tokenizer for Python source +=============================================== + +.. module:: tokenize + :synopsis: Lexical scanner for Python source code. +.. moduleauthor:: Ka Ping Yee +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +The :mod:`tokenize` module provides a lexical scanner for Python source code, +implemented in Python. The scanner in this module returns comments as tokens as +well, making it useful for implementing "pretty-printers," including colorizers +for on-screen displays. + +The primary entry point is a generator: + + +.. 
function:: generate_tokens(readline) + + The :func:`generate_tokens` generator requires one argument, *readline*, which + must be a callable object which provides the same interface as the + :meth:`readline` method of built-in file objects (see section + :ref:`bltin-file-objects`). Each call to the function should return one line of + input as a string. + + The generator produces 5-tuples with these members: the token type; the token + string; a 2-tuple ``(srow, scol)`` of ints specifying the row and column where + the token begins in the source; a 2-tuple ``(erow, ecol)`` of ints specifying + the row and column where the token ends in the source; and the line on which the + token was found. The line passed is the *logical* line; continuation lines are + included. + + .. versionadded:: 2.2 + +An older entry point is retained for backward compatibility: + + +.. function:: tokenize(readline[, tokeneater]) + + The :func:`tokenize` function accepts two parameters: one representing the input + stream, and one providing an output mechanism for :func:`tokenize`. + + The first parameter, *readline*, must be a callable object which provides the + same interface as the :meth:`readline` method of built-in file objects (see + section :ref:`bltin-file-objects`). Each call to the function should return one + line of input as a string. Alternatively, *readline* may be a callable object that + signals completion by raising :exc:`StopIteration`. + + .. versionchanged:: 2.5 + Added :exc:`StopIteration` support. + + The second parameter, *tokeneater*, must also be a callable object. It is + called once for each token, with five arguments, corresponding to the tuples + generated by :func:`generate_tokens`. + +All constants from the :mod:`token` module are also exported from +:mod:`tokenize`, as are two additional token type values that might be passed to +the *tokeneater* function by :func:`tokenize`: + + +.. data:: COMMENT + + Token value used to indicate a comment. + + +..
data:: NL + + Token value used to indicate a non-terminating newline. The NEWLINE token + indicates the end of a logical line of Python code; NL tokens are generated when + a logical line of code is continued over multiple physical lines. + +Another function is provided to reverse the tokenization process. This is useful +for creating tools that tokenize a script, modify the token stream, and write +back the modified script. + + +.. function:: untokenize(iterable) + + Converts tokens back into Python source code. The *iterable* must return + sequences with at least two elements, the token type and the token string. Any + additional sequence elements are ignored. + + The reconstructed script is returned as a single string. The result is + guaranteed to tokenize back to match the input so that the conversion is + lossless and round-trips are assured. The guarantee applies only to the token + type and token string as the spacing between tokens (column positions) may + change. + + .. versionadded:: 2.5 + +Example of a script re-writer that transforms float literals into Decimal +objects:: + + def decistmt(s): + """Substitute Decimals for floats in a string of statements. + + >>> from decimal import Decimal + >>> s = 'print +21.3e-5*-.1234/81.7' + >>> decistmt(s) + "print +Decimal ('21.3e-5')*-Decimal ('.1234')/Decimal ('81.7')" + + >>> exec(s) + -3.21716034272e-007 + >>> exec(decistmt(s)) + -3.217160342717258261933904529E-7 + + """ + result = [] + g = generate_tokens(StringIO(s).readline) # tokenize the string + for toknum, tokval, _, _, _ in g: + if toknum == NUMBER and '.' 
in tokval: # replace NUMBER tokens + result.extend([ + (NAME, 'Decimal'), + (OP, '('), + (STRING, repr(tokval)), + (OP, ')') + ]) + else: + result.append((toknum, tokval)) + return untokenize(result) + diff --git a/Doc/library/trace.rst b/Doc/library/trace.rst new file mode 100644 index 0000000..91cf1a4 --- /dev/null +++ b/Doc/library/trace.rst @@ -0,0 +1,128 @@ + +:mod:`trace` --- Trace or track Python statement execution +========================================================== + +.. module:: trace + :synopsis: Trace or track Python statement execution. + + +The :mod:`trace` module allows you to trace program execution, generate +annotated statement coverage listings, print caller/callee relationships and +list functions executed during a program run. It can be used in another program +or from the command line. + + +.. _trace-cli: + +Command Line Usage +------------------ + +The :mod:`trace` module can be invoked from the command line. It can be as +simple as :: + + python -m trace --count somefile.py ... + +The above will generate annotated listings of all Python modules imported during +the execution of :file:`somefile.py`. + +The following command-line arguments are supported: + +:option:`--trace`, :option:`-t` + Display lines as they are executed. + +:option:`--count`, :option:`-c` + Produce a set of annotated listing files upon program completion that shows how + many times each statement was executed. + +:option:`--report`, :option:`-r` + Produce an annotated list from an earlier program run that used the + :option:`--count` and :option:`--file` arguments. + +:option:`--no-report`, :option:`-R` + Do not generate annotated listings. This is useful if you intend to make + several runs with :option:`--count` then produce a single set of annotated + listings at the end. + +:option:`--listfuncs`, :option:`-l` + List the functions executed by running the program. 
+ +:option:`--trackcalls`, :option:`-T` + Generate calling relationships exposed by running the program. + +:option:`--file`, :option:`-f` + Name a file containing (or to contain) counts. + +:option:`--coverdir`, :option:`-C` + Name a directory in which to save annotated listing files. + +:option:`--missing`, :option:`-m` + When generating annotated listings, mark lines which were not executed with + '``>>>>>>``'. + +:option:`--summary`, :option:`-s` + When using :option:`--count` or :option:`--report`, write a brief summary to + stdout for each file processed. + +:option:`--ignore-module` + Ignore the named module and its submodules (if it is a package). May be given + multiple times. + +:option:`--ignore-dir` + Ignore all modules and packages in the named directory and subdirectories. May + be given multiple times. + + +.. _trace-api: + +Programming Interface +--------------------- + + +.. class:: Trace([count=1[, trace=1[, countfuncs=0[, countcallers=0[, ignoremods=()[, ignoredirs=()[, infile=None[, outfile=None]]]]]]]]) + + Create an object to trace execution of a single statement or expression. All + parameters are optional. *count* enables counting of line numbers. *trace* + enables line execution tracing. *countfuncs* enables listing of the functions + called during the run. *countcallers* enables call relationship tracking. + *ignoremods* is a list of modules or packages to ignore. *ignoredirs* is a list + of directories whose modules or packages should be ignored. *infile* is the + file from which to read stored count information. *outfile* is a file in which + to write updated count information. + + +.. method:: Trace.run(cmd) + + Run *cmd* under control of the Trace object with the current tracing parameters. + + +.. method:: Trace.runctx(cmd[, globals=None[, locals=None]]) + + Run *cmd* under control of the Trace object with the current tracing parameters + in the defined global and local environments. 
If not defined, *globals* and + *locals* default to empty dictionaries. + + +.. method:: Trace.runfunc(func, *args, **kwds) + + Call *func* with the given arguments under control of the :class:`Trace` object + with the current tracing parameters. + +This is a simple example showing the use of this module:: + + import sys + import trace + + # create a Trace object, telling it what to ignore, and whether to + # do tracing or line-counting or both. + tracer = trace.Trace( + ignoredirs=[sys.prefix, sys.exec_prefix], + trace=0, + count=1) + + # run the new command using the given tracer + tracer.run('main()') + + # make a report, placing output in /tmp + r = tracer.results() + r.write_results(show_missing=True, coverdir="/tmp") + diff --git a/Doc/library/traceback.rst b/Doc/library/traceback.rst new file mode 100644 index 0000000..ec8687f --- /dev/null +++ b/Doc/library/traceback.rst @@ -0,0 +1,160 @@ + +:mod:`traceback` --- Print or retrieve a stack traceback +======================================================== + +.. module:: traceback + :synopsis: Print or retrieve a stack traceback. + + +This module provides a standard interface to extract, format and print stack +traces of Python programs. It exactly mimics the behavior of the Python +interpreter when it prints a stack trace. This is useful when you want to print +stack traces under program control, such as in a "wrapper" around the +interpreter. + +.. index:: object: traceback + +The module uses traceback objects --- this is the object type that is stored in +the ``sys.last_traceback`` variable and returned as the third item from +:func:`sys.exc_info`. + +The module defines the following functions: + + +.. function:: print_tb(traceback[, limit[, file]]) + + Print up to *limit* stack trace entries from *traceback*. If *limit* is omitted + or ``None``, all entries are printed. 
If *file* is omitted or ``None``, the + output goes to ``sys.stderr``; otherwise it should be an open file or file-like + object to receive the output. + + +.. function:: print_exception(type, value, traceback[, limit[, file]]) + + Print exception information and up to *limit* stack trace entries from + *traceback* to *file*. This differs from :func:`print_tb` in the following ways: + (1) if *traceback* is not ``None``, it prints a header ``Traceback (most recent + call last):``; (2) it prints the exception *type* and *value* after the stack + trace; (3) if *type* is :exc:`SyntaxError` and *value* has the appropriate + format, it prints the line where the syntax error occurred with a caret + indicating the approximate position of the error. + + +.. function:: print_exc([limit[, file]]) + + This is a shorthand for ``print_exception(*sys.exc_info())``. + + +.. function:: format_exc([limit]) + + This is like ``print_exc(limit)`` but returns a string instead of printing to a + file. + + .. versionadded:: 2.4 + + +.. function:: print_last([limit[, file]]) + + This is a shorthand for ``print_exception(sys.last_type, sys.last_value, + sys.last_traceback, limit, file)``. + + +.. function:: print_stack([f[, limit[, file]]]) + + This function prints a stack trace from its invocation point. The optional *f* + argument can be used to specify an alternate stack frame to start. The optional + *limit* and *file* arguments have the same meaning as for + :func:`print_exception`. + + +.. function:: extract_tb(traceback[, limit]) + + Return a list of up to *limit* "pre-processed" stack trace entries extracted + from the traceback object *traceback*. It is useful for alternate formatting of + stack traces. If *limit* is omitted or ``None``, all entries are extracted. A + "pre-processed" stack trace entry is a quadruple (*filename*, *line number*, + *function name*, *text*) representing the information that is usually printed + for a stack trace.
The *text* is a string with leading and trailing whitespace + stripped; if the source is not available it is ``None``. + + +.. function:: extract_stack([f[, limit]]) + + Extract the raw traceback from the current stack frame. The return value has + the same format as for :func:`extract_tb`. The optional *f* and *limit* + arguments have the same meaning as for :func:`print_stack`. + + +.. function:: format_list(list) + + Given a list of tuples as returned by :func:`extract_tb` or + :func:`extract_stack`, return a list of strings ready for printing. Each string + in the resulting list corresponds to the item with the same index in the + argument list. Each string ends in a newline; the strings may contain internal + newlines as well, for those items whose source text line is not ``None``. + + +.. function:: format_exception_only(type, value) + + Format the exception part of a traceback. The arguments are the exception type + and value such as given by ``sys.last_type`` and ``sys.last_value``. The return + value is a list of strings, each ending in a newline. Normally, the list + contains a single string; however, for :exc:`SyntaxError` exceptions, it + contains several lines that (when printed) display detailed information about + where the syntax error occurred. The message indicating which exception + occurred is always the last string in the list. + + +.. function:: format_exception(type, value, tb[, limit]) + + Format a stack trace and the exception information. The arguments have the + same meaning as the corresponding arguments to :func:`print_exception`. The + return value is a list of strings, each ending in a newline and some containing + internal newlines. When these lines are concatenated and printed, exactly the + same text is printed as by :func:`print_exception`. + + +.. function:: format_tb(tb[, limit]) + + A shorthand for ``format_list(extract_tb(tb, limit))``. + + +.. 
function:: format_stack([f[, limit]]) + + A shorthand for ``format_list(extract_stack(f, limit))``. + + +.. function:: tb_lineno(tb) + + This function returns the current line number set in the traceback object. This + function was necessary because in versions of Python prior to 2.3 when the + :option:`-O` flag was passed to Python the ``tb.tb_lineno`` was not updated + correctly. This function has no use in versions past 2.3. + + +.. _traceback-example: + +Traceback Example +----------------- + +This simple example implements a basic read-eval-print loop, similar to (but +less useful than) the standard Python interactive interpreter loop. For a more +complete implementation of the interpreter loop, refer to the :mod:`code` +module. :: + + import sys, traceback + + def run_user_code(envdir): + source = raw_input(">>> ") + try: + exec(source, envdir) + except: + print "Exception in user code:" + print '-'*60 + traceback.print_exc(file=sys.stdout) + print '-'*60 + + envdir = {} + while 1: + run_user_code(envdir) + diff --git a/Doc/library/tty.rst b/Doc/library/tty.rst new file mode 100644 index 0000000..688faee --- /dev/null +++ b/Doc/library/tty.rst @@ -0,0 +1,38 @@ + +:mod:`tty` --- Terminal control functions +========================================= + +.. module:: tty + :platform: Unix + :synopsis: Utility functions that perform common terminal control operations. +.. moduleauthor:: Steen Lumholt +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`tty` module defines functions for putting the tty into cbreak and raw +modes. + +Because it requires the :mod:`termios` module, it will work only on Unix. + +The :mod:`tty` module defines the following functions: + + +.. function:: setraw(fd[, when]) + + Change the mode of the file descriptor *fd* to raw. If *when* is omitted, it + defaults to :const:`termios.TCSAFLUSH`, and is passed to + :func:`termios.tcsetattr`. + + +.. 
function:: setcbreak(fd[, when]) + + Change the mode of file descriptor *fd* to cbreak. If *when* is omitted, it + defaults to :const:`termios.TCSAFLUSH`, and is passed to + :func:`termios.tcsetattr`. + + +.. seealso:: + + Module :mod:`termios` + Low-level terminal control interface. + diff --git a/Doc/library/turtle.rst b/Doc/library/turtle.rst new file mode 100644 index 0000000..354bb11 --- /dev/null +++ b/Doc/library/turtle.rst @@ -0,0 +1,312 @@ + +:mod:`turtle` --- Turtle graphics for Tk +======================================== + +.. module:: turtle + :platform: Tk + :synopsis: An environment for turtle graphics. +.. moduleauthor:: Guido van Rossum <guido@python.org> + + +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`turtle` module provides turtle graphics primitives, in both +object-oriented and procedure-oriented ways. Because it uses :mod:`Tkinter` for +the underlying graphics, it needs a version of Python installed with Tk support. + +The procedural interface uses a pen and a canvas which are automagically created +when any of the functions are called. + +The :mod:`turtle` module defines the following functions: + + +.. function:: degrees() + + Set angle measurement units to degrees. + + +.. function:: radians() + + Set angle measurement units to radians. + + +.. function:: setup(**kwargs) + + Sets the size and position of the main window. Keywords are: + + * ``width``: either a size in pixels or a fraction of the screen. The default is + 50% of the screen. + + * ``height``: either a size in pixels or a fraction of the screen. The default + is 50% of the screen. + + * ``startx``: starting position in pixels from the left edge of the screen. + ``None`` is the default value and centers the window horizontally on screen. + + * ``starty``: starting position in pixels from the top edge of the screen. + ``None`` is the default value and centers the window vertically on screen.
+ + Examples:: + + # Uses default geometry: 50% x 50% of screen, centered. + setup() + + # Sets window to 200x200 pixels, in upper left of screen + setup(width=200, height=200, startx=0, starty=0) + + # Sets window to 75% of screen by 50% of screen, and centers it. + setup(width=.75, height=0.5, startx=None, starty=None) + + +.. function:: title(title_str) + + Set the window's title to *title_str*. + + +.. function:: done() + + Enters the Tk main loop. The window will continue to be displayed until the + user closes it or the process is killed. + + +.. function:: reset() + + Clear the screen, re-center the pen, and set variables to the default values. + + +.. function:: clear() + + Clear the screen. + + +.. function:: tracer(flag) + + Set tracing on/off (according to whether *flag* is true or not). Tracing means + lines are drawn more slowly, with an animation of an arrow along the line. + + +.. function:: speed(speed) + + Set the speed of the turtle. Valid values for the parameter *speed* are + ``'fastest'`` (no delay), ``'fast'`` (delay 5ms), ``'normal'`` (delay 10ms), + ``'slow'`` (delay 15ms), and ``'slowest'`` (delay 20ms). + + .. versionadded:: 2.5 + + +.. function:: delay(delay) + + Set the speed of the turtle to *delay*, which is given in ms. + + .. versionadded:: 2.5 + + +.. function:: forward(distance) + + Go forward *distance* steps. + + +.. function:: backward(distance) + + Go backward *distance* steps. + + +.. function:: left(angle) + + Turn left *angle* units. Units are by default degrees, but can be set via the + :func:`degrees` and :func:`radians` functions. + + +.. function:: right(angle) + + Turn right *angle* units. Units are by default degrees, but can be set via the + :func:`degrees` and :func:`radians` functions. + + +.. function:: up() + + Move the pen up --- stop drawing. + + +.. function:: down() + + Move the pen down --- draw when moving. + + +.. function:: width(width) + + Set the line width to *width*. + + +.. 
function:: color(s) + color((r, g, b)) + color(r, g, b) + + Set the pen color. In the first form, the color is given as a Tk color + specification string. The second form specifies the color as a tuple of + the RGB values, each in the range [0..1]. For the third form, the color is + specified giving the RGB values as three separate parameters (each in the range + [0..1]). + + +.. function:: write(text[, move]) + + Write *text* at the current pen position. If *move* is true, the pen is moved to + the bottom-right corner of the text. By default, *move* is false. + + +.. function:: fill(flag) + + The complete specifications are rather complex, but the recommended usage is: + call ``fill(1)`` before drawing a path you want to fill, and call ``fill(0)`` + when you finish drawing the path. + + +.. function:: begin_fill() + + Switch turtle into filling mode. It must eventually be followed by a corresponding + ``end_fill()`` call; otherwise it will be ignored. + + .. versionadded:: 2.5 + + +.. function:: end_fill() + + End filling mode, and fill the shape; equivalent to ``fill(0)``. + + .. versionadded:: 2.5 + + +.. function:: circle(radius[, extent]) + + Draw a circle with radius *radius* whose center-point is *radius* units left of + the turtle. *extent* determines which part of a circle is drawn: if not given it + defaults to a full circle. + + If *extent* is not a full circle, one endpoint of the arc is the current pen + position. The arc is drawn in a counterclockwise direction if *radius* is + positive, otherwise in a clockwise direction. In the process, the direction of + the turtle is changed by the amount of the *extent*. + + +.. function:: goto(x, y) + goto((x, y)) + + Go to co-ordinates *x*, *y*. The co-ordinates may be specified either as two + separate arguments or as a 2-tuple. + + +.. function:: towards(x, y) + + Return the angle of the line from the turtle's position to the point *x*, *y*.
+ The co-ordinates may be specified either as two separate arguments, as a + 2-tuple, or as another pen object. + + .. versionadded:: 2.5 + + +.. function:: heading() + + Return the current orientation of the turtle. + + .. versionadded:: 2.3 + + +.. function:: setheading(angle) + + Set the orientation of the turtle to *angle*. + + .. versionadded:: 2.3 + + +.. function:: position() + + Return the current location of the turtle as an ``(x,y)`` pair. + + .. versionadded:: 2.3 + + +.. function:: setx(x) + + Set the x coordinate of the turtle to *x*. + + .. versionadded:: 2.3 + + +.. function:: sety(y) + + Set the y coordinate of the turtle to *y*. + + .. versionadded:: 2.3 + + +.. function:: window_width() + + Return the width of the canvas window. + + .. versionadded:: 2.3 + + +.. function:: window_height() + + Return the height of the canvas window. + + .. versionadded:: 2.3 + +This module also does ``from math import *``, so see the documentation for the +:mod:`math` module for additional constants and functions useful for turtle +graphics. + + +.. function:: demo() + + Exercise the module a bit. + + +.. exception:: Error + + Exception raised on any error caught by this module. + +For examples, see the code of the :func:`demo` function. + +This module defines the following classes: + + +.. class:: Pen() + + Define a pen. All of the functions above can be called as methods on the given pen. + The constructor automatically creates a canvas to be drawn on. + + +.. class:: Turtle() + + Define a pen. This is essentially a synonym for ``Pen()``; :class:`Turtle` is an + empty subclass of :class:`Pen`. + + +.. class:: RawPen(canvas) + + Define a pen which draws on a canvas *canvas*. This is useful if you want to + use the module to create graphics in a "real" program. + + +.. 
_pen-rawpen-objects: + +Turtle, Pen and RawPen Objects +------------------------------ + +Most of the global functions available in the module are also available as +methods of the :class:`Turtle`, :class:`Pen` and :class:`RawPen` classes, +affecting only the state of the given pen. + +The only method which is more powerful as a method is :func:`degrees`, which +takes an optional argument letting you specify the number of units +corresponding to a full circle: + + +.. method:: Turtle.degrees([fullcircle]) + + *fullcircle* is by default 360. This can cause the pen to have any angular units + whatever: give *fullcircle* 2\*π for radians, or 400 for gradians. + diff --git a/Doc/library/types.rst b/Doc/library/types.rst new file mode 100644 index 0000000..c636a73 --- /dev/null +++ b/Doc/library/types.rst @@ -0,0 +1,257 @@ + +:mod:`types` --- Names for built-in types +========================================= + +.. module:: types + :synopsis: Names for built-in types. + + +This module defines names for some object types that are used by the standard +Python interpreter, but not for the types defined by various extension modules. +Also, it does not include some of the types that arise during processing such as +the ``listiterator`` type. It is safe to use ``from types import *`` --- the +module does not export any names besides the ones listed here. New names +exported by future versions of this module will all end in ``Type``. + +Typical use is for functions that do different things depending on their +argument types, like the following:: + + from types import * + def delete(mylist, item): + if type(item) is IntType: + del mylist[item] + else: + mylist.remove(item) + +Starting in Python 2.2, built-in factory functions such as :func:`int` and +:func:`str` are also names for the corresponding types. This is now the +preferred way to access the type instead of using the :mod:`types` module.
+Accordingly, the example above should be written as follows:: + + def delete(mylist, item): + if isinstance(item, int): + del mylist[item] + else: + mylist.remove(item) + +The module defines the following names: + + +.. data:: NoneType + + The type of ``None``. + + +.. data:: TypeType + + .. index:: builtin: type + + The type of type objects (such as returned by :func:`type`); alias of the + built-in :class:`type`. + + +.. data:: BooleanType + + The type of the :class:`bool` values ``True`` and ``False``; alias of the + built-in :class:`bool`. + + .. versionadded:: 2.3 + + +.. data:: IntType + + The type of integers (e.g. ``1``); alias of the built-in :class:`int`. + + +.. data:: LongType + + The type of long integers (e.g. ``1L``); alias of the built-in :class:`long`. + + +.. data:: FloatType + + The type of floating point numbers (e.g. ``1.0``); alias of the built-in + :class:`float`. + + +.. data:: ComplexType + + The type of complex numbers (e.g. ``1.0j``). This is not defined if Python was + built without complex number support. + + +.. data:: StringType + + The type of character strings (e.g. ``'Spam'``); alias of the built-in + :class:`str`. + + +.. data:: UnicodeType + + The type of Unicode character strings (e.g. ``u'Spam'``). This is not defined + if Python was built without Unicode support. It's an alias of the built-in + :class:`unicode`. + + +.. data:: TupleType + + The type of tuples (e.g. ``(1, 2, 3, 'Spam')``); alias of the built-in + :class:`tuple`. + + +.. data:: ListType + + The type of lists (e.g. ``[0, 1, 2, 3]``); alias of the built-in + :class:`list`. + + +.. data:: DictType + + The type of dictionaries (e.g. ``{'Bacon': 1, 'Ham': 0}``); alias of the + built-in :class:`dict`. + + +.. data:: DictionaryType + + An alternate name for ``DictType``. + + +.. data:: FunctionType + + The type of user-defined functions and lambdas. + + +.. data:: LambdaType + + An alternate name for ``FunctionType``. + + +.. 
data:: GeneratorType + + The type of generator-iterator objects, produced by calling a generator + function. + + .. versionadded:: 2.2 + + +.. data:: CodeType + + .. index:: builtin: compile + + The type for code objects such as returned by :func:`compile`. + + +.. data:: ClassType + + The type of user-defined classes. + + +.. data:: MethodType + + The type of methods of user-defined class instances. + + +.. data:: UnboundMethodType + + An alternate name for ``MethodType``. + + +.. data:: BuiltinFunctionType + + The type of built-in functions like :func:`len` or :func:`sys.exit`. + + +.. data:: BuiltinMethodType + + An alternate name for ``BuiltinFunctionType``. + + +.. data:: ModuleType + + The type of modules. + + +.. data:: FileType + + The type of open file objects such as ``sys.stdout``; alias of the built-in + :class:`file`. + + +.. data:: RangeType + + .. index:: builtin: range + + The type of range objects returned by :func:`range`; alias of the built-in + :class:`range`. + + +.. data:: SliceType + + .. index:: builtin: slice + + The type of objects returned by :func:`slice`; alias of the built-in + :class:`slice`. + + +.. data:: EllipsisType + + The type of ``Ellipsis``. + + +.. data:: TracebackType + + The type of traceback objects such as found in ``sys.exc_info()[2]``. + + +.. data:: FrameType + + The type of frame objects such as found in ``tb.tb_frame`` if ``tb`` is a + traceback object. + + +.. data:: BufferType + + .. index:: builtin: buffer + + The type of buffer objects created by the :func:`buffer` function. + + +.. data:: DictProxyType + + The type of dict proxies, such as ``TypeType.__dict__``. + + +.. data:: NotImplementedType + + The type of ``NotImplemented``. + + +.. data:: GetSetDescriptorType + + The type of objects defined in extension modules with ``PyGetSetDef``, such as + ``FrameType.f_locals`` or ``array.array.typecode``.
This constant is not + defined in implementations of Python that do not have such extension types, so + for portable code use ``hasattr(types, 'GetSetDescriptorType')``. + + .. versionadded:: 2.5 + + +.. data:: MemberDescriptorType + + The type of objects defined in extension modules with ``PyMemberDef``, such as + ``datetime.timedelta.days``. This constant is not defined in implementations of + Python that do not have such extension types, so for portable code use + ``hasattr(types, 'MemberDescriptorType')``. + + .. versionadded:: 2.5 + + +.. data:: StringTypes + + A sequence containing ``StringType`` and ``UnicodeType`` used to facilitate + easier checking for any string object. Using this is more portable than using a + sequence of the two string types constructed elsewhere since it only contains + ``UnicodeType`` if it has been built in the running version of Python. For + example: ``isinstance(s, types.StringTypes)``. + + .. versionadded:: 2.2 diff --git a/Doc/library/undoc.rst b/Doc/library/undoc.rst new file mode 100644 index 0000000..ad46fc8 --- /dev/null +++ b/Doc/library/undoc.rst @@ -0,0 +1,186 @@ + +.. _undoc: + +******************** +Undocumented Modules +******************** + +Here's a quick listing of modules that are currently undocumented, but that +should be documented. Feel free to contribute documentation for them! (Send +via email to docs@python.org.) + +The idea and original contents for this chapter were taken from a posting by +Fredrik Lundh; the specific contents of this chapter have been substantially +revised. + + +Miscellaneous useful utilities +============================== + +Some of these are very old and/or not very robust; marked with "hmm." + +:mod:`bdb` + --- A generic Python debugger base class (used by pdb). + +:mod:`ihooks` + --- Import hook support (for :mod:`rexec`; may become obsolete). 
+ + +Platform specific modules +========================= + +These modules are used to implement the :mod:`os.path` module, and are not +documented beyond this mention. There's little need to document these. + +:mod:`ntpath` + --- Implementation of :mod:`os.path` on Win32, Win64, WinCE, and OS/2 platforms. + +:mod:`posixpath` + --- Implementation of :mod:`os.path` on POSIX. + + +Multimedia +========== + +:mod:`linuxaudiodev` + --- Play audio data on the Linux audio device. Replaced in Python 2.3 by the + :mod:`ossaudiodev` module. + +:mod:`sunaudio` + --- Interpret Sun audio headers (may become obsolete or a tool/demo). + + +.. _undoc-mac-modules: + +Undocumented Mac OS modules +=========================== + + +:mod:`applesingle` --- AppleSingle decoder +------------------------------------------ + +.. module:: applesingle + :platform: Mac + :synopsis: Rudimentary decoder for AppleSingle format files. + + + +:mod:`buildtools` --- Helper module for BuildApplet and Friends +--------------------------------------------------------------- + +.. module:: buildtools + :platform: Mac + :synopsis: Helper module for BuildApplet, BuildApplication and macfreeze. + + +.. deprecated:: 2.4 + + +:mod:`icopen` --- Internet Config replacement for :meth:`open` +-------------------------------------------------------------- + +.. module:: icopen + :platform: Mac + :synopsis: Internet Config replacement for open(). + + +Importing :mod:`icopen` will replace the builtin :meth:`open` with a version +that uses Internet Config to set file type and creator for new files. + + +:mod:`macerrors` --- Mac OS Errors +---------------------------------- + +.. module:: macerrors + :platform: Mac + :synopsis: Constant definitions for many Mac OS error codes. + + +:mod:`macerrors` contains constant definitions for many Mac OS error codes. + + +:mod:`macresource` --- Locate script resources +---------------------------------------------- + +.. 
module:: macresource + :platform: Mac + :synopsis: Locate script resources. + + +:mod:`macresource` helps scripts find their resources, such as dialogs and +menus, without requiring special-case code for when the script is run under +MacPython, as a MacPython applet or under OSX Python. + + +:mod:`Nav` --- NavServices calls +-------------------------------- + +.. module:: Nav + :platform: Mac + :synopsis: Interface to Navigation Services. + + +A low-level interface to Navigation Services. + + +:mod:`PixMapWrapper` --- Wrapper for PixMap objects +--------------------------------------------------- + +.. module:: PixMapWrapper + :platform: Mac + :synopsis: Wrapper for PixMap objects. + + +:mod:`PixMapWrapper` wraps a PixMap object with a Python object that allows +access to the fields by name. It also has methods to convert to and from +:mod:`PIL` images. + + +:mod:`videoreader` --- Read QuickTime movies +-------------------------------------------- + +.. module:: videoreader + :platform: Mac + :synopsis: Read QuickTime movies frame by frame for further processing. + + +:mod:`videoreader` reads and decodes QuickTime movies and passes a stream of +images to your program. It also provides some support for audio tracks. + + +:mod:`W` --- Widgets built on :mod:`FrameWork` +---------------------------------------------- + +.. module:: W + :platform: Mac + :synopsis: Widgets for the Mac, built on top of FrameWork. + + +The :mod:`W` widgets are used extensively in the :program:`IDE`. + + +.. _obsolete-modules: + +Obsolete +======== + +These modules are not normally available for import; additional work must be +done to make them available. + +These extension modules written in C are not built by default. Under Unix, these +must be enabled by uncommenting the appropriate lines in :file:`Modules/Setup` +in the build tree and either rebuilding Python if the modules are statically +linked, or building and installing the shared object if using dynamically-loaded +extensions.
+ +.. % %% lib-old is empty as of Python 2.5 +.. % Those which are written in Python will be installed into the directory +.. % \file{lib-old/} installed as part of the standard library. To use +.. % these, the directory must be added to \code{sys.path}, possibly using +.. % \envvar{PYTHONPATH}. + +.. % XXX need Windows instructions! + + + --- This section should be empty for Python 3.0. + diff --git a/Doc/library/unicodedata.rst b/Doc/library/unicodedata.rst new file mode 100644 index 0000000..017d4ee --- /dev/null +++ b/Doc/library/unicodedata.rst @@ -0,0 +1,165 @@ + +:mod:`unicodedata` --- Unicode Database +======================================= + +.. module:: unicodedata + :synopsis: Access the Unicode Database. +.. moduleauthor:: Marc-Andre Lemburg <mal@lemburg.com> +.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. index:: + single: Unicode + single: character + pair: Unicode; database + +This module provides access to the Unicode Character Database which defines +character properties for all Unicode characters. The data in this database is +based on the :file:`UnicodeData.txt` file version 4.1.0 which is publicly +available from ftp://ftp.unicode.org/. + +The module uses the same names and symbols as defined by the UnicodeData File +Format 4.1.0 (see http://www.unicode.org/Public/4.1.0/ucd/UCD.html). It defines +the following functions: + + +.. function:: lookup(name) + + Look up character by name. If a character with the given name is found, return + the corresponding Unicode character. If not found, :exc:`KeyError` is raised. + + +.. function:: name(unichr[, default]) + + Returns the name assigned to the Unicode character *unichr* as a string. If no + name is defined, *default* is returned, or, if not given, :exc:`ValueError` is + raised. + + +.. function:: decimal(unichr[, default]) + + Returns the decimal value assigned to the Unicode character *unichr* as integer. 
+ If no such value is defined, *default* is returned, or, if not given, + :exc:`ValueError` is raised. + + +.. function:: digit(unichr[, default]) + + Returns the digit value assigned to the Unicode character *unichr* as integer. + If no such value is defined, *default* is returned, or, if not given, + :exc:`ValueError` is raised. + + +.. function:: numeric(unichr[, default]) + + Returns the numeric value assigned to the Unicode character *unichr* as float. + If no such value is defined, *default* is returned, or, if not given, + :exc:`ValueError` is raised. + + +.. function:: category(unichr) + + Returns the general category assigned to the Unicode character *unichr* as + string. + + +.. function:: bidirectional(unichr) + + Returns the bidirectional category assigned to the Unicode character *unichr* as + string. If no such value is defined, an empty string is returned. + + +.. function:: combining(unichr) + + Returns the canonical combining class assigned to the Unicode character *unichr* + as integer. Returns ``0`` if no combining class is defined. + + +.. function:: east_asian_width(unichr) + + Returns the east asian width assigned to the Unicode character *unichr* as + string. + + .. versionadded:: 2.4 + + +.. function:: mirrored(unichr) + + Returns the mirrored property assigned to the Unicode character *unichr* as + integer. Returns ``1`` if the character has been identified as a "mirrored" + character in bidirectional text, ``0`` otherwise. + + +.. function:: decomposition(unichr) + + Returns the character decomposition mapping assigned to the Unicode character + *unichr* as string. An empty string is returned in case no such mapping is + defined. + + +.. function:: normalize(form, unistr) + + Return the normal form *form* for the Unicode string *unistr*. Valid values for + *form* are 'NFC', 'NFKC', 'NFD', and 'NFKD'. 
+ + The Unicode standard defines various normalization forms of a Unicode string, + based on the definition of canonical equivalence and compatibility equivalence. + In Unicode, several characters can be expressed in various ways. For example, the + character U+00C7 (LATIN CAPITAL LETTER C WITH CEDILLA) can also be expressed as + the sequence U+0043 (LATIN CAPITAL LETTER C) U+0327 (COMBINING CEDILLA). + + For each character, there are two normal forms: normal form C and normal form D. + Normal form D (NFD) is also known as canonical decomposition, and translates + each character into its decomposed form. Normal form C (NFC) first applies a + canonical decomposition, then composes pre-combined characters again. + + In addition to these two forms, there are two additional normal forms based on + compatibility equivalence. In Unicode, certain characters are supported which + normally would be unified with other characters. For example, U+2160 (ROMAN + NUMERAL ONE) is really the same thing as U+0049 (LATIN CAPITAL LETTER I). + However, it is supported in Unicode for compatibility with existing character + sets (e.g. gb2312). + + The normal form KD (NFKD) will apply the compatibility decomposition, i.e. + replace all compatibility characters with their equivalents. The normal form KC + (NFKC) first applies the compatibility decomposition, followed by the canonical + composition. + + .. versionadded:: 2.3 + +In addition, the module exposes the following constants: + + +.. data:: unidata_version + + The version of the Unicode database used in this module. + + .. versionadded:: 2.3 + + +.. data:: ucd_3_2_0 + + This is an object that has the same methods as the entire module, but uses the + Unicode database version 3.2 instead, for applications that require this + specific version of the Unicode database (such as IDNA). + + .. 
versionadded:: 2.5 + +Examples:: + + >>> unicodedata.lookup('LEFT CURLY BRACKET') + u'{' + >>> unicodedata.name(u'/') + 'SOLIDUS' + >>> unicodedata.decimal(u'9') + 9 + >>> unicodedata.decimal(u'a') + Traceback (most recent call last): + File "<stdin>", line 1, in ? + ValueError: not a decimal + >>> unicodedata.category(u'A') # 'L'etter, 'u'ppercase + 'Lu' + >>> unicodedata.bidirectional(u'\u0660') # 'A'rabic, 'N'umber + 'AN' + diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst new file mode 100644 index 0000000..3d3727f --- /dev/null +++ b/Doc/library/unittest.rst @@ -0,0 +1,936 @@ + +:mod:`unittest` --- Unit testing framework +========================================== + +.. module:: unittest + :synopsis: Unit testing framework for Python. +.. moduleauthor:: Steve Purcell <stephen_purcell@yahoo.com> +.. sectionauthor:: Steve Purcell <stephen_purcell@yahoo.com> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> +.. sectionauthor:: Raymond Hettinger <python@rcn.com> + + +.. versionadded:: 2.1 + +The Python unit testing framework, sometimes referred to as "PyUnit," is a +Python language version of JUnit, by Kent Beck and Erich Gamma. JUnit is, in +turn, a Java version of Kent's Smalltalk testing framework. Each is the de +facto standard unit testing framework for its respective language. + +:mod:`unittest` supports test automation, sharing of setup and shutdown code for +tests, aggregation of tests into collections, and independence of the tests from +the reporting framework. The :mod:`unittest` module provides classes that make +it easy to support these qualities for a set of tests. + +To achieve this, :mod:`unittest` supports some important concepts: + +test fixture + A :dfn:`test fixture` represents the preparation needed to perform one or more + tests, and any associated cleanup actions. This may involve, for example, + creating temporary or proxy databases, directories, or starting a server + process.
+ +test case + A :dfn:`test case` is the smallest unit of testing. It checks for a specific + response to a particular set of inputs. :mod:`unittest` provides a base class, + :class:`TestCase`, which may be used to create new test cases. + +test suite + A :dfn:`test suite` is a collection of test cases, test suites, or both. It is + used to aggregate tests that should be executed together. + +test runner + A :dfn:`test runner` is a component which orchestrates the execution of tests + and provides the outcome to the user. The runner may use a graphical interface, + a textual interface, or return a special value to indicate the results of + executing the tests. + +The test case and test fixture concepts are supported through the +:class:`TestCase` and :class:`FunctionTestCase` classes; the former should be +used when creating new tests, and the latter can be used when integrating +existing test code with a :mod:`unittest`\ -driven framework. When building test +fixtures using :class:`TestCase`, the :meth:`setUp` and :meth:`tearDown` methods +can be overridden to provide initialization and cleanup for the fixture. With +:class:`FunctionTestCase`, existing functions can be passed to the constructor +for these purposes. When the test is run, the fixture initialization is run +first; if it succeeds, the cleanup method is run after the test has been +executed, regardless of the outcome of the test. Each instance of the +:class:`TestCase` will only be used to run a single test method, so a new +fixture is created for each test. + +Test suites are implemented by the :class:`TestSuite` class. This class allows +individual tests and test suites to be aggregated; when the suite is executed, +all tests added directly to the suite and in "child" test suites are run. + +A test runner is an object that provides a single method, :meth:`run`, which +accepts a :class:`TestCase` or :class:`TestSuite` object as a parameter, and +returns a result object. 
The class :class:`TestResult` is provided for use as +the result object. :mod:`unittest` provides the :class:`TextTestRunner` as an +example test runner which reports test results on the standard error stream by +default. Alternate runners can be implemented for other environments (such as +graphical environments) without any need to derive from a specific class. + + +.. seealso:: + + Module :mod:`doctest` + Another test-support module with a very different flavor. + + `Simple Smalltalk Testing: With Patterns <http://www.XProgramming.com/testfram.htm>`_ + Kent Beck's original paper on testing frameworks using the pattern shared by + :mod:`unittest`. + + +.. _unittest-minimal-example: + +Basic example +------------- + +The :mod:`unittest` module provides a rich set of tools for constructing and +running tests. This section demonstrates that a small subset of the tools +suffices to meet the needs of most users. + +Here is a short script to test three functions from the :mod:`random` module:: + + import random + import unittest + + class TestSequenceFunctions(unittest.TestCase): + + def setUp(self): + # build a real list so that it can be shuffled and sorted in place + self.seq = list(range(10)) + + def testshuffle(self): + # make sure the shuffled sequence does not lose any elements + random.shuffle(self.seq) + self.seq.sort() + self.assertEqual(self.seq, list(range(10))) + + def testchoice(self): + element = random.choice(self.seq) + self.assert_(element in self.seq) + + def testsample(self): + self.assertRaises(ValueError, random.sample, self.seq, 20) + for element in random.sample(self.seq, 5): + self.assert_(element in self.seq) + + if __name__ == '__main__': + unittest.main() + +A test case is created by subclassing :class:`unittest.TestCase`. The three +individual tests are defined with methods whose names start with the letters +``test``. This naming convention informs the test runner about which methods +represent tests.
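The naming convention can be checked directly. The following is a minimal sketch, assuming a current Python 3 interpreter; the class ``NamingExample`` and its methods are hypothetical names introduced only for illustration. It asks a :class:`TestLoader` which method names it would treat as tests:

```python
import unittest


class NamingExample(unittest.TestCase):
    """Hypothetical test case used only to illustrate the naming rule."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def helper(self):
        # Not collected: the name does not start with "test".
        return 42

    def test_helper_value(self):
        # Helpers can still be called from real test methods.
        self.assertEqual(self.helper(), 42)


loader = unittest.TestLoader()
# getTestCaseNames returns the sorted names of the methods seen as tests.
names = loader.getTestCaseNames(NamingExample)
print(names)
```

Only the two methods whose names begin with ``test`` are reported; ``helper`` is ignored by the loader but remains available to the tests themselves.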
+ +The crux of each test is a call to :meth:`assertEqual` to check for an expected +result; :meth:`assert_` to verify a condition; or :meth:`assertRaises` to verify +that an expected exception gets raised. These methods are used instead of the +:keyword:`assert` statement so the test runner can accumulate all test results +and produce a report. + +When a :meth:`setUp` method is defined, the test runner will run that method +prior to each test. Likewise, if a :meth:`tearDown` method is defined, the test +runner will invoke that method after each test. In the example, :meth:`setUp` +was used to create a fresh sequence for each test. + +The final block shows a simple way to run the tests. :func:`unittest.main` +provides a command line interface to the test script. When run from the command +line, the above script produces an output that looks like this:: + + ... + ---------------------------------------------------------------------- + Ran 3 tests in 0.000s + + OK + +Instead of :func:`unittest.main`, there are other ways to run the tests with a +finer level of control, less terse output, and no requirement to be run from the +command line. For example, the last two lines may be replaced with:: + + suite = unittest.TestLoader().loadTestsFromTestCase(TestSequenceFunctions) + unittest.TextTestRunner(verbosity=2).run(suite) + +Running the revised script from the interpreter or another script produces the +following output:: + + testchoice (__main__.TestSequenceFunctions) ... ok + testsample (__main__.TestSequenceFunctions) ... ok + testshuffle (__main__.TestSequenceFunctions) ... ok + + ---------------------------------------------------------------------- + Ran 3 tests in 0.110s + + OK + +The above examples show the most commonly used :mod:`unittest` features which +are sufficient to meet many everyday testing needs. The remainder of the +documentation explores the full feature set from first principles. + + +.. 
_organizing-tests: + +Organizing test code +-------------------- + +The basic building blocks of unit testing are :dfn:`test cases` --- single +scenarios that must be set up and checked for correctness. In :mod:`unittest`, +test cases are represented by instances of :mod:`unittest`'s :class:`TestCase` +class. To make your own test cases you must write subclasses of +:class:`TestCase`, or use :class:`FunctionTestCase`. + +An instance of a :class:`TestCase`\ -derived class is an object that can +completely run a single test method, together with optional set-up and tidy-up +code. + +The testing code of a :class:`TestCase` instance should be entirely self +contained, such that it can be run either in isolation or in arbitrary +combination with any number of other test cases. + +The simplest :class:`TestCase` subclass just overrides the :meth:`runTest` +method in order to perform specific testing code:: + + import unittest + + class DefaultWidgetSizeTestCase(unittest.TestCase): + def runTest(self): + widget = Widget('The widget') + self.assertEqual(widget.size(), (50, 50), 'incorrect default size') + +Note that in order to test something, we use one of the :meth:`assert\*` or +:meth:`fail\*` methods provided by the :class:`TestCase` base class. If the +test fails, an exception will be raised, and :mod:`unittest` will identify the +test case as a :dfn:`failure`. Any other exceptions will be treated as +:dfn:`errors`. This helps you identify where the problem is: :dfn:`failures` are +caused by incorrect results - a 5 where you expected a 6. :dfn:`Errors` are +caused by incorrect code - e.g., a :exc:`TypeError` caused by an incorrect +function call. + +The way to run a test case will be described later. For now, note that to +construct an instance of such a test case, we call its constructor without +arguments:: + + testCase = DefaultWidgetSizeTestCase() + +Now, such test cases can be numerous, and their set-up can be repetitive.
In +the above case, constructing a :class:`Widget` in each of 100 Widget test case +subclasses would mean unsightly duplication. + +Luckily, we can factor out such set-up code by implementing a method called +:meth:`setUp`, which the testing framework will automatically call for us when +we run the test:: + + import unittest + + class SimpleWidgetTestCase(unittest.TestCase): + def setUp(self): + self.widget = Widget('The widget') + + class DefaultWidgetSizeTestCase(SimpleWidgetTestCase): + def runTest(self): + self.failUnless(self.widget.size() == (50,50), + 'incorrect default size') + + class WidgetResizeTestCase(SimpleWidgetTestCase): + def runTest(self): + self.widget.resize(100,150) + self.failUnless(self.widget.size() == (100,150), + 'wrong size after resize') + +If the :meth:`setUp` method raises an exception while the test is running, the +framework will consider the test to have suffered an error, and the +:meth:`runTest` method will not be executed. + +Similarly, we can provide a :meth:`tearDown` method that tidies up after the +:meth:`runTest` method has been run:: + + import unittest + + class SimpleWidgetTestCase(unittest.TestCase): + def setUp(self): + self.widget = Widget('The widget') + + def tearDown(self): + self.widget.dispose() + self.widget = None + +If :meth:`setUp` succeeded, the :meth:`tearDown` method will be run whether +:meth:`runTest` succeeded or not. + +Such a working environment for the testing code is called a :dfn:`fixture`. + +Often, many small test cases will use the same fixture. In this case, we would +end up subclassing :class:`SimpleWidgetTestCase` into many small one-method +classes such as :class:`DefaultWidgetSizeTestCase`. 
This is time-consuming and +discouraging, so in the same vein as JUnit, :mod:`unittest` provides a simpler +mechanism:: + + import unittest + + class WidgetTestCase(unittest.TestCase): + def setUp(self): + self.widget = Widget('The widget') + + def tearDown(self): + self.widget.dispose() + self.widget = None + + def testDefaultSize(self): + self.failUnless(self.widget.size() == (50,50), + 'incorrect default size') + + def testResize(self): + self.widget.resize(100,150) + self.failUnless(self.widget.size() == (100,150), + 'wrong size after resize') + +Here we have not provided a :meth:`runTest` method, but have instead provided +two different test methods. Class instances will now each run one of the +:meth:`test\*` methods, with ``self.widget`` created and destroyed separately +for each instance. When creating an instance we must specify the test method it +is to run. We do this by passing the method name in the constructor:: + + defaultSizeTestCase = WidgetTestCase('testDefaultSize') + resizeTestCase = WidgetTestCase('testResize') + +Test case instances are grouped together according to the features they test.
+:mod:`unittest` provides a mechanism for this: the :dfn:`test suite`, +represented by :mod:`unittest`'s :class:`TestSuite` class:: + + widgetTestSuite = unittest.TestSuite() + widgetTestSuite.addTest(WidgetTestCase('testDefaultSize')) + widgetTestSuite.addTest(WidgetTestCase('testResize')) + +For the ease of running tests, as we will see later, it is a good idea to +provide in each test module a callable object that returns a pre-built test +suite:: + + def suite(): + suite = unittest.TestSuite() + suite.addTest(WidgetTestCase('testDefaultSize')) + suite.addTest(WidgetTestCase('testResize')) + return suite + +or even:: + + def suite(): + tests = ['testDefaultSize', 'testResize'] + + return unittest.TestSuite(map(WidgetTestCase, tests)) + +Since it is a common pattern to create a :class:`TestCase` subclass with many +similarly named test functions, :mod:`unittest` provides a :class:`TestLoader` +class that can be used to automate the process of creating a test suite and +populating it with individual tests. For example, :: + + suite = unittest.TestLoader().loadTestsFromTestCase(WidgetTestCase) + +will create a test suite that will run ``WidgetTestCase.testDefaultSize()`` and +``WidgetTestCase.testResize``. :class:`TestLoader` uses the ``'test'`` method +name prefix to identify test methods automatically. + +Note that the order in which the various test cases will be run is determined by +sorting the test function names with the built-in :func:`cmp` function. + +Often it is desirable to group suites of test cases together, so as to run tests +for the whole system at once. 
This is easy, since :class:`TestSuite` instances +can be added to a :class:`TestSuite` just as :class:`TestCase` instances can be +added to a :class:`TestSuite`:: + + suite1 = module1.TheTestSuite() + suite2 = module2.TheTestSuite() + alltests = unittest.TestSuite([suite1, suite2]) + +You can place the definitions of test cases and test suites in the same modules +as the code they are to test (such as :file:`widget.py`), but there are several +advantages to placing the test code in a separate module, such as +:file:`test_widget.py`: + +* The test module can be run standalone from the command line. + +* The test code can more easily be separated from shipped code. + +* There is less temptation to change test code to fit the code it tests without + a good reason. + +* Test code should be modified much less frequently than the code it tests. + +* Tested code can be refactored more easily. + +* Tests for modules written in C must be in separate modules anyway, so why not + be consistent? + +* If the testing strategy changes, there is no need to change the source code. + + +.. _legacy-unit-tests: + +Re-using old test code +---------------------- + +Some users will find that they have existing test code that they would like to +run from :mod:`unittest`, without converting every old test function to a +:class:`TestCase` subclass. + +For this reason, :mod:`unittest` provides a :class:`FunctionTestCase` class. +This subclass of :class:`TestCase` can be used to wrap an existing test +function. Set-up and tear-down functions can also be provided. + +Given the following test function:: + + def testSomething(): + something = makeSomething() + assert something.name is not None + # ... 
+ +one can create an equivalent test case instance as follows:: + + testcase = unittest.FunctionTestCase(testSomething) + +If there are additional set-up and tear-down methods that should be called as +part of the test case's operation, they can also be provided like so:: + + testcase = unittest.FunctionTestCase(testSomething, + setUp=makeSomethingDB, + tearDown=deleteSomethingDB) + +To make migrating existing test suites easier, :mod:`unittest` supports tests +raising :exc:`AssertionError` to indicate test failure. However, it is +recommended that you use the explicit :meth:`TestCase.fail\*` and +:meth:`TestCase.assert\*` methods instead, as future versions of :mod:`unittest` +may treat :exc:`AssertionError` differently. + +.. note:: + + Even though :class:`FunctionTestCase` can be used to quickly convert an existing + test base over to a :mod:`unittest`\ -based system, this approach is not + recommended. Taking the time to set up proper :class:`TestCase` subclasses will + make future test refactorings infinitely easier. + + +.. _unittest-contents: + +Classes and functions +--------------------- + + +.. class:: TestCase([methodName]) + + Instances of the :class:`TestCase` class represent the smallest testable units + in the :mod:`unittest` universe. This class is intended to be used as a base + class, with specific tests being implemented by concrete subclasses. This class + implements the interface needed by the test runner to allow it to drive the + test, and methods that the test code can use to check for and report various + kinds of failure. + + Each instance of :class:`TestCase` will run a single test method: the method + named *methodName*. 
Recall the earlier example, which went + something like this:: + + def suite(): + suite = unittest.TestSuite() + suite.addTest(WidgetTestCase('testDefaultSize')) + suite.addTest(WidgetTestCase('testResize')) + return suite + + Here, we create two instances of :class:`WidgetTestCase`, each of which runs a + single test. + + *methodName* defaults to ``'runTest'``. + + +.. class:: FunctionTestCase(testFunc[, setUp[, tearDown[, description]]]) + + This class implements the portion of the :class:`TestCase` interface which + allows the test runner to drive the test, but does not provide the methods which + test code can use to check and report errors. This is used to create test cases + using legacy test code, allowing it to be integrated into a :mod:`unittest`\ + -based test framework. + + +.. class:: TestSuite([tests]) + + This class represents an aggregation of individual test cases and test suites. + The class presents the interface needed by the test runner to allow it to be run + like any other test case. Running a :class:`TestSuite` instance is the same as + iterating over the suite, running each test individually. + + If *tests* is given, it must be an iterable of individual test cases or other + test suites that will be used to build the suite initially. Additional methods + are provided to add test cases and suites to the collection later on. + + +.. class:: TestLoader() + + This class is responsible for loading tests according to various criteria and + returning them wrapped in a :class:`TestSuite`. It can load all tests within a + given module or :class:`TestCase` subclass. + + +.. class:: TestResult() + + This class is used to compile information about which tests have succeeded and + which have failed. + + +.. data:: defaultTestLoader + + Instance of the :class:`TestLoader` class intended to be shared. If no + customization of the :class:`TestLoader` is needed, this instance can be used + instead of repeatedly creating new instances. + + +..
class:: TextTestRunner([stream[, descriptions[, verbosity]]]) + + A basic test runner implementation which prints results on standard error. It + has a few configurable parameters, but is essentially very simple. Graphical + applications which run test suites should provide alternate implementations. + + +.. function:: main([module[, defaultTest[, argv[, testRunner[, testLoader]]]]]) + + A command-line program that runs a set of tests; this is primarily for making + test modules conveniently executable. The simplest use for this function is to + include the following line at the end of a test script:: + + if __name__ == '__main__': + unittest.main() + + The *testRunner* argument can either be a test runner class or an already + created instance of it. + +In some cases, the existing tests may have been written using the :mod:`doctest` +module. If so, that module provides a :class:`DocTestSuite` class that can +automatically build :class:`unittest.TestSuite` instances from the existing +:mod:`doctest`\ -based tests. + +.. versionadded:: 2.3 + + +.. _testcase-objects: + +TestCase Objects +---------------- + +Each :class:`TestCase` instance represents a single test, but each concrete +subclass may be used to define multiple tests --- the concrete class represents +a single test fixture. The fixture is created and cleaned up for each test +case. + +:class:`TestCase` instances provide three groups of methods: one group used to +run the test, another used by the test implementation to check conditions and +report failures, and some inquiry methods allowing information about the test +itself to be gathered. + +Methods in the first group (running the test) are: + + +.. method:: TestCase.setUp() + + Method called to prepare the test fixture. This is called immediately before + calling the test method; any exception raised by this method will be considered + an error rather than a test failure. The default implementation does nothing. + + +.. 
method:: TestCase.tearDown() + + Method called immediately after the test method has been called and the result + recorded. This is called even if the test method raised an exception, so the + implementation in subclasses may need to be particularly careful about checking + internal state. Any exception raised by this method will be considered an error + rather than a test failure. This method will only be called if the + :meth:`setUp` succeeds, regardless of the outcome of the test method. The + default implementation does nothing. + + +.. method:: TestCase.run([result]) + + Run the test, collecting the result into the test result object passed as + *result*. If *result* is omitted or :const:`None`, a temporary result object is + created (by calling the :meth:`defaultTestResult` method) and used; this result + object is not returned to :meth:`run`'s caller. + + The same effect may be had by simply calling the :class:`TestCase` instance. + + +.. method:: TestCase.debug() + + Run the test without collecting the result. This allows exceptions raised by + the test to be propagated to the caller, and can be used to support running + tests under a debugger. + +The test code can use any of the following methods to check for and report +failures. + + +.. method:: TestCase.assert_(expr[, msg]) + TestCase.failUnless(expr[, msg]) + + Signal a test failure if *expr* is false; the explanation for the error will be + *msg* if given, otherwise it will be :const:`None`. + + +.. method:: TestCase.assertEqual(first, second[, msg]) + TestCase.failUnlessEqual(first, second[, msg]) + + Test that *first* and *second* are equal. If the values do not compare equal, + the test will fail with the explanation given by *msg*, or :const:`None`. Note + that using :meth:`failUnlessEqual` improves upon doing the comparison as the + first parameter to :meth:`failUnless`: the default value for *msg* can be + computed to include representations of both *first* and *second*. + + +..
method:: TestCase.assertNotEqual(first, second[, msg]) + TestCase.failIfEqual(first, second[, msg]) + + Test that *first* and *second* are not equal. If the values do compare equal, + the test will fail with the explanation given by *msg*, or :const:`None`. Note + that using :meth:`failIfEqual` improves upon doing the comparison as the first + parameter to :meth:`failUnless`: the default value for *msg* can be + computed to include representations of both *first* and *second*. + + +.. method:: TestCase.assertAlmostEqual(first, second[, places[, msg]]) + TestCase.failUnlessAlmostEqual(first, second[, places[, msg]]) + + Test that *first* and *second* are approximately equal by computing the + difference, rounding to the given number of *places*, and comparing to zero. + Note that comparing a given number of decimal places is not the same as + comparing a given number of significant digits. If the values do not compare + equal, the test will fail with the explanation given by *msg*, or :const:`None`. + + +.. method:: TestCase.assertNotAlmostEqual(first, second[, places[, msg]]) + TestCase.failIfAlmostEqual(first, second[, places[, msg]]) + + Test that *first* and *second* are not approximately equal by computing the + difference, rounding to the given number of *places*, and comparing to zero. + Note that comparing a given number of decimal places is not the same as + comparing a given number of significant digits. If the values do compare + approximately equal, the test will fail with the explanation given by *msg*, or + :const:`None`. + + +.. method:: TestCase.assertRaises(exception, callable, ...) + TestCase.failUnlessRaises(exception, callable, ...) + + Test that an exception is raised when *callable* is called with any positional + or keyword arguments that are also passed to :meth:`assertRaises`. The test + passes if *exception* is raised, is an error if another exception is raised, or + fails if no exception is raised.
To catch any of a group of exceptions, a tuple + containing the exception classes may be passed as *exception*. + + +.. method:: TestCase.failIf(expr[, msg]) + + The inverse of the :meth:`failUnless` method is the :meth:`failIf` method. This + signals a test failure if *expr* is true, with *msg* or :const:`None` for the + error message. + + +.. method:: TestCase.fail([msg]) + + Signals a test failure unconditionally, with *msg* or :const:`None` for the + error message. + + +.. attribute:: TestCase.failureException + + This class attribute gives the exception raised by the :meth:`test` method. If + a test framework needs to use a specialized exception, possibly to carry + additional information, it must subclass this exception in order to "play fair" + with the framework. The initial value of this attribute is + :exc:`AssertionError`. + +Testing frameworks can use the following methods to collect information on the +test: + + +.. method:: TestCase.countTestCases() + + Return the number of tests represented by this test object. For + :class:`TestCase` instances, this will always be ``1``. + + +.. method:: TestCase.defaultTestResult() + + Return an instance of the test result class that should be used for this test + case class (if no other result instance is provided to the :meth:`run` method). + + For :class:`TestCase` instances, this will always be an instance of + :class:`TestResult`; subclasses of :class:`TestCase` should override this as + necessary. + + +.. method:: TestCase.id() + + Return a string identifying the specific test case. This is usually the full + name of the test method, including the module and class name. + + +.. method:: TestCase.shortDescription() + + Returns a one-line description of the test, or :const:`None` if no description + has been provided. The default implementation of this method returns the first + line of the test method's docstring, if available, or :const:`None`. + + +.. 
_testsuite-objects: + +TestSuite Objects +----------------- + +:class:`TestSuite` objects behave much like :class:`TestCase` objects, except +they do not actually implement a test. Instead, they are used to aggregate +tests into groups of tests that should be run together. Some additional methods +are available to add tests to :class:`TestSuite` instances: + + +.. method:: TestSuite.addTest(test) + + Add a :class:`TestCase` or :class:`TestSuite` to the suite. + + +.. method:: TestSuite.addTests(tests) + + Add all the tests from an iterable of :class:`TestCase` and :class:`TestSuite` + instances to this test suite. + + This is equivalent to iterating over *tests*, calling :meth:`addTest` for each + element. + +:class:`TestSuite` shares the following methods with :class:`TestCase`: + + +.. method:: TestSuite.run(result) + + Run the tests associated with this suite, collecting the result into the test + result object passed as *result*. Note that unlike :meth:`TestCase.run`, + :meth:`TestSuite.run` requires the result object to be passed in. + + +.. method:: TestSuite.debug() + + Run the tests associated with this suite without collecting the result. This + allows exceptions raised by the test to be propagated to the caller and can be + used to support running tests under a debugger. + + +.. method:: TestSuite.countTestCases() + + Return the number of tests represented by this test object, including all + individual tests and sub-suites. + +In the typical usage of a :class:`TestSuite` object, the :meth:`run` method is +invoked by a :class:`TestRunner` rather than by the end-user test harness. + + +.. _testresult-objects: + +TestResult Objects +------------------ + +A :class:`TestResult` object stores the results of a set of tests. The +:class:`TestCase` and :class:`TestSuite` classes ensure that results are +properly recorded; test authors do not need to worry about recording the outcome +of tests. 
+ +Testing frameworks built on top of :mod:`unittest` may want access to the +:class:`TestResult` object generated by running a set of tests for reporting +purposes; a :class:`TestResult` instance is returned by the +:meth:`TestRunner.run` method for this purpose. + +:class:`TestResult` instances have the following attributes that will be of +interest when inspecting the results of running a set of tests: + + +.. attribute:: TestResult.errors + + A list containing 2-tuples of :class:`TestCase` instances and strings holding + formatted tracebacks. Each tuple represents a test which raised an unexpected + exception. + + .. versionchanged:: 2.2 + Contains formatted tracebacks instead of :func:`sys.exc_info` results. + + +.. attribute:: TestResult.failures + + A list containing 2-tuples of :class:`TestCase` instances and strings holding + formatted tracebacks. Each tuple represents a test where a failure was + explicitly signalled using the :meth:`TestCase.fail\*` or + :meth:`TestCase.assert\*` methods. + + .. versionchanged:: 2.2 + Contains formatted tracebacks instead of :func:`sys.exc_info` results. + + +.. attribute:: TestResult.testsRun + + The total number of tests run so far. + + +.. method:: TestResult.wasSuccessful() + + Returns :const:`True` if all tests run so far have passed, otherwise returns + :const:`False`. + + +.. method:: TestResult.stop() + + This method can be called to signal that the set of tests being run should be + aborted by setting the :class:`TestResult`'s ``shouldStop`` attribute to + :const:`True`. :class:`TestRunner` objects should respect this flag and return + without running any additional tests. + + For example, this feature is used by the :class:`TextTestRunner` class to stop + the test framework when the user signals an interrupt from the keyboard. + Interactive tools which provide :class:`TestRunner` implementations can use this + in a similar manner. 
+ +The following methods of the :class:`TestResult` class are used to maintain the +internal data structures, and may be extended in subclasses to support +additional reporting requirements. This is particularly useful in building +tools which support interactive reporting while tests are being run. + + +.. method:: TestResult.startTest(test) + + Called when the test case *test* is about to be run. + + The default implementation simply increments the instance's ``testsRun`` + counter. + + +.. method:: TestResult.stopTest(test) + + Called after the test case *test* has been executed, regardless of the outcome. + + The default implementation does nothing. + + +.. method:: TestResult.addError(test, err) + + Called when the test case *test* raises an unexpected exception. *err* is a tuple + of the form returned by :func:`sys.exc_info`: ``(type, value, traceback)``. + + The default implementation appends a tuple ``(test, formatted_err)`` to the + instance's ``errors`` attribute, where *formatted_err* is a formatted traceback + derived from *err*. + + +.. method:: TestResult.addFailure(test, err) + + Called when the test case *test* signals a failure. *err* is a tuple of the form + returned by :func:`sys.exc_info`: ``(type, value, traceback)``. + + The default implementation appends a tuple ``(test, formatted_err)`` to the + instance's ``failures`` attribute, where *formatted_err* is a formatted + traceback derived from *err*. + + +.. method:: TestResult.addSuccess(test) + + Called when the test case *test* succeeds. + + The default implementation does nothing. + + +.. _testloader-objects: + +TestLoader Objects +------------------ + +The :class:`TestLoader` class is used to create test suites from classes and +modules. Normally, there is no need to create an instance of this class; the +:mod:`unittest` module provides an instance that can be shared as +``unittest.defaultTestLoader``. Using a subclass or instance, however, allows +customization of some configurable properties. + +:class:`TestLoader` objects have the following methods: + + +..
method:: TestLoader.loadTestsFromTestCase(testCaseClass) + + Return a suite of all test cases contained in the :class:`TestCase`\ -derived + *testCaseClass*. + + +.. method:: TestLoader.loadTestsFromModule(module) + + Return a suite of all test cases contained in the given module. This method + searches *module* for classes derived from :class:`TestCase` and creates an + instance of the class for each test method defined for the class. + + .. warning:: + + While using a hierarchy of :class:`TestCase`\ -derived classes can be convenient + in sharing fixtures and helper functions, defining test methods on base classes + that are not intended to be instantiated directly does not play well with this + method. Doing so, however, can be useful when the fixtures are different and + defined in subclasses. + + +.. method:: TestLoader.loadTestsFromName(name[, module]) + + Return a suite of all test cases given a string specifier. + + The specifier *name* is a "dotted name" that may resolve either to a module, a + test case class, a test method within a test case class, a :class:`TestSuite` + instance, or a callable object which returns a :class:`TestCase` or + :class:`TestSuite` instance. These checks are applied in the order listed here; + that is, a method on a possible test case class will be picked up as "a test + method within a test case class", rather than "a callable object". + + For example, if you have a module :mod:`SampleTests` containing a + :class:`TestCase`\ -derived class :class:`SampleTestCase` with three test + methods (:meth:`test_one`, :meth:`test_two`, and :meth:`test_three`), the + specifier ``'SampleTests.SampleTestCase'`` would cause this method to return a + suite which will run all three test methods. Using the specifier + ``'SampleTests.SampleTestCase.test_two'`` would cause it to return a test suite + which will run only the :meth:`test_two` test method.
The specifier can refer + to modules and packages which have not been imported; they will be imported as a + side-effect. + + The method optionally resolves *name* relative to the given *module*. + + +.. method:: TestLoader.loadTestsFromNames(names[, module]) + + Similar to :meth:`loadTestsFromName`, but takes a sequence of names rather than + a single name. The return value is a test suite which supports all the tests + defined for each name. + + +.. method:: TestLoader.getTestCaseNames(testCaseClass) + + Return a sorted sequence of method names found within *testCaseClass*; this + should be a subclass of :class:`TestCase`. + +The following attributes of a :class:`TestLoader` can be configured either by +subclassing or assignment on an instance: + + +.. attribute:: TestLoader.testMethodPrefix + + String giving the prefix of method names which will be interpreted as test + methods. The default value is ``'test'``. + + This affects :meth:`getTestCaseNames` and all the :meth:`loadTestsFrom\*` + methods. + + +.. attribute:: TestLoader.sortTestMethodsUsing + + Function to be used to compare method names when sorting them in + :meth:`getTestCaseNames` and all the :meth:`loadTestsFrom\*` methods. The + default value is the built-in :func:`cmp` function; the attribute can also be + set to :const:`None` to disable the sort. + + +.. attribute:: TestLoader.suiteClass + + Callable object that constructs a test suite from a list of tests. No methods on + the resulting object are needed. The default value is the :class:`TestSuite` + class. + + This affects all the :meth:`loadTestsFrom\*` methods. + diff --git a/Doc/library/unix.rst b/Doc/library/unix.rst new file mode 100644 index 0000000..b60af0f --- /dev/null +++ b/Doc/library/unix.rst @@ -0,0 +1,29 @@ + +.. 
_unix: + +********************** +Unix Specific Services +********************** + +The modules described in this chapter provide interfaces to features that are +unique to the Unix operating system, or in some cases to some or many variants +of it. Here's an overview: + + +.. toctree:: + + posix.rst + pwd.rst + spwd.rst + grp.rst + crypt.rst + dl.rst + termios.rst + tty.rst + pty.rst + fcntl.rst + pipes.rst + resource.rst + nis.rst + syslog.rst + commands.rst diff --git a/Doc/library/urllib.rst b/Doc/library/urllib.rst new file mode 100644 index 0000000..ef8264f --- /dev/null +++ b/Doc/library/urllib.rst @@ -0,0 +1,471 @@ + +:mod:`urllib` --- Open arbitrary resources by URL +================================================= + +.. module:: urllib + :synopsis: Open an arbitrary network resource by URL (requires sockets). + + +.. index:: + single: WWW + single: World Wide Web + single: URL + +This module provides a high-level interface for fetching data across the World +Wide Web. In particular, the :func:`urlopen` function is similar to the +built-in function :func:`open`, but accepts Universal Resource Locators (URLs) +instead of filenames. Some restrictions apply --- it can only open URLs for +reading, and no seek operations are available. + +It defines the following public functions: + + +.. function:: urlopen(url[, data[, proxies]]) + + Open a network object denoted by a URL for reading. If the URL does not have a + scheme identifier, or if it has :file:`file:` as its scheme identifier, this + opens a local file (without universal newlines); otherwise it opens a socket to + a server somewhere on the network. If the connection cannot be made the + :exc:`IOError` exception is raised. If all went well, a file-like object is + returned. This supports the following methods: :meth:`read`, :meth:`readline`, + :meth:`readlines`, :meth:`fileno`, :meth:`close`, :meth:`info` and + :meth:`geturl`. It also has proper support for the iterator protocol. 
One + caveat: the :meth:`read` method, if the size argument is omitted or negative, + may not read until the end of the data stream; there is no good way to determine + that the entire stream from a socket has been read in the general case. + + Except for the :meth:`info` and :meth:`geturl` methods, these methods have the + same interface as for file objects --- see section :ref:`bltin-file-objects` in + this manual. (It is not a built-in file object, however, so it can't be used at + those few places where a true built-in file object is required.) + + .. index:: module: mimetools + + The :meth:`info` method returns an instance of the class + :class:`mimetools.Message` containing meta-information associated with the + URL. When the method is HTTP, these headers are those returned by the server + at the head of the retrieved HTML page (including Content-Length and + Content-Type). When the method is FTP, a Content-Length header will be + present if (as is now usual) the server passed back a file length in response + to the FTP retrieval request. A Content-Type header will be present if the + MIME type can be guessed. When the method is local-file, returned headers + will include a Date representing the file's last-modified time, a + Content-Length giving file size, and a Content-Type containing a guess at the + file's type. See also the description of the :mod:`mimetools` module. + + The :meth:`geturl` method returns the real URL of the page. In some cases, the + HTTP server redirects a client to another URL. The :func:`urlopen` function + handles this transparently, but in some cases the caller needs to know which URL + the client was redirected to. The :meth:`geturl` method can be used to get at + this redirected URL. + + If the *url* uses the :file:`http:` scheme identifier, the optional *data* + argument may be given to specify a ``POST`` request (normally the request type + is ``GET``). 
The *data* argument must be in standard + :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode` + function below. + + The :func:`urlopen` function works transparently with proxies which do not + require authentication. In a Unix or Windows environment, set the + :envvar:`http_proxy`, or :envvar:`ftp_proxy` environment variables to a URL that + identifies the proxy server before starting the Python interpreter. For example + (the ``'%'`` is the command prompt):: + + % http_proxy="http://www.someproxy.com:3128" + % export http_proxy + % python + ... + + In a Windows environment, if no proxy environment variables are set, proxy + settings are obtained from the registry's Internet Settings section. + + .. index:: single: Internet Config + + In a Macintosh environment, :func:`urlopen` will retrieve proxy information from + Internet Config. + + Alternatively, the optional *proxies* argument may be used to explicitly specify + proxies. It must be a dictionary mapping scheme names to proxy URLs, where an + empty dictionary causes no proxies to be used, and ``None`` (the default value) + causes environmental proxy settings to be used as discussed above. For + example:: + + # Use http://www.someproxy.com:3128 for http proxying + proxies = {'http': 'http://www.someproxy.com:3128'} + filehandle = urllib.urlopen(some_url, proxies=proxies) + # Don't use any proxies + filehandle = urllib.urlopen(some_url, proxies={}) + # Use proxies from environment - both versions are equivalent + filehandle = urllib.urlopen(some_url, proxies=None) + filehandle = urllib.urlopen(some_url) + + The :func:`urlopen` function does not support explicit proxy specification. If + you need to override environmental proxy settings, use :class:`URLopener`, or a + subclass such as :class:`FancyURLopener`. + + Proxies which require authentication for use are not currently supported; this + is considered an implementation limitation. + + .. 
versionchanged:: 2.3 + Added the *proxies* support. + + +.. function:: urlretrieve(url[, filename[, reporthook[, data]]]) + + Copy a network object denoted by a URL to a local file, if necessary. If the URL + points to a local file, or a valid cached copy of the object exists, the object + is not copied. Return a tuple ``(filename, headers)`` where *filename* is the + local file name under which the object can be found, and *headers* is whatever + the :meth:`info` method of the object returned by :func:`urlopen` returned (for + a remote object, possibly cached). Exceptions are the same as for + :func:`urlopen`. + + The second argument, if present, specifies the file location to copy to (if + absent, the location will be a tempfile with a generated name). The third + argument, if present, is a hook function that will be called once on + establishment of the network connection and once after each block read + thereafter. The hook will be passed three arguments: a count of blocks + transferred so far, a block size in bytes, and the total size of the file. The + third argument may be ``-1`` on older FTP servers which do not return a file + size in response to a retrieval request. + + If the *url* uses the :file:`http:` scheme identifier, the optional *data* + argument may be given to specify a ``POST`` request (normally the request type + is ``GET``). The *data* argument must be in standard + :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode` + function below. + + .. versionchanged:: 2.5 + :func:`urlretrieve` will raise :exc:`ContentTooShortError` when it detects that + the amount of data available was less than the expected amount (which is the + size reported by a *Content-Length* header). This can occur, for example, when + the download is interrupted. + + The *Content-Length* is treated as a lower bound: if there's more data to read, + urlretrieve reads more data, but if less data is available, it raises the + exception.
+ + You can still retrieve the downloaded data in this case; it is stored in the + :attr:`content` attribute of the exception instance. + + If no *Content-Length* header was supplied, urlretrieve cannot check the size + of the data it has downloaded, and just returns it. In this case you just have + to assume that the download was successful. + + +.. data:: _urlopener + + The public functions :func:`urlopen` and :func:`urlretrieve` create an instance + of the :class:`FancyURLopener` class and use it to perform their requested + actions. To override this functionality, programmers can create a subclass of + :class:`URLopener` or :class:`FancyURLopener`, then assign an instance of that + class to the ``urllib._urlopener`` variable before calling the desired function. + For example, applications may want to specify a different + :mailheader:`User-Agent` header than :class:`URLopener` defines. This can be + accomplished with the following code:: + + import urllib + + class AppURLopener(urllib.FancyURLopener): + version = "App/1.7" + + urllib._urlopener = AppURLopener() + + +.. function:: urlcleanup() + + Clear the cache that may have been built up by previous calls to + :func:`urlretrieve`. + + +.. function:: quote(string[, safe]) + + Replace special characters in *string* using the ``%xx`` escape. Letters, + digits, and the characters ``'_.-'`` are never quoted. The optional *safe* + parameter specifies additional characters that should not be quoted --- its + default value is ``'/'``. + + Example: ``quote('/~connolly/')`` yields ``'/%7econnolly/'``. + + +.. function:: quote_plus(string[, safe]) + + Like :func:`quote`, but also replaces spaces by plus signs, as required for + quoting HTML form values. Plus signs in the original string are escaped unless + they are included in *safe*. It also does not have *safe* default to ``'/'``. + + +.. function:: unquote(string) + + Replace ``%xx`` escapes by their single-character equivalent.
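The quoting functions described above can be sketched as follows. This is only an illustration under one assumption: it targets a current Python 3 interpreter, where these functions are exposed from ``urllib.parse`` rather than directly from ``urllib`` as documented here. The sample string is made up.

```python
# Assumption: Python 3, where quote(), quote_plus() and unquote() live in
# urllib.parse instead of the top-level urllib module described in this text.
from urllib.parse import quote, quote_plus, unquote

sample = "/my docs/a&b"  # illustrative input only

# '/' is in the default *safe* set of quote(), so it is left alone;
# the space and the ampersand become %xx escapes.
quoted = quote(sample)

# quote_plus() turns spaces into '+' and does not treat '/' as safe,
# so the slashes are escaped as well.
quoted_plus = quote_plus(sample)

# unquote() reverses the %xx escapes produced by quote().
round_trip = unquote(quoted)

print(quoted)
print(quoted_plus)
print(round_trip)
```

The round trip through ``quote`` and ``unquote`` returns the original string, which is a quick way to check that no character was lost.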
+ + Example: ``unquote('/%7Econnolly/')`` yields ``'/~connolly/'``. + + +.. function:: unquote_plus(string) + + Like :func:`unquote`, but also replaces plus signs by spaces, as required for + unquoting HTML form values. + + +.. function:: urlencode(query[, doseq]) + + Convert a mapping object or a sequence of two-element tuples to a "url-encoded" + string, suitable to pass to :func:`urlopen` above as the optional *data* + argument. This is useful to pass a dictionary of form fields to a ``POST`` + request. The resulting string is a series of ``key=value`` pairs separated by + ``'&'`` characters, where both *key* and *value* are quoted using + :func:`quote_plus` above. If the optional parameter *doseq* is present and + evaluates to true, individual ``key=value`` pairs are generated for each element + of the sequence. When a sequence of two-element tuples is used as the *query* + argument, the first element of each tuple is a key and the second is a value. + The order of parameters in the encoded string will match the order of parameter + tuples in the sequence. The :mod:`cgi` module provides the functions + :func:`parse_qs` and :func:`parse_qsl` which are used to parse query strings + into Python data structures. + + +.. function:: pathname2url(path) + + Convert the pathname *path* from the local syntax for a path to the form used in + the path component of a URL. This does not produce a complete URL. The return + value will already be quoted using the :func:`quote` function. + + +.. function:: url2pathname(path) + + Convert the path component *path* from an encoded URL to the local syntax for a + path. This does not accept a complete URL. This function uses :func:`unquote` + to decode *path*. + + +.. class:: URLopener([proxies[, **x509]]) + + Base class for opening and reading URLs. Unless you need to support opening + objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`, + you probably want to use :class:`FancyURLopener`. 
+ + By default, the :class:`URLopener` class sends a :mailheader:`User-Agent` header + of ``urllib/VVV``, where *VVV* is the :mod:`urllib` version number. + Applications can define their own :mailheader:`User-Agent` header by subclassing + :class:`URLopener` or :class:`FancyURLopener` and setting the class attribute + :attr:`version` to an appropriate string value in the subclass definition. + + The optional *proxies* parameter should be a dictionary mapping scheme names to + proxy URLs, where an empty dictionary turns proxies off completely. Its default + value is ``None``, in which case environmental proxy settings will be used if + present, as discussed in the definition of :func:`urlopen`, above. + + Additional keyword parameters, collected in *x509*, may be used for + authentication of the client when using the :file:`https:` scheme. The keywords + *key_file* and *cert_file* are supported to provide an SSL key and certificate; + both are needed to support client authentication. + + :class:`URLopener` objects will raise an :exc:`IOError` exception if the server + returns an error code. + + +.. class:: FancyURLopener(...) + + :class:`FancyURLopener` subclasses :class:`URLopener` providing default handling + for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x + response codes listed above, the :mailheader:`Location` header is used to fetch + the actual URL. For 401 response codes (authentication required), basic HTTP + authentication is performed. For the 30x response codes, recursion is bounded + by the value of the *maxtries* attribute, which defaults to 10. + + For all other response codes, the method :meth:`http_error_default` is called + which you can override in subclasses to handle the error appropriately. + + .. note:: + + According to the letter of :rfc:`2616`, 301 and 302 responses to POST requests + must not be automatically redirected without confirmation by the user. 
In + reality, browsers do allow automatic redirection of these responses, changing + the POST to a GET, and :mod:`urllib` reproduces this behaviour. + + The parameters to the constructor are the same as those for :class:`URLopener`. + + .. note:: + + When performing basic authentication, a :class:`FancyURLopener` instance calls + its :meth:`prompt_user_passwd` method. The default implementation asks the + users for the required information on the controlling terminal. A subclass may + override this method to support more appropriate behavior if needed. + + +.. exception:: ContentTooShortError(msg[, content]) + + This exception is raised when the :func:`urlretrieve` function detects that the + amount of the downloaded data is less than the expected amount (given by the + *Content-Length* header). The :attr:`content` attribute stores the downloaded + (and supposedly truncated) data. + + .. versionadded:: 2.5 + +Restrictions: + + .. index:: + pair: HTTP; protocol + pair: FTP; protocol + +* Currently, only the following protocols are supported: HTTP, (versions 0.9 and + 1.0), FTP, and local files. + +* The caching feature of :func:`urlretrieve` has been disabled until I find the + time to hack proper processing of Expiration time headers. + +* There should be a function to query whether a particular URL is in the cache. + +* For backward compatibility, if a URL appears to point to a local file but the + file can't be opened, the URL is re-interpreted using the FTP protocol. This + can sometimes cause confusing error messages. + +* The :func:`urlopen` and :func:`urlretrieve` functions can cause arbitrarily + long delays while waiting for a network connection to be set up. This means + that it is difficult to build an interactive Web client using these functions + without using threads. + + .. index:: + single: HTML + pair: HTTP; protocol + module: htmllib + +* The data returned by :func:`urlopen` or :func:`urlretrieve` is the raw data + returned by the server. 
This may be binary data (such as an image), plain text + or (for example) HTML. The HTTP protocol provides type information in the reply + header, which can be inspected by looking at the :mailheader:`Content-Type` + header. If the returned data is HTML, you can use the module :mod:`htmllib` to + parse it. + + .. index:: single: FTP + +* The code handling the FTP protocol cannot differentiate between a file and a + directory. This can lead to unexpected behavior when attempting to read a URL + that points to a file that is not accessible. If the URL ends in a ``/``, it is + assumed to refer to a directory and will be handled accordingly. But if an + attempt to read a file leads to a 550 error (meaning the URL cannot be found or + is not accessible, often for permission reasons), then the path is treated as a + directory in order to handle the case when a directory is specified by a URL but + the trailing ``/`` has been left off. This can cause misleading results when + you try to fetch a file whose read permissions make it inaccessible; the FTP + code will try to read it, fail with a 550 error, and then perform a directory + listing for the unreadable file. If fine-grained control is needed, consider + using the :mod:`ftplib` module, subclassing :class:`FancyURLopener`, or changing + *_urlopener* to meet your needs. + +* This module does not support the use of proxies which require authentication. + This may be implemented in the future. + + .. index:: module: urlparse + +* Although the :mod:`urllib` module contains (undocumented) routines to parse + and unparse URL strings, the recommended interface for URL manipulation is in + module :mod:`urlparse`. + + +.. _urlopener-objs: + +URLopener Objects +----------------- + +.. sectionauthor:: Skip Montanaro <skip@mojam.com> + + +:class:`URLopener` and :class:`FancyURLopener` objects have the following +attributes. + + +.. method:: URLopener.open(fullurl[, data]) + + Open *fullurl* using the appropriate protocol.
This method sets up cache and + proxy information, then calls the appropriate open method with its input + arguments. If the scheme is not recognized, :meth:`open_unknown` is called. + The *data* argument has the same meaning as the *data* argument of + :func:`urlopen`. + + +.. method:: URLopener.open_unknown(fullurl[, data]) + + Overridable interface to open unknown URL types. + + +.. method:: URLopener.retrieve(url[, filename[, reporthook[, data]]]) + + Retrieve the contents of *url* and place them in *filename*. The return value + is a tuple consisting of a local filename and either a + :class:`mimetools.Message` object containing the response headers (for remote + URLs) or ``None`` (for local URLs). The caller must then open and read the + contents of *filename*. If *filename* is not given and the URL refers to a + local file, the input filename is returned. If the URL is non-local and + *filename* is not given, the filename is the output of :func:`tempfile.mktemp` + with a suffix that matches the suffix of the last path component of the input + URL. If *reporthook* is given, it must be a function accepting three numeric + parameters. It will be called after each chunk of data is read from the + network. *reporthook* is ignored for local URLs. + + If the *url* uses the :file:`http:` scheme identifier, the optional *data* + argument may be given to specify a ``POST`` request (normally the request type + is ``GET``). The *data* argument must be in standard + :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode` + function above. + + +.. attribute:: URLopener.version + + Variable that specifies the user agent of the opener object. To get + :mod:`urllib` to tell servers that it is a particular user agent, set this in a + subclass as a class variable or in the constructor before calling the base + constructor.
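The *reporthook* accepted by :meth:`retrieve` (and by :func:`urlretrieve`) takes three numbers: blocks transferred so far, block size in bytes, and total size, which may be ``-1`` when the size is unknown. A minimal sketch of such a hook follows; the names ``make_progress_hook`` and ``log`` are hypothetical helpers for this illustration, and the hook is called by hand here so that no network access is needed.

```python
def make_progress_hook(log):
    """Return a reporthook that appends one progress message per call to *log*.

    Hypothetical helper for illustration; only the three-argument hook
    signature comes from the documentation above.
    """
    def hook(block_count, block_size, total_size):
        received = block_count * block_size
        if total_size < 0:
            # The server did not report a size (total_size is -1).
            log.append("%d bytes received" % received)
        else:
            # The last block may be short, so cap the figure at the total.
            received = min(received, total_size)
            percent = 100.0 * received / total_size if total_size else 100.0
            log.append("%d of %d bytes (%.0f%%)"
                       % (received, total_size, percent))
    return hook


messages = []
hook = make_progress_hook(messages)

# Simulate the calls a download of 20000 bytes in 8192-byte blocks would make.
hook(0, 8192, 20000)   # call made when the connection is established
hook(1, 8192, 20000)
hook(3, 8192, 20000)   # 3 * 8192 exceeds the total, so the figure is capped
hook(2, 8192, -1)      # unknown total size

for line in messages:
    print(line)
```

Collecting messages in a list rather than printing inside the hook keeps the hook easy to test; a real application might update a progress bar instead.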
+ +The :class:`FancyURLopener` class offers one additional method that should be +overloaded to provide the appropriate behavior: + + +.. method:: FancyURLopener.prompt_user_passwd(host, realm) + + Return information needed to authenticate the user at the given host in the + specified security realm. The return value should be a tuple, ``(user, + password)``, which can be used for basic authentication. + + The implementation prompts for this information on the terminal; an application + should override this method to use an appropriate interaction model in the local + environment. + + +.. _urllib-examples: + +Examples +-------- + +Here is an example session that uses the ``GET`` method to retrieve a URL +containing parameters:: + + >>> import urllib + >>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0}) + >>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params) + >>> print f.read() + +The following example uses the ``POST`` method instead:: + + >>> import urllib + >>> params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0}) + >>> f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query", params) + >>> print f.read() + +The following example uses an explicitly specified HTTP proxy, overriding +environment settings:: + + >>> import urllib + >>> proxies = {'http': 'http://proxy.example.com:8080/'} + >>> opener = urllib.FancyURLopener(proxies) + >>> f = opener.open("http://www.python.org") + >>> f.read() + +The following example uses no proxies at all, overriding environment settings:: + + >>> import urllib + >>> opener = urllib.FancyURLopener({}) + >>> f = opener.open("http://www.python.org/") + >>> f.read() + diff --git a/Doc/library/urllib2.rst b/Doc/library/urllib2.rst new file mode 100644 index 0000000..41bb033 --- /dev/null +++ b/Doc/library/urllib2.rst @@ -0,0 +1,927 @@ +:mod:`urllib2` --- extensible library for opening URLs +====================================================== + +.. 
module:: urllib2 + :synopsis: Next generation URL opening library. +.. moduleauthor:: Jeremy Hylton <jhylton@users.sourceforge.net> +.. sectionauthor:: Moshe Zadka <moshez@users.sourceforge.net> + + +The :mod:`urllib2` module defines functions and classes which help in opening +URLs (mostly HTTP) in a complex world --- basic and digest authentication, +redirections, cookies and more. + +The :mod:`urllib2` module defines the following functions: + + +.. function:: urlopen(url[, data][, timeout]) + + Open the URL *url*, which can be either a string or a :class:`Request` object. + + *data* may be a string specifying additional data to send to the server, or + ``None`` if no such data is needed. Currently HTTP requests are the only ones + that use *data*; the HTTP request will be a POST instead of a GET when the + *data* parameter is provided. *data* should be a buffer in the standard + :mimetype:`application/x-www-form-urlencoded` format. The + :func:`urllib.urlencode` function takes a mapping or sequence of 2-tuples and + returns a string in this format. + + The optional *timeout* parameter specifies a timeout in seconds for the + connection attempt (if not specified, or passed as ``None``, the global default + timeout setting will be used). This currently works only for HTTP, HTTPS, FTP and + FTPS connections. + + This function returns a file-like object with two additional methods: + + * :meth:`geturl` --- return the URL of the resource retrieved + + * :meth:`info` --- return the meta-information of the page, as a dictionary-like + object + + Raises :exc:`URLError` on errors. + + Note that ``None`` may be returned if no handler handles the request (though the + default installed global :class:`OpenerDirector` uses :class:`UnknownHandler` to + ensure this never happens). + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. function:: install_opener(opener) + + Install an :class:`OpenerDirector` instance as the default global opener.
+ Installing an opener is only necessary if you want urlopen to use that opener; + otherwise, simply call :meth:`OpenerDirector.open` instead of :func:`urlopen`. + The code does not check for a real :class:`OpenerDirector`, and any class with + the appropriate interface will work. + + +.. function:: build_opener([handler, ...]) + + Return an :class:`OpenerDirector` instance, which chains the handlers in the + order given. *handler*\s can be either instances of :class:`BaseHandler`, or + subclasses of :class:`BaseHandler` (in which case it must be possible to call + the constructor without any parameters). Instances of the following classes + will be in front of the *handler*\s, unless the *handler*\s contain them, + instances of them or subclasses of them: :class:`ProxyHandler`, + :class:`UnknownHandler`, :class:`HTTPHandler`, :class:`HTTPDefaultErrorHandler`, + :class:`HTTPRedirectHandler`, :class:`FTPHandler`, :class:`FileHandler`, + :class:`HTTPErrorProcessor`. + + If the Python installation has SSL support (:func:`socket.ssl` exists), + :class:`HTTPSHandler` will also be added. + + Beginning in Python 2.3, a :class:`BaseHandler` subclass may also change its + :attr:`handler_order` member variable to modify its position in the handlers + list. + +The following exceptions are raised as appropriate: + + +.. exception:: URLError + + The handlers raise this exception (or derived exceptions) when they run into a + problem. It is a subclass of :exc:`IOError`. + + +.. exception:: HTTPError + + A subclass of :exc:`URLError`, it can also function as a non-exceptional + file-like return value (the same thing that :func:`urlopen` returns). This + is useful when handling exotic HTTP errors, such as requests for + authentication. + +The following classes are provided: + + +.. class:: Request(url[, data][, headers] [, origin_req_host][, unverifiable]) + + This class is an abstraction of a URL request. + + *url* should be a string containing a valid URL. 
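As noted for :func:`urlopen` above, supplying *data* turns an HTTP request into a POST. The sketch below illustrates this with :class:`Request` objects only, so nothing is sent over the network. It assumes a current Python 3 interpreter, where the class lives in ``urllib.request``, ``urlencode`` lives in ``urllib.parse``, and *data* must be bytes rather than a string; the URL and form fields are placeholders.

```python
# Assumption: Python 3 module layout (urllib.request / urllib.parse) rather
# than the urllib2 / urllib names used in this text.
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL and form fields, used only for illustration.
url = "http://www.example.com/cgi-bin/query"
fields = [("spam", 1), ("eggs", 2)]

# urlencode() produces application/x-www-form-urlencoded text;
# Python 3 expects the request body as bytes, hence the encode() call.
data = urlencode(fields).encode("ascii")

get_request = Request(url)               # no data: the method is GET
post_request = Request(url, data=data)   # data present: the method is POST

print(get_request.get_method())
print(post_request.get_method())
print(post_request.data)
```

Constructing a :class:`Request` does not open a connection; the request is only sent once it is passed to :func:`urlopen` or to an opener's ``open`` method.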
+ + *data* may be a string specifying additional data to send to the server, or + ``None`` if no such data is needed. Currently HTTP requests are the only ones + that use *data*; the HTTP request will be a POST instead of a GET when the + *data* parameter is provided. *data* should be a buffer in the standard + :mimetype:`application/x-www-form-urlencoded` format. The + :func:`urllib.urlencode` function takes a mapping or sequence of 2-tuples and + returns a string in this format. + + *headers* should be a dictionary, and will be treated as if :meth:`add_header` + was called with each key and value as arguments. + + The final two arguments are only of interest for correct handling of third-party + HTTP cookies: + + *origin_req_host* should be the request-host of the origin transaction, as + defined by :rfc:`2965`. It defaults to ``cookielib.request_host(self)``. This + is the host name or IP address of the original request that was initiated by the + user. For example, if the request is for an image in an HTML document, this + should be the request-host of the request for the page containing the image. + + *unverifiable* should indicate whether the request is unverifiable, as defined + by RFC 2965. It defaults to False. An unverifiable request is one whose URL + the user did not have the option to approve. For example, if the request is for + an image in an HTML document, and the user had no option to approve the + automatic fetching of the image, this should be true. + + +.. class:: OpenerDirector() + + The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained + together. It manages the chaining of handlers, and recovery from errors. + + +.. class:: BaseHandler() + + This is the base class for all registered handlers --- and handles only the + simple mechanics of registration. + + +.. 
class:: HTTPDefaultErrorHandler() + + A class which defines a default handler for HTTP error responses; all responses + are turned into :exc:`HTTPError` exceptions. + + +.. class:: HTTPRedirectHandler() + + A class to handle redirections. + + +.. class:: HTTPCookieProcessor([cookiejar]) + + A class to handle HTTP Cookies. + + +.. class:: ProxyHandler([proxies]) + + Cause requests to go through a proxy. If *proxies* is given, it must be a + dictionary mapping protocol names to URLs of proxies. The default is to read the + list of proxies from the environment variables :envvar:`<protocol>_proxy`. + + +.. class:: HTTPPasswordMgr() + + Keep a database of ``(realm, uri) -> (user, password)`` mappings. + + +.. class:: HTTPPasswordMgrWithDefaultRealm() + + Keep a database of ``(realm, uri) -> (user, password)`` mappings. A realm of + ``None`` is considered a catch-all realm, which is searched if no other realm + fits. + + +.. class:: AbstractBasicAuthHandler([password_mgr]) + + This is a mixin class that helps with HTTP authentication, both to the remote + host and to a proxy. *password_mgr*, if given, should be something that is + compatible with :class:`HTTPPasswordMgr`; refer to section + :ref:`http-password-mgr` for information on the interface that must be + supported. + + +.. class:: HTTPBasicAuthHandler([password_mgr]) + + Handle authentication with the remote host. *password_mgr*, if given, should be + something that is compatible with :class:`HTTPPasswordMgr`; refer to section + :ref:`http-password-mgr` for information on the interface that must be + supported. + + +.. class:: ProxyBasicAuthHandler([password_mgr]) + + Handle authentication with the proxy. *password_mgr*, if given, should be + something that is compatible with :class:`HTTPPasswordMgr`; refer to section + :ref:`http-password-mgr` for information on the interface that must be + supported. + + +.. 
class:: AbstractDigestAuthHandler([password_mgr]) + + This is a mixin class that helps with HTTP authentication, both to the remote + host and to a proxy. *password_mgr*, if given, should be something that is + compatible with :class:`HTTPPasswordMgr`; refer to section + :ref:`http-password-mgr` for information on the interface that must be + supported. + + +.. class:: HTTPDigestAuthHandler([password_mgr]) + + Handle authentication with the remote host. *password_mgr*, if given, should be + something that is compatible with :class:`HTTPPasswordMgr`; refer to section + :ref:`http-password-mgr` for information on the interface that must be + supported. + + +.. class:: ProxyDigestAuthHandler([password_mgr]) + + Handle authentication with the proxy. *password_mgr*, if given, should be + something that is compatible with :class:`HTTPPasswordMgr`; refer to section + :ref:`http-password-mgr` for information on the interface that must be + supported. + + +.. class:: HTTPHandler() + + A class to handle opening of HTTP URLs. + + +.. class:: HTTPSHandler() + + A class to handle opening of HTTPS URLs. + + +.. class:: FileHandler() + + Open local files. + + +.. class:: FTPHandler() + + Open FTP URLs. + + +.. class:: CacheFTPHandler() + + Open FTP URLs, keeping a cache of open FTP connections to minimize delays. + + +.. class:: UnknownHandler() + + A catch-all class to handle unknown URLs. + + +.. _request-objects: + +Request Objects +--------------- + +The following methods describe all of :class:`Request`'s public interface, and +so all must be overridden in subclasses. + + +.. method:: Request.add_data(data) + + Set the :class:`Request` data to *data*. This is ignored by all handlers except + HTTP handlers --- and there it should be a byte string, and will change the + request to be ``POST`` rather than ``GET``. + + +.. method:: Request.get_method() + + Return a string indicating the HTTP request method. 
This is only meaningful for + HTTP requests, and currently always returns ``'GET'`` or ``'POST'``. + + +.. method:: Request.has_data() + + Return whether the instance has a non-\ ``None`` data. + + +.. method:: Request.get_data() + + Return the instance's data. + + +.. method:: Request.add_header(key, val) + + Add another header to the request. Headers are currently ignored by all + handlers except HTTP handlers, where they are added to the list of headers sent + to the server. Note that there cannot be more than one header with the same + name, and later calls will overwrite previous calls in case the *key* collides. + Currently, this is no loss of HTTP functionality, since all headers which have + meaning when used more than once have a (header-specific) way of gaining the + same functionality using only one header. + + +.. method:: Request.add_unredirected_header(key, header) + + Add a header that will not be added to a redirected request. + + .. versionadded:: 2.4 + + +.. method:: Request.has_header(header) + + Return whether the instance has the named header (checks both regular and + unredirected). + + .. versionadded:: 2.4 + + +.. method:: Request.get_full_url() + + Return the URL given in the constructor. + + +.. method:: Request.get_type() + + Return the type of the URL --- also known as the scheme. + + +.. method:: Request.get_host() + + Return the host to which a connection will be made. + + +.. method:: Request.get_selector() + + Return the selector --- the part of the URL that is sent to the server. + + +.. method:: Request.set_proxy(host, type) + + Prepare the request by connecting to a proxy server. The *host* and *type* will + replace those of the instance, and the instance's selector will be the original + URL given in the constructor. + + +.. method:: Request.get_origin_req_host() + + Return the request-host of the origin transaction, as defined by :rfc:`2965`. + See the documentation for the :class:`Request` constructor. + + +.. 
method:: Request.is_unverifiable() + + Return whether the request is unverifiable, as defined by :rfc:`2965`. See the + documentation for the :class:`Request` constructor. + + +.. _opener-director-objects: + +OpenerDirector Objects +---------------------- + +:class:`OpenerDirector` instances have the following methods: + + +.. method:: OpenerDirector.add_handler(handler) + + *handler* should be an instance of :class:`BaseHandler`. The following methods + are searched, and added to the possible chains (note that HTTP errors are a + special case). + + * :meth:`protocol_open` --- signal that the handler knows how to open *protocol* + URLs. + + * :meth:`http_error_type` --- signal that the handler knows how to handle HTTP + errors with HTTP error code *type*. + + * :meth:`protocol_error` --- signal that the handler knows how to handle errors + from (non-\ ``http``) *protocol*. + + * :meth:`protocol_request` --- signal that the handler knows how to pre-process + *protocol* requests. + + * :meth:`protocol_response` --- signal that the handler knows how to + post-process *protocol* responses. + + +.. method:: OpenerDirector.open(url[, data][, timeout]) + + Open the given *url* (which can be a request object or a string), optionally + passing the given *data*. Arguments, return values and exceptions raised are the + same as those of :func:`urlopen` (which simply calls the :meth:`open` method on + the currently installed global :class:`OpenerDirector`). The optional *timeout* + parameter specifies a timeout in seconds for the connection attempt (if not + specified, or passed as ``None``, the global default timeout setting will be used; + this actually only works for HTTP, HTTPS, FTP and FTPS connections). + + .. versionchanged:: 2.6 + *timeout* was added. + + +.. method:: OpenerDirector.error(proto[, arg[, ...]]) + + Handle an error of the given protocol. This will call the registered error + handlers for the given protocol with the given arguments (which are protocol + specific).
The HTTP protocol is a special case which uses the HTTP response + code to determine the specific error handler; refer to the :meth:`http_error_\*` + methods of the handler classes. + + Return values and exceptions raised are the same as those of :func:`urlopen`. + +OpenerDirector objects open URLs in three stages: + +The order in which these methods are called within each stage is determined by +sorting the handler instances. + +#. Every handler with a method named like :meth:`protocol_request` has that + method called to pre-process the request. + +#. Handlers with a method named like :meth:`protocol_open` are called to handle + the request. This stage ends when a handler either returns a non-\ :const:`None` + value (i.e. a response), or raises an exception (usually :exc:`URLError`). + Exceptions are allowed to propagate. + + In fact, the above algorithm is first tried for methods named + :meth:`default_open`. If all such methods return :const:`None`, the algorithm + is repeated for methods named like :meth:`protocol_open`. If all such methods + return :const:`None`, the algorithm is repeated for methods named + :meth:`unknown_open`. + + Note that the implementation of these methods may involve calls of the parent + :class:`OpenerDirector` instance's :meth:`.open` and :meth:`.error` methods. + +#. Every handler with a method named like :meth:`protocol_response` has that + method called to post-process the response. + + +.. _base-handler-objects: + +BaseHandler Objects +------------------- + +:class:`BaseHandler` objects provide a couple of methods that are directly +useful, and others that are meant to be used by derived classes. These are +intended for direct use: + + +.. method:: BaseHandler.add_parent(director) + + Add a director as parent. + + +.. method:: BaseHandler.close() + + Remove any parents. + +The following members and methods should only be used by classes derived from +:class:`BaseHandler`. + +.. 
note:: + + The convention has been adopted that subclasses defining + :meth:`protocol_request` or :meth:`protocol_response` methods are named + :class:`\*Processor`; all others are named :class:`\*Handler`. + + +.. attribute:: BaseHandler.parent + + A valid :class:`OpenerDirector`, which can be used to open using a different + protocol, or handle errors. + + +.. method:: BaseHandler.default_open(req) + + This method is *not* defined in :class:`BaseHandler`, but subclasses should + define it if they want to catch all URLs. + + This method, if implemented, will be called by the parent + :class:`OpenerDirector`. It should return a file-like object as described in + the return value of the :meth:`open` of :class:`OpenerDirector`, or ``None``. + It should raise :exc:`URLError`, unless a truly exceptional thing happens (for + example, :exc:`MemoryError` should not be mapped to :exc:`URLError`). + + This method will be called before any protocol-specific open method. + + +.. method:: BaseHandler.protocol_open(req) + :noindex: + + This method is *not* defined in :class:`BaseHandler`, but subclasses should + define it if they want to handle URLs with the given protocol. + + This method, if defined, will be called by the parent :class:`OpenerDirector`. + Return values should be the same as for :meth:`default_open`. + + +.. method:: BaseHandler.unknown_open(req) + + This method is *not* defined in :class:`BaseHandler`, but subclasses should + define it if they want to catch all URLs with no specific registered handler to + open it. + + This method, if implemented, will be called by the :attr:`parent` + :class:`OpenerDirector`. Return values should be the same as for + :meth:`default_open`. + + +.. method:: BaseHandler.http_error_default(req, fp, code, msg, hdrs) + + This method is *not* defined in :class:`BaseHandler`, but subclasses should + override it if they intend to provide a catch-all for otherwise unhandled HTTP + errors. 
It will be called automatically by the :class:`OpenerDirector` getting + the error, and should not normally be called in other circumstances. + + *req* will be a :class:`Request` object, *fp* will be a file-like object with + the HTTP error body, *code* will be the three-digit code of the error, *msg* + will be the user-visible explanation of the code and *hdrs* will be a mapping + object with the headers of the error. + + Return values and exceptions raised should be the same as those of + :func:`urlopen`. + + +.. method:: BaseHandler.http_error_nnn(req, fp, code, msg, hdrs) + + *nnn* should be a three-digit HTTP error code. This method is also not defined + in :class:`BaseHandler`, but will be called, if it exists, on an instance of a + subclass, when an HTTP error with code *nnn* occurs. + + Subclasses should override this method to handle specific HTTP errors. + + Arguments, return values and exceptions raised should be the same as for + :meth:`http_error_default`. + + +.. method:: BaseHandler.protocol_request(req) + :noindex: + + This method is *not* defined in :class:`BaseHandler`, but subclasses should + define it if they want to pre-process requests of the given protocol. + + This method, if defined, will be called by the parent :class:`OpenerDirector`. + *req* will be a :class:`Request` object. The return value should be a + :class:`Request` object. + + +.. method:: BaseHandler.protocol_response(req, response) + :noindex: + + This method is *not* defined in :class:`BaseHandler`, but subclasses should + define it if they want to post-process responses of the given protocol. + + This method, if defined, will be called by the parent :class:`OpenerDirector`. + *req* will be a :class:`Request` object. *response* will be an object + implementing the same interface as the return value of :func:`urlopen`. The + return value should implement the same interface as the return value of + :func:`urlopen`. + + +.. 
_http-redirect-handler: + +HTTPRedirectHandler Objects +--------------------------- + +.. note:: + + Some HTTP redirections require action from this module's client code. If this + is the case, :exc:`HTTPError` is raised. See :rfc:`2616` for details of the + precise meanings of the various redirection codes. + + +.. method:: HTTPRedirectHandler.redirect_request(req, fp, code, msg, hdrs) + + Return a :class:`Request` or ``None`` in response to a redirect. This is called + by the default implementations of the :meth:`http_error_30\*` methods when a + redirection is received from the server. If a redirection should take place, + return a new :class:`Request` to allow :meth:`http_error_30\*` to perform the + redirect. Otherwise, raise :exc:`HTTPError` if no other handler should try to + handle this URL, or return ``None`` if you can't but another handler might. + + .. note:: + + The default implementation of this method does not strictly follow :rfc:`2616`, + which says that 301 and 302 responses to ``POST`` requests must not be + automatically redirected without confirmation by the user. In reality, browsers + do allow automatic redirection of these responses, changing the POST to a + ``GET``, and the default implementation reproduces this behavior. + + +.. method:: HTTPRedirectHandler.http_error_301(req, fp, code, msg, hdrs) + + Redirect to the ``Location:`` URL. This method is called by the parent + :class:`OpenerDirector` when getting an HTTP 'moved permanently' response. + + +.. method:: HTTPRedirectHandler.http_error_302(req, fp, code, msg, hdrs) + + The same as :meth:`http_error_301`, but called for the 'found' response. + + +.. method:: HTTPRedirectHandler.http_error_303(req, fp, code, msg, hdrs) + + The same as :meth:`http_error_301`, but called for the 'see other' response. + + +.. method:: HTTPRedirectHandler.http_error_307(req, fp, code, msg, hdrs) + + The same as :meth:`http_error_301`, but called for the 'temporary redirect' + response. + + +.. 
_http-cookie-processor: + +HTTPCookieProcessor Objects +--------------------------- + +.. versionadded:: 2.4 + +:class:`HTTPCookieProcessor` instances have one attribute: + + +.. attribute:: HTTPCookieProcessor.cookiejar + + The :class:`cookielib.CookieJar` in which cookies are stored. + + +.. _proxy-handler: + +ProxyHandler Objects +-------------------- + + +.. method:: ProxyHandler.protocol_open(request) + :noindex: + + The :class:`ProxyHandler` will have a method :meth:`protocol_open` for every + *protocol* which has a proxy in the *proxies* dictionary given in the + constructor. The method will modify requests to go through the proxy, by + calling ``request.set_proxy()``, and call the next handler in the chain to + actually execute the protocol. + + +.. _http-password-mgr: + +HTTPPasswordMgr Objects +----------------------- + +These methods are available on :class:`HTTPPasswordMgr` and +:class:`HTTPPasswordMgrWithDefaultRealm` objects. + + +.. method:: HTTPPasswordMgr.add_password(realm, uri, user, passwd) + + *uri* can be either a single URI, or a sequence of URIs. *realm*, *user* and + *passwd* must be strings. This causes ``(user, passwd)`` to be used as + authentication tokens when authentication for *realm* and a super-URI of any of + the given URIs is given. + + +.. method:: HTTPPasswordMgr.find_user_password(realm, authuri) + + Get user/password for given realm and URI, if any. This method will return + ``(None, None)`` if there is no matching user/password. + + For :class:`HTTPPasswordMgrWithDefaultRealm` objects, the realm ``None`` will be + searched if the given *realm* has no matching user/password. + + +.. _abstract-basic-auth-handler: + +AbstractBasicAuthHandler Objects +-------------------------------- + + +.. method:: AbstractBasicAuthHandler.http_error_auth_reqed(authreq, host, req, headers) + + Handle an authentication request by getting a user/password pair, and re-trying + the request. 
*authreq* should be the name of the header where the information + about the realm is included in the request, *host* specifies the URL and path to + authenticate for, *req* should be the (failed) :class:`Request` object, and + *headers* should be the error headers. + + *host* is either an authority (e.g. ``"python.org"``) or a URL containing an + authority component (e.g. ``"http://python.org/"``). In either case, the + authority must not contain a userinfo component (so, ``"python.org"`` and + ``"python.org:80"`` are fine, ``"joe:password@python.org"`` is not). + + +.. _http-basic-auth-handler: + +HTTPBasicAuthHandler Objects +---------------------------- + + +.. method:: HTTPBasicAuthHandler.http_error_401(req, fp, code, msg, hdrs) + + Retry the request with authentication information, if available. + + +.. _proxy-basic-auth-handler: + +ProxyBasicAuthHandler Objects +----------------------------- + + +.. method:: ProxyBasicAuthHandler.http_error_407(req, fp, code, msg, hdrs) + + Retry the request with authentication information, if available. + + +.. _abstract-digest-auth-handler: + +AbstractDigestAuthHandler Objects +--------------------------------- + + +.. method:: AbstractDigestAuthHandler.http_error_auth_reqed(authreq, host, req, headers) + + *authreq* should be the name of the header where the information about the realm + is included in the request, *host* should be the host to authenticate to, *req* + should be the (failed) :class:`Request` object, and *headers* should be the + error headers. + + +.. _http-digest-auth-handler: + +HTTPDigestAuthHandler Objects +----------------------------- + + +.. method:: HTTPDigestAuthHandler.http_error_401(req, fp, code, msg, hdrs) + + Retry the request with authentication information, if available. + + +.. _proxy-digest-auth-handler: + +ProxyDigestAuthHandler Objects +------------------------------ + + +.. 
method:: ProxyDigestAuthHandler.http_error_407(req, fp, code, msg, hdrs) + + Retry the request with authentication information, if available. + + +.. _http-handler-objects: + +HTTPHandler Objects +------------------- + + +.. method:: HTTPHandler.http_open(req) + + Send an HTTP request, which can be either GET or POST, depending on + ``req.has_data()``. + + +.. _https-handler-objects: + +HTTPSHandler Objects +-------------------- + + +.. method:: HTTPSHandler.https_open(req) + + Send an HTTPS request, which can be either GET or POST, depending on + ``req.has_data()``. + + +.. _file-handler-objects: + +FileHandler Objects +------------------- + + +.. method:: FileHandler.file_open(req) + + Open the file locally, if there is no host name, or the host name is + ``'localhost'``. Change the protocol to ``ftp`` otherwise, and retry opening it + using :attr:`parent`. + + +.. _ftp-handler-objects: + +FTPHandler Objects +------------------ + + +.. method:: FTPHandler.ftp_open(req) + + Open the FTP file indicated by *req*. The login is always done with empty + username and password. + + +.. _cacheftp-handler-objects: + +CacheFTPHandler Objects +----------------------- + +:class:`CacheFTPHandler` objects are :class:`FTPHandler` objects with the +following additional methods: + + +.. method:: CacheFTPHandler.setTimeout(t) + + Set timeout of connections to *t* seconds. + + +.. method:: CacheFTPHandler.setMaxConns(m) + + Set maximum number of cached connections to *m*. + + +.. _unknown-handler-objects: + +UnknownHandler Objects +---------------------- + + +.. method:: UnknownHandler.unknown_open() + + Raise a :exc:`URLError` exception. + + +.. _http-error-processor-objects: + +HTTPErrorProcessor Objects +-------------------------- + +.. versionadded:: 2.4 + + +.. method:: HTTPErrorProcessor.unknown_open() + + Process HTTP error responses. + + For 200 error codes, the response object is returned immediately. 
+ + For non-200 error codes, this simply passes the job on to the + :meth:`protocol_error_code` handler methods, via :meth:`OpenerDirector.error`. + Eventually, :class:`urllib2.HTTPDefaultErrorHandler` will raise an + :exc:`HTTPError` if no other handler handles the error. + + +.. _urllib2-examples: + +Examples +-------- + +This example gets the python.org main page and displays the first 100 bytes of +it:: + + >>> import urllib2 + >>> f = urllib2.urlopen('http://www.python.org/') + >>> print f.read(100) + <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> + <?xml-stylesheet href="./css/ht2html + +Here we are sending a data-stream to the stdin of a CGI and reading the data it +returns to us. Note that this example will only work when the Python +installation supports SSL. :: + + >>> import urllib2 + >>> req = urllib2.Request(url='https://localhost/cgi-bin/test.cgi', + ... data='This data is passed to stdin of the CGI') + >>> f = urllib2.urlopen(req) + >>> print f.read() + Got Data: "This data is passed to stdin of the CGI" + +The code for the sample CGI used in the above example is:: + + #!/usr/bin/env python + import sys + data = sys.stdin.read() + print 'Content-type: text/plain\n\nGot Data: "%s"' % data + +Use of Basic HTTP Authentication:: + + import urllib2 + # Create an OpenerDirector with support for Basic HTTP Authentication... + auth_handler = urllib2.HTTPBasicAuthHandler() + auth_handler.add_password(realm='PDQ Application', + uri='https://mahler:8092/site-updates.py', + user='klem', + passwd='kadidd!ehopper') + opener = urllib2.build_opener(auth_handler) + # ...and install it globally so it can be used with urlopen. + urllib2.install_opener(opener) + urllib2.urlopen('http://www.example.com/login.html') + +:func:`build_opener` provides many handlers by default, including a +:class:`ProxyHandler`. By default, :class:`ProxyHandler` uses the environment +variables named ``<scheme>_proxy``, where ``<scheme>`` is the URL scheme +involved.
For example, the :envvar:`http_proxy` environment variable is read to +obtain the HTTP proxy's URL. + +This example replaces the default :class:`ProxyHandler` with one that uses +programmatically-supplied proxy URLs, and adds proxy authorization support with +:class:`ProxyBasicAuthHandler`. :: + + proxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'}) + proxy_auth_handler = urllib2.ProxyBasicAuthHandler() + proxy_auth_handler.add_password('realm', 'host', 'username', 'password') + + opener = urllib2.build_opener(proxy_handler, proxy_auth_handler) + # This time, rather than install the OpenerDirector, we use it directly: + opener.open('http://www.example.com/login.html') + +Adding HTTP headers: + +Use the *headers* argument to the :class:`Request` constructor, or:: + + import urllib2 + req = urllib2.Request('http://www.example.com/') + req.add_header('Referer', 'http://www.python.org/') + r = urllib2.urlopen(req) + +:class:`OpenerDirector` automatically adds a :mailheader:`User-Agent` header to +every :class:`Request`. To change this:: + + import urllib2 + opener = urllib2.build_opener() + opener.addheaders = [('User-agent', 'Mozilla/5.0')] + opener.open('http://www.example.com/') + +Also, remember that a few standard headers (:mailheader:`Content-Length`, +:mailheader:`Content-Type` and :mailheader:`Host`) are added when the +:class:`Request` is passed to :func:`urlopen` (or :meth:`OpenerDirector.open`). + diff --git a/Doc/library/urlparse.rst b/Doc/library/urlparse.rst new file mode 100644 index 0000000..c6bc82b --- /dev/null +++ b/Doc/library/urlparse.rst @@ -0,0 +1,268 @@ +:mod:`urlparse` --- Parse URLs into components +============================================== + +.. module:: urlparse + :synopsis: Parse URLs into or assemble them from components. + + +.. 
index:: + single: WWW + single: World Wide Web + single: URL + pair: URL; parsing + pair: relative; URL + +This module defines a standard interface to break Uniform Resource Locator (URL) +strings up in components (addressing scheme, network location, path etc.), to +combine the components back into a URL string, and to convert a "relative URL" +to an absolute URL given a "base URL." + +The module has been designed to match the Internet RFC on Relative Uniform +Resource Locators (and discovered a bug in an earlier draft!). It supports the +following URL schemes: ``file``, ``ftp``, ``gopher``, ``hdl``, ``http``, +``https``, ``imap``, ``mailto``, ``mms``, ``news``, ``nntp``, ``prospero``, +``rsync``, ``rtsp``, ``rtspu``, ``sftp``, ``shttp``, ``sip``, ``sips``, +``snews``, ``svn``, ``svn+ssh``, ``telnet``, ``wais``. + +.. versionadded:: 2.5 + Support for the ``sftp`` and ``sips`` schemes. + +The :mod:`urlparse` module defines the following functions: + + +.. function:: urlparse(urlstring[, default_scheme[, allow_fragments]]) + + Parse a URL into six components, returning a 6-tuple. This corresponds to the + general structure of a URL: ``scheme://netloc/path;parameters?query#fragment``. + Each tuple item is a string, possibly empty. The components are not broken up in + smaller parts (for example, the network location is a single string), and % + escapes are not expanded. The delimiters as shown above are not part of the + result, except for a leading slash in the *path* component, which is retained if + present. For example:: + + >>> from urlparse import urlparse + >>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html') + >>> o + ('http', 'www.cwi.nl:80', '/%7Eguido/Python.html', '', '', '') + >>> o.scheme + 'http' + >>> o.port + 80 + >>> o.geturl() + 'http://www.cwi.nl:80/%7Eguido/Python.html' + + If the *default_scheme* argument is specified, it gives the default addressing + scheme, to be used only if the URL does not specify one. 
The default value for + this argument is the empty string. + + If the *allow_fragments* argument is false, fragment identifiers are not + allowed, even if the URL's addressing scheme normally does support them. The + default value for this argument is :const:`True`. + + The return value is actually an instance of a subclass of :class:`tuple`. This + class has the following additional read-only convenience attributes: + + +------------------+-------+--------------------------+----------------------+ + | Attribute | Index | Value | Value if not present | + +==================+=======+==========================+======================+ + | :attr:`scheme` | 0 | URL scheme specifier | empty string | + +------------------+-------+--------------------------+----------------------+ + | :attr:`netloc` | 1 | Network location part | empty string | + +------------------+-------+--------------------------+----------------------+ + | :attr:`path` | 2 | Hierarchical path | empty string | + +------------------+-------+--------------------------+----------------------+ + | :attr:`params` | 3 | Parameters for last path | empty string | + | | | element | | + +------------------+-------+--------------------------+----------------------+ + | :attr:`query` | 4 | Query component | empty string | + +------------------+-------+--------------------------+----------------------+ + | :attr:`fragment` | 5 | Fragment identifier | empty string | + +------------------+-------+--------------------------+----------------------+ + | :attr:`username` | | User name | :const:`None` | + +------------------+-------+--------------------------+----------------------+ + | :attr:`password` | | Password | :const:`None` | + +------------------+-------+--------------------------+----------------------+ + | :attr:`hostname` | | Host name (lower case) | :const:`None` | + +------------------+-------+--------------------------+----------------------+ + | :attr:`port` | | Port number as integer, | :const:`None` | + | 
| | if present | | + +------------------+-------+--------------------------+----------------------+ + + See section :ref:`urlparse-result-object` for more information on the result + object. + + .. versionchanged:: 2.5 + Added attributes to return value. + + +.. function:: urlunparse(parts) + + Construct a URL from a tuple as returned by ``urlparse()``. The *parts* argument + can be any six-item iterable. This may result in a slightly different, but + equivalent URL, if the URL that was parsed originally had unnecessary delimiters + (for example, a ? with an empty query; the RFC states that these are + equivalent). + + +.. function:: urlsplit(urlstring[, default_scheme[, allow_fragments]]) + + This is similar to :func:`urlparse`, but does not split the params from the URL. + This should generally be used instead of :func:`urlparse` if the more recent URL + syntax allowing parameters to be applied to each segment of the *path* portion + of the URL (see :rfc:`2396`) is wanted. A separate function is needed to + separate the path segments and parameters. This function returns a 5-tuple: + (addressing scheme, network location, path, query, fragment identifier). + + The return value is actually an instance of a subclass of :class:`tuple`. 
This + class has the following additional read-only convenience attributes: + + +------------------+-------+-------------------------+----------------------+ + | Attribute | Index | Value | Value if not present | + +==================+=======+=========================+======================+ + | :attr:`scheme` | 0 | URL scheme specifier | empty string | + +------------------+-------+-------------------------+----------------------+ + | :attr:`netloc` | 1 | Network location part | empty string | + +------------------+-------+-------------------------+----------------------+ + | :attr:`path` | 2 | Hierarchical path | empty string | + +------------------+-------+-------------------------+----------------------+ + | :attr:`query` | 3 | Query component | empty string | + +------------------+-------+-------------------------+----------------------+ + | :attr:`fragment` | 4 | Fragment identifier | empty string | + +------------------+-------+-------------------------+----------------------+ + | :attr:`username` | | User name | :const:`None` | + +------------------+-------+-------------------------+----------------------+ + | :attr:`password` | | Password | :const:`None` | + +------------------+-------+-------------------------+----------------------+ + | :attr:`hostname` | | Host name (lower case) | :const:`None` | + +------------------+-------+-------------------------+----------------------+ + | :attr:`port` | | Port number as integer, | :const:`None` | + | | | if present | | + +------------------+-------+-------------------------+----------------------+ + + See section :ref:`urlparse-result-object` for more information on the result + object. + + .. versionadded:: 2.2 + + .. versionchanged:: 2.5 + Added attributes to return value. + + +.. function:: urlunsplit(parts) + + Combine the elements of a tuple as returned by :func:`urlsplit` into a complete + URL as a string. The *parts* argument can be any five-item iterable. 
This may + result in a slightly different, but equivalent URL, if the URL that was parsed + originally had unnecessary delimiters (for example, a ? with an empty query; the + RFC states that these are equivalent). + + .. versionadded:: 2.2 + + +.. function:: urljoin(base, url[, allow_fragments]) + + Construct a full ("absolute") URL by combining a "base URL" (*base*) with + another URL (*url*). Informally, this uses components of the base URL, in + particular the addressing scheme, the network location and (part of) the path, + to provide missing components in the relative URL. For example:: + + >>> from urlparse import urljoin + >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html') + 'http://www.cwi.nl/%7Eguido/FAQ.html' + + The *allow_fragments* argument has the same meaning and default as for + :func:`urlparse`. + + .. note:: + + If *url* is an absolute URL (that is, starting with ``//`` or ``scheme://``), + the *url*'s host name and/or scheme will be present in the result. For example: + + :: + + >>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', + ... '//www.python.org/%7Eguido') + 'http://www.python.org/%7Eguido' + + If you do not want that behavior, preprocess the *url* with :func:`urlsplit` and + :func:`urlunsplit`, removing possible *scheme* and *netloc* parts. + + +.. function:: urldefrag(url) + + If *url* contains a fragment identifier, returns a modified version of *url* + with no fragment identifier, and the fragment identifier as a separate string. + If there is no fragment identifier in *url*, returns *url* unmodified and an + empty string. + + +.. seealso:: + + :rfc:`1738` - Uniform Resource Locators (URL) + This specifies the formal syntax and semantics of absolute URLs. + + :rfc:`1808` - Relative Uniform Resource Locators + This Request For Comments includes the rules for joining an absolute and a + relative URL, including a fair number of "Abnormal Examples" which govern the + treatment of border cases. 
+ + :rfc:`2396` - Uniform Resource Identifiers (URI): Generic Syntax + Document describing the generic syntactic requirements for both Uniform Resource + Names (URNs) and Uniform Resource Locators (URLs). + + +.. _urlparse-result-object: + +Results of :func:`urlparse` and :func:`urlsplit` +------------------------------------------------ + +The result objects from the :func:`urlparse` and :func:`urlsplit` functions are +subclasses of the :class:`tuple` type. These subclasses add the attributes +described in those functions, as well as provide an additional method: + + +.. method:: ParseResult.geturl() + + Return the re-combined version of the original URL as a string. This may differ + from the original URL in that the scheme will always be normalized to lower case + and empty components may be dropped. Specifically, empty parameters, queries, + and fragment identifiers will be removed. + + The result of this method is a fixpoint if passed back through the original + parsing function:: + + >>> import urlparse + >>> url = 'HTTP://www.Python.org/doc/#' + + >>> r1 = urlparse.urlsplit(url) + >>> r1.geturl() + 'http://www.Python.org/doc/' + + >>> r2 = urlparse.urlsplit(r1.geturl()) + >>> r2.geturl() + 'http://www.Python.org/doc/' + + .. versionadded:: 2.5 + +The following classes provide the implementations of the parse results: + + +.. class:: BaseResult + + Base class for the concrete result classes. This provides most of the attribute + definitions. It does not provide a :meth:`geturl` method. It is derived from + :class:`tuple`, but does not override the :meth:`__init__` or :meth:`__new__` + methods. + + +.. class:: ParseResult(scheme, netloc, path, params, query, fragment) + + Concrete class for :func:`urlparse` results. The :meth:`__new__` method is + overridden to support checking that the right number of arguments are passed. + + +.. class:: SplitResult(scheme, netloc, path, query, fragment) + + Concrete class for :func:`urlsplit` results.
The :meth:`__new__` method is + overridden to support checking that the right number of arguments are passed. + diff --git a/Doc/library/user.rst b/Doc/library/user.rst new file mode 100644 index 0000000..ba94262 --- /dev/null +++ b/Doc/library/user.rst @@ -0,0 +1,69 @@ + +:mod:`user` --- User-specific configuration hook +================================================ + +.. module:: user + :synopsis: A standard way to reference user-specific modules. + + +.. index:: + pair: .pythonrc.py; file + triple: user; configuration; file + +As a policy, Python doesn't run user-specified code on startup of Python +programs. (Only interactive sessions execute the script specified in the +:envvar:`PYTHONSTARTUP` environment variable if it exists). + +However, some programs or sites may find it convenient to allow users to have a +standard customization file, which gets run when a program requests it. This +module implements such a mechanism. A program that wishes to use the mechanism +must execute the statement :: + + import user + +.. index:: builtin: exec + +The :mod:`user` module looks for a file :file:`.pythonrc.py` in the user's home +directory and if it can be opened, executes it (using :func:`exec`) in its +own (the module :mod:`user`'s) global namespace. Errors during this phase are +not caught; that's up to the program that imports the :mod:`user` module, if it +wishes. The home directory is assumed to be named by the :envvar:`HOME` +environment variable; if this is not set, the current directory is used. + +The user's :file:`.pythonrc.py` could conceivably test for ``sys.version`` if it +wishes to do different things depending on the Python version. + +A warning to users: be very conservative in what you place in your +:file:`.pythonrc.py` file. Since you don't know which programs will use it, +changing the behavior of standard modules or functions is generally not a good +idea. 
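As a rough illustration of the lookup described earlier in this section, the sketch below finds :file:`.pythonrc.py` through :envvar:`HOME` (falling back to the current directory) and executes it. It is not the source of the :mod:`user` module: the helper name ``run_user_rc`` and its *environ* and *filename* parameters are made up for this example, and the code runs the file in a fresh dictionary rather than in the module's own global namespace.

```python
import os


def run_user_rc(environ=None, filename=".pythonrc.py"):
    """Execute the user's rc file and return the namespace it ran in.

    Hypothetical helper for illustration only; not the real ``user`` module.
    """
    if environ is None:
        environ = os.environ
    # HOME names the home directory; fall back to the current directory.
    home = environ.get("HOME", os.curdir)
    path = os.path.join(home, filename)
    namespace = {}
    try:
        f = open(path)
    except IOError:
        # The file cannot be opened, so there is nothing to run.
        return namespace
    try:
        source = f.read()
    finally:
        f.close()
    # Errors raised by the user's code are deliberately not caught here;
    # handling them is left to the caller.
    exec(compile(source, path, "exec"), namespace)
    return namespace
```

A caller could then read an option with something like ``bool(run_user_rc().get("spam_verbose", 0))``, which mirrors the three-argument :func:`getattr` idiom used with the real module.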
+ +A suggestion for programmers who wish to use this mechanism: a simple way to let +users specify options for your package is to have them define variables in their +:file:`.pythonrc.py` file that you test in your module. For example, a module +:mod:`spam` that has a verbosity level can look for a variable +``user.spam_verbose``, as follows:: + + import user + + verbose = bool(getattr(user, "spam_verbose", 0)) + +(The three-argument form of :func:`getattr` is used in case the user has not +defined ``spam_verbose`` in their :file:`.pythonrc.py` file.) + +Programs with extensive customization needs are better off reading a +program-specific customization file. + +Programs with security or privacy concerns should *not* import this module; a +user can easily break into a program by placing arbitrary code in the +:file:`.pythonrc.py` file. + +Modules for general use should *not* import this module; it may interfere with +the operation of the importing program. + + +.. seealso:: + + Module :mod:`site` + Site-wide customization mechanism. + diff --git a/Doc/library/userdict.rst b/Doc/library/userdict.rst new file mode 100644 index 0000000..11d46ed --- /dev/null +++ b/Doc/library/userdict.rst @@ -0,0 +1,188 @@ + +:mod:`UserDict` --- Class wrapper for dictionary objects +======================================================== + +.. module:: UserDict + :synopsis: Class wrapper for dictionary objects. + + +The module defines a mixin, :class:`DictMixin`, defining all dictionary methods +for classes that already have a minimum mapping interface. This greatly +simplifies writing classes that need to be substitutable for dictionaries (such +as the shelve module). + +This module also defines a class, :class:`UserDict`, that acts as a wrapper +around dictionary objects. The need for this class has been largely supplanted +by the ability to subclass directly from :class:`dict` (a feature that became +available starting with Python version 2.2).
Prior to the introduction of +:class:`dict`, the :class:`UserDict` class was used to create dictionary-like +sub-classes that obtained new behaviors by overriding existing methods or adding +new ones. + +The :mod:`UserDict` module defines the :class:`UserDict` class and +:class:`DictMixin`: + + +.. class:: UserDict([initialdata]) + + Class that simulates a dictionary. The instance's contents are kept in a + regular dictionary, which is accessible via the :attr:`data` attribute of + :class:`UserDict` instances. If *initialdata* is provided, :attr:`data` is + initialized with its contents; note that a reference to *initialdata* will not + be kept, allowing it to be used for other purposes. + + .. note:: + + For backward compatibility, instances of :class:`UserDict` are not iterable. + + +.. class:: IterableUserDict([initialdata]) + + Subclass of :class:`UserDict` that supports direct iteration (e.g. ``for key in + myDict``). + +In addition to supporting the methods and operations of mappings (see section +:ref:`typesmapping`), :class:`UserDict` and :class:`IterableUserDict` instances +provide the following attribute: + + +.. attribute:: IterableUserDict.data + + A real dictionary used to store the contents of the :class:`UserDict` class. + + +.. class:: DictMixin() + + Mixin defining all dictionary methods for classes that already have a minimum + dictionary interface including :meth:`__getitem__`, :meth:`__setitem__`, + :meth:`__delitem__`, and :meth:`keys`. + + This mixin should be used as a superclass. Adding each of the above methods + adds progressively more functionality. For instance, defining all but + :meth:`__delitem__` will preclude only :meth:`pop` and :meth:`popitem` from the + full interface. + + In addition to the four base methods, progressively more efficiency comes with + defining :meth:`__contains__`, :meth:`__iter__`, and :meth:`iteritems`.
+ + Since the mixin has no knowledge of the subclass constructor, it does not define + :meth:`__init__` or :meth:`copy`. + + +:mod:`UserList` --- Class wrapper for list objects +================================================== + +.. module:: UserList + :synopsis: Class wrapper for list objects. + + +.. note:: + + This module is available for backward compatibility only. If you are writing + code that does not need to work with versions of Python earlier than Python 2.2, + please consider subclassing directly from the built-in :class:`list` type. + +This module defines a class that acts as a wrapper around list objects. It is a +useful base class for your own list-like classes, which can inherit from it +and override existing methods or add new ones. In this way one can add new +behaviors to lists. + +The :mod:`UserList` module defines the :class:`UserList` class: + + +.. class:: UserList([list]) + + Class that simulates a list. The instance's contents are kept in a regular + list, which is accessible via the :attr:`data` attribute of :class:`UserList` + instances. The instance's contents are initially set to a copy of *list*, + defaulting to the empty list ``[]``. *list* can be any iterable, e.g. a + real Python list or a :class:`UserList` object. + +In addition to supporting the methods and operations of mutable sequences (see +section :ref:`typesseq`), :class:`UserList` instances provide the following +attribute: + + +.. attribute:: UserList.data + + A real Python list object used to store the contents of the :class:`UserList` + class. + +**Subclassing requirements:** Subclasses of :class:`UserList` are expected to +offer a constructor which can be called with either no arguments or one +argument. List operations which return a new sequence attempt to create an +instance of the actual implementation class. To do so, they assume that the +constructor can be called with a single parameter, which is a sequence object +used as a data source.
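The subclassing requirement can be illustrated with a minimal sketch. It assumes a Python 3 interpreter, where the class is importable as ``collections.UserList`` rather than from the ``UserList`` module described here; ``TaggedList`` and its ``tag`` parameter are hypothetical names used only for illustration.

```python
from collections import UserList  # assumption: Python 3 location of the class


class TaggedList(UserList):
    """Hypothetical list-like class that carries a label."""

    def __init__(self, initlist=None, tag="untagged"):
        # The constructor stays callable with no arguments or with a single
        # sequence argument, which is what the list operations rely on.
        super().__init__(initlist)
        self.tag = tag


items = TaggedList([3, 1, 2], tag="demo")

# Concatenation builds its result by calling the subclass constructor with
# one sequence argument, so the result is a TaggedList with the default tag.
joined = items + [4]
print(type(joined).__name__, joined.data, joined.tag)
```

Because the extra ``tag`` parameter has a default, operations that create new instances keep working; a constructor with additional required parameters would break them.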
+ +If a derived class does not wish to comply with this requirement, all of the +special methods supported by this class will need to be overridden; please +consult the sources for information about the methods which need to be provided +in that case. + +.. versionchanged:: 2.0 + Python versions 1.5.2 and 1.6 also required that the constructor be callable + with no parameters, and offer a mutable :attr:`data` attribute. Earlier + versions of Python did not attempt to create instances of the derived class. + + +:mod:`UserString` --- Class wrapper for string objects +====================================================== + +.. module:: UserString + :synopsis: Class wrapper for string objects. +.. moduleauthor:: Peter Funk <pf@artcom-gmbh.de> +.. sectionauthor:: Peter Funk <pf@artcom-gmbh.de> + + +.. note:: + + The :class:`UserString` class from this module is available for backward + compatibility only. If you are writing code that does not need to work with + versions of Python earlier than Python 2.2, please consider subclassing directly + from the built-in :class:`str` type instead of using :class:`UserString` (there + is no built-in equivalent to :class:`MutableString`). + +This module defines a class that acts as a wrapper around string objects. It is +a useful base class for your own string-like classes, which can inherit from +it and override existing methods or add new ones. In this way one can add new +behaviors to strings. + +It should be noted that these classes are highly inefficient compared to real +string or Unicode objects; this is especially the case for +:class:`MutableString`. + +The :mod:`UserString` module defines the following classes: + + +.. class:: UserString([sequence]) + + Class that simulates a string or a Unicode string object. The instance's + content is kept in a regular string or Unicode string object, which is + accessible via the :attr:`data` attribute of :class:`UserString` instances.
The + instance's contents are initially set to a copy of *sequence*. *sequence* can + be either a regular Python string or Unicode string, an instance of + :class:`UserString` (or a subclass) or an arbitrary sequence which can be + converted into a string using the built-in :func:`str` function. + + +.. class:: MutableString([sequence]) + + This class is derived from the :class:`UserString` above and redefines strings + to be *mutable*. Mutable strings can't be used as dictionary keys, because + dictionaries require *immutable* objects as keys. The main intention of this + class is to serve as an educational example for inheritance and necessity to + remove (override) the :meth:`__hash__` method in order to trap attempts to use a + mutable object as dictionary key, which would be otherwise very error prone and + hard to track down. + +In addition to supporting the methods and operations of string and Unicode +objects (see section :ref:`string-methods`), :class:`UserString` instances +provide the following attribute: + + +.. attribute:: MutableString.data + + A real Python string or Unicode object used to store the content of the + :class:`UserString` class. + diff --git a/Doc/library/uu.rst b/Doc/library/uu.rst new file mode 100644 index 0000000..e2303c3 --- /dev/null +++ b/Doc/library/uu.rst @@ -0,0 +1,60 @@ + +:mod:`uu` --- Encode and decode uuencode files +============================================== + +.. module:: uu + :synopsis: Encode and decode files in uuencode format. +.. moduleauthor:: Lance Ellinghouse + + +This module encodes and decodes files in uuencode format, allowing arbitrary +binary data to be transferred over ASCII-only connections. Wherever a file +argument is expected, the methods accept a file-like object. For backwards +compatibility, a string containing a pathname is also accepted, and the +corresponding file will be opened for reading and writing; the pathname ``'-'`` +is understood to mean the standard input or output. 
However, this interface is +deprecated; it's better for the caller to open the file itself, and be sure +that, when required, the mode is ``'rb'`` or ``'wb'`` on Windows. + +.. index:: + single: Jansen, Jack + single: Ellinghouse, Lance + +This code was contributed by Lance Ellinghouse, and modified by Jack Jansen. + +The :mod:`uu` module defines the following functions: + + +.. function:: encode(in_file, out_file[, name[, mode]]) + + Uuencode file *in_file* into file *out_file*. The uuencoded file will have the + header specifying *name* and *mode* as the defaults for the results of decoding + the file. The default defaults are taken from *in_file*, or ``'-'`` and ``0666`` + respectively. + + +.. function:: decode(in_file[, out_file[, mode[, quiet]]]) + + This call decodes uuencoded file *in_file* placing the result on file + *out_file*. If *out_file* is a pathname, *mode* is used to set the permission + bits if the file must be created. Defaults for *out_file* and *mode* are taken + from the uuencode header. However, if the file specified in the header already + exists, a :exc:`uu.Error` is raised. + + :func:`decode` may print a warning to standard error if the input was produced + by an incorrect uuencoder and Python could recover from that error. Setting + *quiet* to a true value silences this warning. + + +.. exception:: Error() + + Subclass of :exc:`Exception`, this can be raised by :func:`uu.decode` under + various situations, such as described above, but also including a badly + formatted header, or truncated input file. + + +.. seealso:: + + Module :mod:`binascii` + Support module containing ASCII-to-binary and binary-to-ASCII conversions. + diff --git a/Doc/library/uuid.rst b/Doc/library/uuid.rst new file mode 100644 index 0000000..dd52638 --- /dev/null +++ b/Doc/library/uuid.rst @@ -0,0 +1,258 @@ + +:mod:`uuid` --- UUID objects according to RFC 4122 +================================================== + +.. 
module:: uuid + :synopsis: UUID objects (universally unique identifiers) according to RFC 4122 +.. moduleauthor:: Ka-Ping Yee <ping@zesty.ca> +.. sectionauthor:: George Yoshida <quiver@users.sourceforge.net> + + +.. versionadded:: 2.5 + +This module provides immutable :class:`UUID` objects (the :class:`UUID` class) +and the functions :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, :func:`uuid5` for +generating version 1, 3, 4, and 5 UUIDs as specified in :rfc:`4122`. + +If all you want is a unique ID, you should probably call :func:`uuid1` or +:func:`uuid4`. Note that :func:`uuid1` may compromise privacy since it creates +a UUID containing the computer's network address. :func:`uuid4` creates a +random UUID. + + +.. class:: UUID([hex[, bytes[, bytes_le[, fields[, int[, version]]]]]]) + + Create a UUID from either a string of 32 hexadecimal digits, a string of 16 + bytes as the *bytes* argument, a string of 16 bytes in little-endian order as + the *bytes_le* argument, a tuple of six integers (32-bit *time_low*, 16-bit + *time_mid*, 16-bit *time_hi_version*, 8-bit *clock_seq_hi_variant*, 8-bit + *clock_seq_low*, 48-bit *node*) as the *fields* argument, or a single 128-bit + integer as the *int* argument. When a string of hex digits is given, curly + braces, hyphens, and a URN prefix are all optional. For example, these + expressions all yield the same UUID:: + + UUID('{12345678-1234-5678-1234-567812345678}') + UUID('12345678123456781234567812345678') + UUID('urn:uuid:12345678-1234-5678-1234-567812345678') + UUID(bytes='\x12\x34\x56\x78'*4) + UUID(bytes_le='\x78\x56\x34\x12\x34\x12\x78\x56' + + '\x12\x34\x56\x78\x12\x34\x56\x78') + UUID(fields=(0x12345678, 0x1234, 0x5678, 0x12, 0x34, 0x567812345678)) + UUID(int=0x12345678123456781234567812345678) + + Exactly one of *hex*, *bytes*, *bytes_le*, *fields*, or *int* must be given. 
+ The *version* argument is optional; if given, the resulting UUID will have its + variant and version number set according to RFC 4122, overriding bits in the + given *hex*, *bytes*, *bytes_le*, *fields*, or *int*. + +:class:`UUID` instances have these read-only attributes: + + +.. attribute:: UUID.bytes + + The UUID as a 16-byte string (containing the six integer fields in big-endian + byte order). + + +.. attribute:: UUID.bytes_le + + The UUID as a 16-byte string (with *time_low*, *time_mid*, and *time_hi_version* + in little-endian byte order). + + +.. attribute:: UUID.fields + + A tuple of the six integer fields of the UUID, which are also available as six + individual attributes and two derived attributes: + + +------------------------------+-------------------------------+ + | Field | Meaning | + +==============================+===============================+ + | :attr:`time_low` | the first 32 bits of the UUID | + +------------------------------+-------------------------------+ + | :attr:`time_mid` | the next 16 bits of the UUID | + +------------------------------+-------------------------------+ + | :attr:`time_hi_version` | the next 16 bits of the UUID | + +------------------------------+-------------------------------+ + | :attr:`clock_seq_hi_variant` | the next 8 bits of the UUID | + +------------------------------+-------------------------------+ + | :attr:`clock_seq_low` | the next 8 bits of the UUID | + +------------------------------+-------------------------------+ + | :attr:`node` | the last 48 bits of the UUID | + +------------------------------+-------------------------------+ + | :attr:`time` | the 60-bit timestamp | + +------------------------------+-------------------------------+ + | :attr:`clock_seq` | the 14-bit sequence number | + +------------------------------+-------------------------------+ + + +.. attribute:: UUID.hex + + The UUID as a 32-character hexadecimal string. + + +.. attribute:: UUID.int + + The UUID as a 128-bit integer. 
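As a rough sketch of how the constructor forms and the read-only attributes described above relate to each other, assuming a Python 3 interpreter (where *bytes* and *bytes_le* values are ``bytes`` objects rather than the 8-bit strings shown in this text); the variable names are illustrative only.

```python
import uuid

# Build the same UUID from three of the accepted input forms.
from_hex = uuid.UUID('{12345678-1234-5678-1234-567812345678}')
from_int = uuid.UUID(int=0x12345678123456781234567812345678)
from_fields = uuid.UUID(
    fields=(0x12345678, 0x1234, 0x5678, 0x12, 0x34, 0x567812345678))

# The three objects compare equal.
all_equal = from_hex == from_int == from_fields

# The byte-oriented attributes can be fed back into the constructor.
round_trip_be = uuid.UUID(bytes=from_hex.bytes) == from_hex
round_trip_le = uuid.UUID(bytes_le=from_hex.bytes_le) == from_hex

print(from_hex.hex)     # 32 hexadecimal digits, no hyphens
print(from_hex.fields)  # the six integer fields as a tuple
```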
+ + +.. attribute:: UUID.urn + + The UUID as a URN as specified in RFC 4122. + + +.. attribute:: UUID.variant + + The UUID variant, which determines the internal layout of the UUID. This will be + one of the integer constants :const:`RESERVED_NCS`, :const:`RFC_4122`, + :const:`RESERVED_MICROSOFT`, or :const:`RESERVED_FUTURE`. + + +.. attribute:: UUID.version + + The UUID version number (1 through 5, meaningful only when the variant is + :const:`RFC_4122`). + +The :mod:`uuid` module defines the following functions: + + +.. function:: getnode() + + Get the hardware address as a 48-bit positive integer. The first time this + runs, it may launch a separate program, which could be quite slow. If all + attempts to obtain the hardware address fail, we choose a random 48-bit number + with its eighth bit set to 1 as recommended in RFC 4122. "Hardware address" + means the MAC address of a network interface, and on a machine with multiple + network interfaces the MAC address of any one of them may be returned. + +.. index:: single: getnode + + +.. function:: uuid1([node[, clock_seq]]) + + Generate a UUID from a host ID, sequence number, and the current time. If *node* + is not given, :func:`getnode` is used to obtain the hardware address. If + *clock_seq* is given, it is used as the sequence number; otherwise a random + 14-bit sequence number is chosen. + +.. index:: single: uuid1 + + +.. function:: uuid3(namespace, name) + + Generate a UUID based on the MD5 hash of a namespace identifier (which is a + UUID) and a name (which is a string). + +.. index:: single: uuid3 + + +.. function:: uuid4() + + Generate a random UUID. + +.. index:: single: uuid4 + + +.. function:: uuid5(namespace, name) + + Generate a UUID based on the SHA-1 hash of a namespace identifier (which is a + UUID) and a name (which is a string). + +.. index:: single: uuid5 + +The :mod:`uuid` module defines the following namespace identifiers for use with +:func:`uuid3` or :func:`uuid5`. + + +.. 
data:: NAMESPACE_DNS + + When this namespace is specified, the *name* string is a fully-qualified domain + name. + + +.. data:: NAMESPACE_URL + + When this namespace is specified, the *name* string is a URL. + + +.. data:: NAMESPACE_OID + + When this namespace is specified, the *name* string is an ISO OID. + + +.. data:: NAMESPACE_X500 + + When this namespace is specified, the *name* string is an X.500 DN in DER or a + text output format. + +The :mod:`uuid` module defines the following constants for the possible values +of the :attr:`variant` attribute: + + +.. data:: RESERVED_NCS + + Reserved for NCS compatibility. + + +.. data:: RFC_4122 + + Specifies the UUID layout given in :rfc:`4122`. + + +.. data:: RESERVED_MICROSOFT + + Reserved for Microsoft compatibility. + + +.. data:: RESERVED_FUTURE + + Reserved for future definition. + + +.. seealso:: + + :rfc:`4122` - A Universally Unique IDentifier (UUID) URN Namespace + This specification defines a Uniform Resource Name namespace for UUIDs, the + internal format of UUIDs, and methods of generating UUIDs. + + +.. 
_uuid-example: + +Example +------- + +Here are some examples of typical usage of the :mod:`uuid` module:: + + >>> import uuid + + # make a UUID based on the host ID and current time + >>> uuid.uuid1() + UUID('a8098c1a-f86e-11da-bd1a-00112444be1e') + + # make a UUID using an MD5 hash of a namespace UUID and a name + >>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org') + UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e') + + # make a random UUID + >>> uuid.uuid4() + UUID('16fd2706-8baf-433b-82eb-8c7fada847da') + + # make a UUID using a SHA-1 hash of a namespace UUID and a name + >>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org') + UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d') + + # make a UUID from a string of hex digits (braces and hyphens ignored) + >>> x = uuid.UUID('{00010203-0405-0607-0809-0a0b0c0d0e0f}') + + # convert a UUID to a string of hex digits in standard form + >>> str(x) + '00010203-0405-0607-0809-0a0b0c0d0e0f' + + # get the raw 16 bytes of the UUID + >>> x.bytes + '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f' + + # make a UUID from a 16-byte string + >>> uuid.UUID(bytes=x.bytes) + UUID('00010203-0405-0607-0809-0a0b0c0d0e0f') + diff --git a/Doc/library/warnings.rst b/Doc/library/warnings.rst new file mode 100644 index 0000000..35e9888 --- /dev/null +++ b/Doc/library/warnings.rst @@ -0,0 +1,242 @@ + +:mod:`warnings` --- Warning control +=================================== + +.. index:: single: warnings + +.. module:: warnings + :synopsis: Issue warning messages and control their disposition. + + +.. versionadded:: 2.1 + +Warning messages are typically issued in situations where it is useful to alert +the user of some condition in a program, where that condition (normally) doesn't +warrant raising an exception and terminating the program. For example, one +might want to issue a warning when a program uses an obsolete module. + +Python programmers issue warnings by calling the :func:`warn` function defined +in this module. 
(C programmers use :cfunc:`PyErr_WarnEx`; see +:ref:`exceptionhandling` for details). + +Warning messages are normally written to ``sys.stderr``, but their disposition +can be changed flexibly, from ignoring all warnings to turning them into +exceptions. The disposition of warnings can vary based on the warning category +(see below), the text of the warning message, and the source location where it +is issued. Repetitions of a particular warning for the same source location are +typically suppressed. + +There are two stages in warning control: first, each time a warning is issued, a +determination is made whether a message should be issued or not; next, if a +message is to be issued, it is formatted and printed using a user-settable hook. + +The determination whether to issue a warning message is controlled by the +warning filter, which is a sequence of matching rules and actions. Rules can be +added to the filter by calling :func:`filterwarnings` and reset to its default +state by calling :func:`resetwarnings`. + +The printing of warning messages is done by calling :func:`showwarning`, which +may be overridden; the default implementation of this function formats the +message by calling :func:`formatwarning`, which is also available for use by +custom implementations. + + +.. _warning-categories: + +Warning Categories +------------------ + +There are a number of built-in exceptions that represent warning categories. +This categorization is useful to be able to filter out groups of warnings. The +following warnings category classes are currently defined: + ++----------------------------------+-----------------------------------------------+ +| Class | Description | ++==================================+===============================================+ +| :exc:`Warning` | This is the base class of all warning | +| | category classes. It is a subclass of | +| | :exc:`Exception`. 
| ++----------------------------------+-----------------------------------------------+ +| :exc:`UserWarning` | The default category for :func:`warn`. | ++----------------------------------+-----------------------------------------------+ +| :exc:`DeprecationWarning` | Base category for warnings about deprecated | +| | features. | ++----------------------------------+-----------------------------------------------+ +| :exc:`SyntaxWarning` | Base category for warnings about dubious | +| | syntactic features. | ++----------------------------------+-----------------------------------------------+ +| :exc:`RuntimeWarning` | Base category for warnings about dubious | +| | runtime features. | ++----------------------------------+-----------------------------------------------+ +| :exc:`FutureWarning` | Base category for warnings about constructs | +| | that will change semantically in the future. | ++----------------------------------+-----------------------------------------------+ +| :exc:`PendingDeprecationWarning` | Base category for warnings about features | +| | that will be deprecated in the future | +| | (ignored by default). | ++----------------------------------+-----------------------------------------------+ +| :exc:`ImportWarning` | Base category for warnings triggered during | +| | the process of importing a module (ignored by | +| | default). | ++----------------------------------+-----------------------------------------------+ +| :exc:`UnicodeWarning` | Base category for warnings related to | +| | Unicode. | ++----------------------------------+-----------------------------------------------+ + +While these are technically built-in exceptions, they are documented here, +because conceptually they belong to the warnings mechanism. + +User code can define additional warning categories by subclassing one of the +standard warning categories. A warning category must always be a subclass of +the :exc:`Warning` class. + + +.. 
_warning-filter: + +The Warnings Filter +------------------- + +The warnings filter controls whether warnings are ignored, displayed, or turned +into errors (raising an exception). + +Conceptually, the warnings filter maintains an ordered list of filter +specifications; any specific warning is matched against each filter +specification in the list in turn until a match is found; the match determines +the disposition of the match. Each entry is a tuple of the form (*action*, +*message*, *category*, *module*, *lineno*), where: + +* *action* is one of the following strings: + + +---------------+----------------------------------------------+ + | Value | Disposition | + +===============+==============================================+ + | ``"error"`` | turn matching warnings into exceptions | + +---------------+----------------------------------------------+ + | ``"ignore"`` | never print matching warnings | + +---------------+----------------------------------------------+ + | ``"always"`` | always print matching warnings | + +---------------+----------------------------------------------+ + | ``"default"`` | print the first occurrence of matching | + | | warnings for each location where the warning | + | | is issued | + +---------------+----------------------------------------------+ + | ``"module"`` | print the first occurrence of matching | + | | warnings for each module where the warning | + | | is issued | + +---------------+----------------------------------------------+ + | ``"once"`` | print only the first occurrence of matching | + | | warnings, regardless of location | + +---------------+----------------------------------------------+ + +* *message* is a string containing a regular expression that the warning message + must match (the match is compiled to always be case-insensitive) + +* *category* is a class (a subclass of :exc:`Warning`) of which the warning + category must be a subclass in order to match + +* *module* is a string containing a regular 
expression that the module name must + match (the match is compiled to be case-sensitive) + +* *lineno* is an integer that the line number where the warning occurred must + match, or ``0`` to match all line numbers + +Since the :exc:`Warning` class is derived from the built-in :exc:`Exception` +class, to turn a warning into an error we simply raise ``category(message)``. + +The warnings filter is initialized by :option:`-W` options passed to the Python +interpreter command line. The interpreter saves the arguments for all +:option:`-W` options without interpretation in ``sys.warnoptions``; the +:mod:`warnings` module parses these when it is first imported (invalid options +are ignored, after printing a message to ``sys.stderr``). + +The warnings that are ignored by default may be enabled by passing :option:`-Wd` +to the interpreter. This enables default handling for all warnings, including +those that are normally ignored by default. This is particularly useful for +enabling ImportWarning when debugging problems importing a developed package. +ImportWarning can also be enabled explicitly in Python code using:: + + warnings.simplefilter('default', ImportWarning) + + +.. _warning-functions: + +Available Functions +------------------- + + +.. function:: warn(message[, category[, stacklevel]]) + + Issue a warning, or maybe ignore it or raise an exception. The *category* + argument, if given, must be a warning category class (see above); it defaults to + :exc:`UserWarning`. Alternatively *message* can be a :exc:`Warning` instance, + in which case *category* will be ignored and ``message.__class__`` will be used. + In this case the message text will be ``str(message)``. This function raises an + exception if the particular warning issued is changed into an error by the + warnings filter (see above).
The *stacklevel* argument can be used by wrapper + functions written in Python, like this:: + + def deprecation(message): + warnings.warn(message, DeprecationWarning, stacklevel=2) + + This makes the warning refer to :func:`deprecation`'s caller, rather than to the + source of :func:`deprecation` itself (since the latter would defeat the purpose + of the warning message). + + +.. function:: warn_explicit(message, category, filename, lineno[, module[, registry[, module_globals]]]) + + This is a low-level interface to the functionality of :func:`warn`, passing in + explicitly the message, category, filename and line number, and optionally the + module name and the registry (which should be the ``__warningregistry__`` + dictionary of the module). The module name defaults to the filename with + ``.py`` stripped; if no registry is passed, the warning is never suppressed. + *message* must be a string and *category* a subclass of :exc:`Warning` or + *message* may be a :exc:`Warning` instance, in which case *category* will be + ignored. + + *module_globals*, if supplied, should be the global namespace in use by the code + for which the warning is issued. (This argument is used to support displaying + source for modules found in zipfiles or other non-filesystem import sources, and + was added in Python 2.5.) + + +.. function:: showwarning(message, category, filename, lineno[, file]) + + Write a warning to a file. The default implementation calls + ``formatwarning(message, category, filename, lineno)`` and writes the resulting + string to *file*, which defaults to ``sys.stderr``. You may replace this + function with an alternative implementation by assigning to + ``warnings.showwarning``. + + +.. function:: formatwarning(message, category, filename, lineno) + + Format a warning the standard way. This returns a string which may contain + embedded newlines and ends in a newline. + + +.. 
function:: filterwarnings(action[, message[, category[, module[, lineno[, append]]]]]) + + Insert an entry into the list of warnings filters. The entry is inserted at the + front by default; if *append* is true, it is inserted at the end. This checks + the types of the arguments, compiles the message and module regular expressions, + and inserts them as a tuple in the list of warnings filters. Entries closer to + the front of the list override entries later in the list, if both match a + particular warning. Omitted arguments default to a value that matches + everything. + + +.. function:: simplefilter(action[, category[, lineno[, append]]]) + + Insert a simple entry into the list of warnings filters. The meaning of the + function parameters is as for :func:`filterwarnings`, but regular expressions + are not needed as the filter inserted always matches any message in any module + as long as the category and line number match. + + +.. function:: resetwarnings() + + Reset the warnings filter. This discards the effect of all previous calls to + :func:`filterwarnings`, including that of the :option:`-W` command line options + and calls to :func:`simplefilter`. + diff --git a/Doc/library/wave.rst b/Doc/library/wave.rst new file mode 100644 index 0000000..d03f091 --- /dev/null +++ b/Doc/library/wave.rst @@ -0,0 +1,201 @@ +.. % Documentations stolen and LaTeX'ed from comments in file. + + +:mod:`wave` --- Read and write WAV files +======================================== + +.. module:: wave + :synopsis: Provide an interface to the WAV sound format. +.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il> + + +The :mod:`wave` module provides a convenient interface to the WAV sound format. +It does not support compression/decompression, but it does support mono/stereo. + +The :mod:`wave` module defines the following function and exception: + + +.. 
function:: open(file[, mode]) + + If *file* is a string, open the file by that name, otherwise treat it as a seekable + file-like object. *mode* can be any of + + ``'r'``, ``'rb'`` + Read only mode. + + ``'w'``, ``'wb'`` + Write only mode. + + Note that WAV files cannot be opened for both reading and writing. + + A *mode* of ``'r'`` or ``'rb'`` returns a :class:`Wave_read` object, while a + *mode* of ``'w'`` or ``'wb'`` returns a :class:`Wave_write` object. If *mode* + is omitted and a file-like object is passed as *file*, ``file.mode`` is used as + the default value for *mode* (the ``'b'`` flag is still added if necessary). + + +.. function:: openfp(file, mode) + + A synonym for :func:`open`, maintained for backwards compatibility. + + +.. exception:: Error + + An error raised when something is impossible because it violates the WAV + specification or hits an implementation deficiency. + + +.. _wave-read-objects: + +Wave_read Objects +----------------- + +Wave_read objects, as returned by :func:`open`, have the following methods: + + +.. method:: Wave_read.close() + + Close the stream, and make the instance unusable. This is called automatically + on object collection. + + +.. method:: Wave_read.getnchannels() + + Returns number of audio channels (``1`` for mono, ``2`` for stereo). + + +.. method:: Wave_read.getsampwidth() + + Returns sample width in bytes. + + +.. method:: Wave_read.getframerate() + + Returns sampling frequency. + + +.. method:: Wave_read.getnframes() + + Returns number of audio frames. + + +.. method:: Wave_read.getcomptype() + + Returns compression type (``'NONE'`` is the only supported type). + + +.. method:: Wave_read.getcompname() + + Human-readable version of :meth:`getcomptype`. Usually ``'not compressed'`` + parallels ``'NONE'``. + + +.. method:: Wave_read.getparams() + + Returns a tuple ``(nchannels, sampwidth, framerate, nframes, comptype, + compname)``, equivalent to output of the :meth:`get\*` methods. + + +.. 
method:: Wave_read.readframes(n) + + Reads and returns at most *n* frames of audio, as a string of bytes. + + +.. method:: Wave_read.rewind() + + Rewind the file pointer to the beginning of the audio stream. + +The following two methods are defined for compatibility with the :mod:`aifc` +module, and don't do anything interesting. + + +.. method:: Wave_read.getmarkers() + + Returns ``None``. + + +.. method:: Wave_read.getmark(id) + + Raise an error. + +The following two methods define a term "position" which is compatible between +them, and is otherwise implementation dependent. + + +.. method:: Wave_read.setpos(pos) + + Set the file pointer to the specified position. + + +.. method:: Wave_read.tell() + + Return current file pointer position. + + +.. _wave-write-objects: + +Wave_write Objects +------------------ + +Wave_write objects, as returned by :func:`open`, have the following methods: + + +.. method:: Wave_write.close() + + Make sure *nframes* is correct, and close the file. This method is called upon + deletion. + + +.. method:: Wave_write.setnchannels(n) + + Set the number of channels. + + +.. method:: Wave_write.setsampwidth(n) + + Set the sample width to *n* bytes. + + +.. method:: Wave_write.setframerate(n) + + Set the frame rate to *n*. + + +.. method:: Wave_write.setnframes(n) + + Set the number of frames to *n*. This will be changed later if more frames are + written. + + +.. method:: Wave_write.setcomptype(type, name) + + Set the compression type and description. At the moment, only compression type + ``NONE`` is supported, meaning no compression. + + +.. method:: Wave_write.setparams(tuple) + + The *tuple* should be ``(nchannels, sampwidth, framerate, nframes, comptype, + compname)``, with values valid for the :meth:`set\*` methods. Sets all + parameters. + + +.. method:: Wave_write.tell() + + Return current position in the file, with the same disclaimer for the + :meth:`Wave_read.tell` and :meth:`Wave_read.setpos` methods. + + +.. 
method:: Wave_write.writeframesraw(data) + + Write audio frames, without correcting *nframes*. + + +.. method:: Wave_write.writeframes(data) + + Write audio frames and make sure *nframes* is correct. + +Note that it is invalid to set any parameters after calling :meth:`writeframes` +or :meth:`writeframesraw`, and any attempt to do so will raise +:exc:`wave.Error`. + diff --git a/Doc/library/weakref.rst b/Doc/library/weakref.rst new file mode 100644 index 0000000..c5857ba --- /dev/null +++ b/Doc/library/weakref.rst @@ -0,0 +1,330 @@ + +:mod:`weakref` --- Weak references +================================== + +.. module:: weakref + :synopsis: Support for weak references and weak dictionaries. +.. moduleauthor:: Fred L. Drake, Jr. <fdrake@acm.org> +.. moduleauthor:: Neil Schemenauer <nas@arctrix.com> +.. moduleauthor:: Martin von Löwis <martin@loewis.home.cs.tu-berlin.de> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. versionadded:: 2.1 + +The :mod:`weakref` module allows the Python programmer to create :dfn:`weak +references` to objects. + +.. % When making changes to the examples in this file, be sure to update +.. % Lib/test/test_weakref.py::libreftest too! + +In the following, the term :dfn:`referent` means the object which is referred to +by a weak reference. + +A weak reference to an object is not enough to keep the object alive: when the +only remaining references to a referent are weak references, garbage collection +is free to destroy the referent and reuse its memory for something else. A +primary use for weak references is to implement caches or mappings holding large +objects, where it's desired that a large object not be kept alive solely because +it appears in a cache or mapping. For example, if you have a number of large +binary image objects, you may wish to associate a name with each. 
If you used a +Python dictionary to map names to images, or images to names, the image objects +would remain alive just because they appeared as values or keys in the +dictionaries. The :class:`WeakKeyDictionary` and :class:`WeakValueDictionary` +classes supplied by the :mod:`weakref` module are an alternative, using weak +references to construct mappings that don't keep objects alive solely because +they appear in the mapping objects. If, for example, an image object is a value +in a :class:`WeakValueDictionary`, then when the last remaining references to +that image object are the weak references held by weak mappings, garbage +collection can reclaim the object, and its corresponding entries in weak +mappings are simply deleted. + +:class:`WeakKeyDictionary` and :class:`WeakValueDictionary` use weak references +in their implementation, setting up callback functions on the weak references +that notify the weak dictionaries when a key or value has been reclaimed by +garbage collection. Most programs should find that using one of these weak +dictionary types is all they need -- it's not usually necessary to create your +own weak references directly. The low-level machinery used by the weak +dictionary implementations is exposed by the :mod:`weakref` module for the +benefit of advanced uses. + +Not all objects can be weakly referenced; those objects which can include class +instances, functions written in Python (but not in C), methods (both bound and +unbound), sets, frozensets, file objects, generators, type objects, DBcursor +objects from the :mod:`bsddb` module, sockets, arrays, deques, and regular +expression pattern objects. + +.. versionchanged:: 2.4 + Added support for files, sockets, arrays, and patterns. 
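As a rough illustration of the list above, the following sketch probes a few objects to see whether a weak reference can be created for them. The names ``Node``, ``helper`` and ``can_weakref`` are hypothetical and exist only for this example; the results shown by the probe reflect CPython behaviour and may differ on other implementations.

```python
import weakref


class Node:
    """Ordinary class; its instances accept weak references."""


def helper():
    """A function written in Python (not in C)."""


def can_weakref(obj):
    """Return True if weakref.ref() accepts obj, False if it raises TypeError."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


# Probe a few representative objects.
results = {
    "instance": can_weakref(Node()),
    "function": can_weakref(helper),
    "set": can_weakref(set()),
    "list": can_weakref([]),
    "dict": can_weakref({}),
}

for kind, ok in results.items():
    print(kind, "supports weak references" if ok else "does not support weak references")
```

Catching :exc:`TypeError` is used here because that is what :func:`weakref.ref` raises for an unsupported type; the probe does not keep the reference it creates.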
+ +Several builtin types such as :class:`list` and :class:`dict` do not directly +support weak references but can add support through subclassing:: + + class Dict(dict): + pass + + obj = Dict(red=1, green=2, blue=3) # this object is weak referencable + +Extension types can easily be made to support weak references; see +:ref:`weakref-support`. + + +.. class:: ref(object[, callback]) + + Return a weak reference to *object*. The original object can be retrieved by + calling the reference object if the referent is still alive; if the referent is + no longer alive, calling the reference object will cause :const:`None` to be + returned. If *callback* is provided and not :const:`None`, and the returned + weakref object is still alive, the callback will be called when the object is + about to be finalized; the weak reference object will be passed as the only + parameter to the callback; the referent will no longer be available. + + It is allowable for many weak references to be constructed for the same object. + Callbacks registered for each weak reference will be called from the most + recently registered callback to the oldest registered callback. + + Exceptions raised by the callback will be noted on the standard error output, + but cannot be propagated; they are handled in exactly the same way as exceptions + raised from an object's :meth:`__del__` method. + + Weak references are hashable if the *object* is hashable. They will maintain + their hash value even after the *object* was deleted. If :func:`hash` is called + the first time only after the *object* was deleted, the call will raise + :exc:`TypeError`. + + Weak references support tests for equality, but not ordering. If the referents + are still alive, two references have the same equality relationship as their + referents (regardless of the *callback*). If either referent has been deleted, + the references are equal only if the reference objects are the same object. + + .. 
versionchanged:: 2.4 + This is now a subclassable type rather than a factory function; it derives from + :class:`object`. + + +.. function:: proxy(object[, callback]) + + Return a proxy to *object* which uses a weak reference. This supports use of + the proxy in most contexts instead of requiring the explicit dereferencing used + with weak reference objects. The returned object will have a type of either + ``ProxyType`` or ``CallableProxyType``, depending on whether *object* is + callable. Proxy objects are not hashable regardless of the referent; this + avoids a number of problems related to their fundamentally mutable nature, and + prevents their use as dictionary keys. *callback* is the same as the parameter + of the same name to the :func:`ref` function. + + +.. function:: getweakrefcount(object) + + Return the number of weak references and proxies which refer to *object*. + + +.. function:: getweakrefs(object) + + Return a list of all weak reference and proxy objects which refer to *object*. + + +.. class:: WeakKeyDictionary([dict]) + + Mapping class that references keys weakly. Entries in the dictionary will be + discarded when there is no longer a strong reference to the key. This can be + used to associate additional data with an object owned by other parts of an + application without adding attributes to those objects. This can be especially + useful with objects that override attribute accesses. + + .. note:: + + Caution: Because a :class:`WeakKeyDictionary` is built on top of a Python + dictionary, it must not change size when iterating over it. This can be + difficult to ensure for a :class:`WeakKeyDictionary` because actions performed + by the program during iteration may cause items in the dictionary to vanish "by + magic" (as a side effect of garbage collection). + +:class:`WeakKeyDictionary` objects have the following additional methods. These +expose the internal references directly. 
The references are not guaranteed to +be "live" at the time they are used, so the result of calling the references +needs to be checked before being used. This can be used to avoid creating +references that will cause the garbage collector to keep the keys around longer +than needed. + + +.. method:: WeakKeyDictionary.iterkeyrefs() + + Return an iterator that yields the weak references to the keys. + + .. versionadded:: 2.5 + + +.. method:: WeakKeyDictionary.keyrefs() + + Return a list of weak references to the keys. + + .. versionadded:: 2.5 + + +.. class:: WeakValueDictionary([dict]) + + Mapping class that references values weakly. Entries in the dictionary will be + discarded when no strong reference to the value exists any more. + + .. note:: + + Caution: Because a :class:`WeakValueDictionary` is built on top of a Python + dictionary, it must not change size when iterating over it. This can be + difficult to ensure for a :class:`WeakValueDictionary` because actions performed + by the program during iteration may cause items in the dictionary to vanish "by + magic" (as a side effect of garbage collection). + +:class:`WeakValueDictionary` objects have the following additional methods. +These methods have the same issues as the :meth:`iterkeyrefs` and :meth:`keyrefs` +methods of :class:`WeakKeyDictionary` objects. + + +.. method:: WeakValueDictionary.itervaluerefs() + + Return an iterator that yields the weak references to the values. + + .. versionadded:: 2.5 + + +.. method:: WeakValueDictionary.valuerefs() + + Return a list of weak references to the values. + + .. versionadded:: 2.5 + + +.. data:: ReferenceType + + The type object for weak reference objects. + + +.. data:: ProxyType + + The type object for proxies of objects which are not callable. + + +.. data:: CallableProxyType + + The type object for proxies of callable objects. + + +.. data:: ProxyTypes + + Sequence containing all the type objects for proxies. 
This can make it simpler + to test if an object is a proxy without being dependent on naming both proxy + types. + + +.. exception:: ReferenceError + + Exception raised when a proxy object is used but the underlying object has been + collected. This is the same as the standard :exc:`ReferenceError` exception. + + +.. seealso:: + + :pep:`0205` - Weak References + The proposal and rationale for this feature, including links to earlier + implementations and information about similar features in other languages. + + +.. _weakref-objects: + +Weak Reference Objects +---------------------- + +Weak reference objects have no attributes or methods, but do allow the referent +to be obtained, if it still exists, by calling it:: + + >>> import weakref + >>> class Object: + ... pass + ... + >>> o = Object() + >>> r = weakref.ref(o) + >>> o2 = r() + >>> o is o2 + True + +If the referent no longer exists, calling the reference object returns +:const:`None`:: + + >>> del o, o2 + >>> print r() + None + +Testing that a weak reference object is still live should be done using the +expression ``ref() is not None``. Normally, application code that needs to use +a reference object should follow this pattern:: + + # r is a weak reference object + o = r() + if o is None: + # referent has been garbage collected + print "Object has been deallocated; can't frobnicate." + else: + print "Object is still live!" + o.do_something_useful() + +Using a separate test for "liveness" creates race conditions in threaded +applications; another thread can cause a weak reference to become invalidated +before the weak reference is called; the idiom shown above is safe in threaded +applications as well as single-threaded applications. + +Specialized versions of :class:`ref` objects can be created through subclassing. +This is used in the implementation of the :class:`WeakValueDictionary` to reduce +the memory overhead for each entry in the mapping. 
This may be most useful to +associate additional information with a reference, but could also be used to +insert additional processing on calls to retrieve the referent. + +This example shows how a subclass of :class:`ref` can be used to store +additional information about an object and affect the value that's returned when +the referent is accessed:: + + import weakref + + class ExtendedRef(weakref.ref): + def __init__(self, ob, callback=None, **annotations): + super(ExtendedRef, self).__init__(ob, callback) + self.__counter = 0 + for k, v in annotations.iteritems(): + setattr(self, k, v) + + def __call__(self): + """Return a pair containing the referent and the number of + times the reference has been called. + """ + ob = super(ExtendedRef, self).__call__() + if ob is not None: + self.__counter += 1 + ob = (ob, self.__counter) + return ob + + +.. _weakref-example: + +Example +------- + +This simple example shows how an application can use objects IDs to retrieve +objects that it has seen before. The IDs of the objects can then be used in +other data structures without forcing the objects to remain alive, but the +objects can still be retrieved by ID if they do. + +.. % Example contributed by Tim Peters. + +:: + + import weakref + + _id2obj_dict = weakref.WeakValueDictionary() + + def remember(obj): + oid = id(obj) + _id2obj_dict[oid] = obj + return oid + + def id2obj(oid): + return _id2obj_dict[oid] + diff --git a/Doc/library/webbrowser.rst b/Doc/library/webbrowser.rst new file mode 100644 index 0000000..c243f7c --- /dev/null +++ b/Doc/library/webbrowser.rst @@ -0,0 +1,199 @@ + +:mod:`webbrowser` --- Convenient Web-browser controller +======================================================= + +.. module:: webbrowser + :synopsis: Easy-to-use controller for Web browsers. +.. moduleauthor:: Fred L. Drake, Jr. <fdrake@acm.org> +.. sectionauthor:: Fred L. Drake, Jr. 
<fdrake@acm.org> + + +The :mod:`webbrowser` module provides a high-level interface to allow displaying +Web-based documents to users. Under most circumstances, simply calling the +:func:`open` function from this module will do the right thing. + +Under Unix, graphical browsers are preferred under X11, but text-mode browsers +will be used if graphical browsers are not available or an X11 display isn't +available. If text-mode browsers are used, the calling process will block until +the user exits the browser. + +If the environment variable :envvar:`BROWSER` exists, it is interpreted to +override the platform default list of browsers, as a os.pathsep-separated list +of browsers to try in order. When the value of a list part contains the string +``%s``, then it is interpreted as a literal browser command line to be used +with the argument URL substituted for ``%s``; if the part does not contain +``%s``, it is simply interpreted as the name of the browser to launch. + +For non-Unix platforms, or when a remote browser is available on Unix, the +controlling process will not wait for the user to finish with the browser, but +allow the remote browser to maintain its own windows on the display. If remote +browsers are not available on Unix, the controlling process will launch a new +browser and wait. + +The script :program:`webbrowser` can be used as a command-line interface for the +module. It accepts an URL as the argument. It accepts the following optional +parameters: :option:`-n` opens the URL in a new browser window, if possible; +:option:`-t` opens the URL in a new browser page ("tab"). The options are, +naturally, mutually exclusive. + +The following exception is defined: + + +.. exception:: Error + + Exception raised when a browser control error occurs. + +The following functions are defined: + + +.. function:: open(url[, new=0[, autoraise=1]]) + + Display *url* using the default browser. If *new* is 0, the *url* is opened in + the same browser window if possible. 
If *new* is 1, a new browser window is + opened if possible. If *new* is 2, a new browser page ("tab") is opened if + possible. If *autoraise* is true, the window is raised if possible (note that + under many window managers this will occur regardless of the setting of this + variable). + + .. versionchanged:: 2.5 + *new* can now be 2. + + +.. function:: open_new(url) + + Open *url* in a new window of the default browser, if possible, otherwise, open + *url* in the only browser window. + + +.. function:: open_new_tab(url) + + Open *url* in a new page ("tab") of the default browser, if possible, otherwise + equivalent to :func:`open_new`. + + .. versionadded:: 2.5 + + +.. function:: get([name]) + + Return a controller object for the browser type *name*. If *name* is empty, + return a controller for a default browser appropriate to the caller's + environment. + + +.. function:: register(name, constructor[, instance]) + + Register the browser type *name*. Once a browser type is registered, the + :func:`get` function can return a controller for that browser type. If + *instance* is not provided, or is ``None``, *constructor* will be called without + parameters to create an instance when needed. If *instance* is provided, + *constructor* will never be called, and may be ``None``. + + This entry point is only useful if you plan to either set the :envvar:`BROWSER` + variable or call :func:`get` with a nonempty argument matching the name of a + handler you declare. + +A number of browser types are predefined. This table gives the type names that +may be passed to the :func:`get` function and the corresponding instantiations +for the controller classes, all defined in this module. 
+ ++-----------------------+-----------------------------------------+-------+ +| Type Name | Class Name | Notes | ++=======================+=========================================+=======+ +| ``'mozilla'`` | :class:`Mozilla('mozilla')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'firefox'`` | :class:`Mozilla('mozilla')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'netscape'`` | :class:`Mozilla('netscape')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'galeon'`` | :class:`Galeon('galeon')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'epiphany'`` | :class:`Galeon('epiphany')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'skipstone'`` | :class:`BackgroundBrowser('skipstone')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'kfmclient'`` | :class:`Konqueror()` | \(1) | ++-----------------------+-----------------------------------------+-------+ +| ``'konqueror'`` | :class:`Konqueror()` | \(1) | ++-----------------------+-----------------------------------------+-------+ +| ``'kfm'`` | :class:`Konqueror()` | \(1) | ++-----------------------+-----------------------------------------+-------+ +| ``'mosaic'`` | :class:`BackgroundBrowser('mosaic')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'opera'`` | :class:`Opera()` | | ++-----------------------+-----------------------------------------+-------+ +| ``'grail'`` | :class:`Grail()` | | ++-----------------------+-----------------------------------------+-------+ +| ``'links'`` | :class:`GenericBrowser('links')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'elinks'`` | :class:`Elinks('elinks')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'lynx'`` | 
:class:`GenericBrowser('lynx')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'w3m'`` | :class:`GenericBrowser('w3m')` | | ++-----------------------+-----------------------------------------+-------+ +| ``'windows-default'`` | :class:`WindowsDefault` | \(2) | ++-----------------------+-----------------------------------------+-------+ +| ``'internet-config'`` | :class:`InternetConfig` | \(3) | ++-----------------------+-----------------------------------------+-------+ +| ``'macosx'`` | :class:`MacOSX('default')` | \(4) | ++-----------------------+-----------------------------------------+-------+ + +Notes: + +(1) + "Konqueror" is the file manager for the KDE desktop environment for Unix, and + only makes sense to use if KDE is running. Some way of reliably detecting KDE + would be nice; the :envvar:`KDEDIR` variable is not sufficient. Note also that + the name "kfm" is used even when using the :program:`konqueror` command with KDE + 2 --- the implementation selects the best strategy for running Konqueror. + +(2) + Only on Windows platforms. + +(3) + Only on MacOS platforms; requires the standard MacPython :mod:`ic` module. + +(4) + Only on MacOS X platform. + +Here are some simple examples:: + + url = 'http://www.python.org' + + # Open URL in a new tab, if a browser window is already open. + webbrowser.open_new_tab(url + '/doc') + + # Open URL in new window, raising the window if possible. + webbrowser.open_new(url) + + +.. _browser-controllers: + +Browser Controller Objects +-------------------------- + +Browser controllers provide two methods which parallel two of the module-level +convenience functions: + + +.. method:: controller.open(url[, new[, autoraise=1]]) + + Display *url* using the browser handled by this controller. If *new* is 1, a new + browser window is opened if possible. If *new* is 2, a new browser page ("tab") + is opened if possible. + + +.. 
method:: controller.open_new(url) + + Open *url* in a new window of the browser handled by this controller, if + possible, otherwise, open *url* in the only browser window. Alias + :func:`open_new`. + + +.. method:: controller.open_new_tab(url) + + Open *url* in a new page ("tab") of the browser handled by this controller, if + possible, otherwise equivalent to :func:`open_new`. + + .. versionadded:: 2.5 + diff --git a/Doc/library/whichdb.rst b/Doc/library/whichdb.rst new file mode 100644 index 0000000..5c69818 --- /dev/null +++ b/Doc/library/whichdb.rst @@ -0,0 +1,20 @@ + +:mod:`whichdb` --- Guess which DBM module created a database +============================================================ + +.. module:: whichdb + :synopsis: Guess which DBM-style module created a given database. + + +The single function in this module attempts to guess which of the several simple +database modules available--\ :mod:`dbm`, :mod:`gdbm`, or :mod:`dbhash`\ +--should be used to open a given file. + + +.. function:: whichdb(filename) + + Returns one of the following values: ``None`` if the file can't be opened + because it's unreadable or doesn't exist; the empty string (``''``) if the + file's format can't be guessed; or a string containing the required module name, + such as ``'dbm'`` or ``'gdbm'``. + diff --git a/Doc/library/windows.rst b/Doc/library/windows.rst new file mode 100644 index 0000000..a231bc2 --- /dev/null +++ b/Doc/library/windows.rst @@ -0,0 +1,14 @@ + +**************************** +MS Windows Specific Services +**************************** + +This chapter describes modules that are only available on MS Windows platforms. + + +.. 
toctree:: + + msilib.rst + msvcrt.rst + _winreg.rst + winsound.rst diff --git a/Doc/library/winsound.rst b/Doc/library/winsound.rst new file mode 100644 index 0000000..c4c04bd --- /dev/null +++ b/Doc/library/winsound.rst @@ -0,0 +1,162 @@ + +:mod:`winsound` --- Sound-playing interface for Windows +======================================================= + +.. module:: winsound + :platform: Windows + :synopsis: Access to the sound-playing machinery for Windows. +.. moduleauthor:: Toby Dickenson <htrd90@zepler.org> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> + + +.. versionadded:: 1.5.2 + +The :mod:`winsound` module provides access to the basic sound-playing machinery +provided by Windows platforms. It includes functions and several constants. + + +.. function:: Beep(frequency, duration) + + Beep the PC's speaker. The *frequency* parameter specifies frequency, in hertz, + of the sound, and must be in the range 37 through 32,767. The *duration* + parameter specifies the number of milliseconds the sound should last. If the + system is not able to beep the speaker, :exc:`RuntimeError` is raised. + + .. note:: + + Under Windows 95 and 98, the Windows :cfunc:`Beep` function exists but is + useless (it ignores its arguments). In that case Python simulates it via direct + port manipulation (added in version 2.1). It's unknown whether that will work + on all systems. + + .. versionadded:: 1.6 + + +.. function:: PlaySound(sound, flags) + + Call the underlying :cfunc:`PlaySound` function from the Platform API. The + *sound* parameter may be a filename, audio data as a string, or ``None``. Its + interpretation depends on the value of *flags*, which can be a bit-wise ORed + combination of the constants described below. If the system indicates an error, + :exc:`RuntimeError` is raised. + + +.. function:: MessageBeep([type=MB_OK]) + + Call the underlying :cfunc:`MessageBeep` function from the Platform API. This + plays a sound as specified in the registry. 
The *type* argument specifies which + sound to play; possible values are ``-1``, ``MB_ICONASTERISK``, + ``MB_ICONEXCLAMATION``, ``MB_ICONHAND``, ``MB_ICONQUESTION``, and ``MB_OK``, all + described below. The value ``-1`` produces a "simple beep"; this is the final + fallback if a sound cannot be played otherwise. + + .. versionadded:: 2.3 + + +.. data:: SND_FILENAME + + The *sound* parameter is the name of a WAV file. Do not use with + :const:`SND_ALIAS`. + + +.. data:: SND_ALIAS + + The *sound* parameter is a sound association name from the registry. If the + registry contains no such name, play the system default sound unless + :const:`SND_NODEFAULT` is also specified. If no default sound is registered, + raise :exc:`RuntimeError`. Do not use with :const:`SND_FILENAME`. + + All Win32 systems support at least the following; most systems support many + more: + + +--------------------------+----------------------------------------+ + | :func:`PlaySound` *name* | Corresponding Control Panel Sound name | + +==========================+========================================+ + | ``'SystemAsterisk'`` | Asterisk | + +--------------------------+----------------------------------------+ + | ``'SystemExclamation'`` | Exclamation | + +--------------------------+----------------------------------------+ + | ``'SystemExit'`` | Exit Windows | + +--------------------------+----------------------------------------+ + | ``'SystemHand'`` | Critical Stop | + +--------------------------+----------------------------------------+ + | ``'SystemQuestion'`` | Question | + +--------------------------+----------------------------------------+ + + For example:: + + import winsound + # Play Windows exit sound. + winsound.PlaySound("SystemExit", winsound.SND_ALIAS) + + # Probably play Windows default sound, if any is registered (because + # "*" probably isn't the registered name of any sound). + winsound.PlaySound("*", winsound.SND_ALIAS) + + +.. data:: SND_LOOP + + Play the sound repeatedly. 
The :const:`SND_ASYNC` flag must also be used to + avoid blocking. Cannot be used with :const:`SND_MEMORY`. + + +.. data:: SND_MEMORY + + The *sound* parameter to :func:`PlaySound` is a memory image of a WAV file, as a + string. + + .. note:: + + This module does not support playing from a memory image asynchronously, so a + combination of this flag and :const:`SND_ASYNC` will raise :exc:`RuntimeError`. + + +.. data:: SND_PURGE + + Stop playing all instances of the specified sound. + + +.. data:: SND_ASYNC + + Return immediately, allowing sounds to play asynchronously. + + +.. data:: SND_NODEFAULT + + If the specified sound cannot be found, do not play the system default sound. + + +.. data:: SND_NOSTOP + + Do not interrupt sounds currently playing. + + +.. data:: SND_NOWAIT + + Return immediately if the sound driver is busy. + + +.. data:: MB_ICONASTERISK + + Play the ``SystemDefault`` sound. + + +.. data:: MB_ICONEXCLAMATION + + Play the ``SystemExclamation`` sound. + + +.. data:: MB_ICONHAND + + Play the ``SystemHand`` sound. + + +.. data:: MB_ICONQUESTION + + Play the ``SystemQuestion`` sound. + + +.. data:: MB_OK + + Play the ``SystemDefault`` sound. + diff --git a/Doc/library/wsgiref.rst b/Doc/library/wsgiref.rst new file mode 100644 index 0000000..ff68684 --- /dev/null +++ b/Doc/library/wsgiref.rst @@ -0,0 +1,641 @@ +:mod:`wsgiref` --- WSGI Utilities and Reference Implementation +============================================================== + +.. module:: wsgiref + :synopsis: WSGI Utilities and Reference Implementation. +.. moduleauthor:: Phillip J. Eby <pje@telecommunity.com> +.. sectionauthor:: Phillip J. Eby <pje@telecommunity.com> + + +.. versionadded:: 2.5 + +The Web Server Gateway Interface (WSGI) is a standard interface between web +server software and web applications written in Python. Having a standard +interface makes it easy to use an application that supports WSGI with a number +of different web servers. 
+ +Only authors of web servers and programming frameworks need to know every detail +and corner case of the WSGI design. You don't need to understand every detail +of WSGI just to install a WSGI application or to write a web application using +an existing framework. + +:mod:`wsgiref` is a reference implementation of the WSGI specification that can +be used to add WSGI support to a web server or framework. It provides utilities +for manipulating WSGI environment variables and response headers, base classes +for implementing WSGI servers, a demo HTTP server that serves WSGI applications, +and a validation tool that checks WSGI servers and applications for conformance +to the WSGI specification (:pep:`333`). + +See http://www.wsgi.org for more information about WSGI, and links to tutorials +and other resources. + +.. % XXX If you're just trying to write a web application... + + +:mod:`wsgiref.util` -- WSGI environment utilities +------------------------------------------------- + +.. module:: wsgiref.util + :synopsis: WSGI environment utilities. + + +This module provides a variety of utility functions for working with WSGI +environments. A WSGI environment is a dictionary containing HTTP request +variables as described in :pep:`333`. All of the functions taking an *environ* +parameter expect a WSGI-compliant dictionary to be supplied; please see +:pep:`333` for a detailed specification. + + +.. function:: guess_scheme(environ) + + Return a guess for whether ``wsgi.url_scheme`` should be "http" or "https", by + checking for a ``HTTPS`` environment variable in the *environ* dictionary. The + return value is a string. + + This function is useful when creating a gateway that wraps CGI or a CGI-like + protocol such as FastCGI. Typically, servers providing such protocols will + include a ``HTTPS`` variable with a value of "1", "yes", or "on" when a request + is received via SSL. So, this function returns "https" if such a value is + found, and "http" otherwise. + + +.. 
function:: request_uri(environ [, include_query=1]) + + Return the full request URI, optionally including the query string, using the + algorithm found in the "URL Reconstruction" section of :pep:`333`. If + *include_query* is false, the query string is not included in the resulting URI. + + +.. function:: application_uri(environ) + + Similar to :func:`request_uri`, except that the ``PATH_INFO`` and + ``QUERY_STRING`` variables are ignored. The result is the base URI of the + application object addressed by the request. + + +.. function:: shift_path_info(environ) + + Shift a single name from ``PATH_INFO`` to ``SCRIPT_NAME`` and return the name. + The *environ* dictionary is *modified* in-place; use a copy if you need to keep + the original ``PATH_INFO`` or ``SCRIPT_NAME`` intact. + + If there are no remaining path segments in ``PATH_INFO``, ``None`` is returned. + + Typically, this routine is used to process each portion of a request URI path, + for example to treat the path as a series of dictionary keys. This routine + modifies the passed-in environment to make it suitable for invoking another WSGI + application that is located at the target URI. For example, if there is a WSGI + application at ``/foo``, and the request URI path is ``/foo/bar/baz``, and the + WSGI application at ``/foo`` calls :func:`shift_path_info`, it will receive the + string "bar", and the environment will be updated to be suitable for passing to + a WSGI application at ``/foo/bar``. That is, ``SCRIPT_NAME`` will change from + ``/foo`` to ``/foo/bar``, and ``PATH_INFO`` will change from ``/bar/baz`` to + ``/baz``. + + When ``PATH_INFO`` is just a "/", this routine returns an empty string and + appends a trailing slash to ``SCRIPT_NAME``, even though empty path segments are + normally ignored, and ``SCRIPT_NAME`` doesn't normally end in a slash. 
This is + intentional behavior, to ensure that an application can tell the difference + between URIs ending in ``/x`` from ones ending in ``/x/`` when using this + routine to do object traversal. + + +.. function:: setup_testing_defaults(environ) + + Update *environ* with trivial defaults for testing purposes. + + This routine adds various parameters required for WSGI, including ``HTTP_HOST``, + ``SERVER_NAME``, ``SERVER_PORT``, ``REQUEST_METHOD``, ``SCRIPT_NAME``, + ``PATH_INFO``, and all of the :pep:`333`\ -defined ``wsgi.*`` variables. It + only supplies default values, and does not replace any existing settings for + these variables. + + This routine is intended to make it easier for unit tests of WSGI servers and + applications to set up dummy environments. It should NOT be used by actual WSGI + servers or applications, since the data is fake! + +In addition to the environment functions above, the :mod:`wsgiref.util` module +also provides these miscellaneous utilities: + + +.. function:: is_hop_by_hop(header_name) + + Return true if 'header_name' is an HTTP/1.1 "Hop-by-Hop" header, as defined by + :rfc:`2616`. + + +.. class:: FileWrapper(filelike [, blksize=8192]) + + A wrapper to convert a file-like object to an iterator. The resulting objects + support both :meth:`__getitem__` and :meth:`__iter__` iteration styles, for + compatibility with Python 2.1 and Jython. As the object is iterated over, the + optional *blksize* parameter will be repeatedly passed to the *filelike* + object's :meth:`read` method to obtain strings to yield. When :meth:`read` + returns an empty string, iteration is ended and is not resumable. + + If *filelike* has a :meth:`close` method, the returned object will also have a + :meth:`close` method, and it will invoke the *filelike* object's :meth:`close` + method when called. + + +:mod:`wsgiref.headers` -- WSGI response header tools +---------------------------------------------------- + +.. 
module:: wsgiref.headers + :synopsis: WSGI response header tools. + + +This module provides a single class, :class:`Headers`, for convenient +manipulation of WSGI response headers using a mapping-like interface. + + +.. class:: Headers(headers) + + Create a mapping-like object wrapping *headers*, which must be a list of header + name/value tuples as described in :pep:`333`. Any changes made to the new + :class:`Headers` object will directly update the *headers* list it was created + with. + + :class:`Headers` objects support typical mapping operations including + :meth:`__getitem__`, :meth:`get`, :meth:`__setitem__`, :meth:`setdefault`, + :meth:`__delitem__`, :meth:`__contains__` and :meth:`has_key`. For each of + these methods, the key is the header name (treated case-insensitively), and the + value is the first value associated with that header name. Setting a header + deletes any existing values for that header, then adds a new value at the end of + the wrapped header list. Headers' existing order is generally maintained, with + new headers added to the end of the wrapped list. + + Unlike a dictionary, :class:`Headers` objects do not raise an error when you try + to get or delete a key that isn't in the wrapped header list. Getting a + nonexistent header just returns ``None``, and deleting a nonexistent header does + nothing. + + :class:`Headers` objects also support :meth:`keys`, :meth:`values`, and + :meth:`items` methods. The lists returned by :meth:`keys` and :meth:`items` can + include the same key more than once if there is a multi-valued header. The + ``len()`` of a :class:`Headers` object is the same as the length of its + :meth:`items`, which is the same as the length of the wrapped header list. In + fact, the :meth:`items` method just returns a copy of the wrapped header list. + + Calling ``str()`` on a :class:`Headers` object returns a formatted string + suitable for transmission as HTTP response headers. 
Each header is placed on a + line with its value, separated by a colon and a space. Each line is terminated + by a carriage return and line feed, and the string is terminated with a blank + line. + + In addition to their mapping interface and formatting features, :class:`Headers` + objects also have the following methods for querying and adding multi-valued + headers, and for adding headers with MIME parameters: + + + .. method:: Headers.get_all(name) + + Return a list of all the values for the named header. + + The returned list will be sorted in the order they appeared in the original + header list or were added to this instance, and may contain duplicates. Any + fields deleted and re-inserted are always appended to the header list. If no + fields exist with the given name, returns an empty list. + + + .. method:: Headers.add_header(name, value, **_params) + + Add a (possibly multi-valued) header, with optional MIME parameters specified + via keyword arguments. + + *name* is the header field to add. Keyword arguments can be used to set MIME + parameters for the header field. Each parameter must be a string or ``None``. + Underscores in parameter names are converted to dashes, since dashes are illegal + in Python identifiers, but many MIME parameter names include dashes. If the + parameter value is a string, it is added to the header value parameters in the + form ``name="value"``. If it is ``None``, only the parameter name is added. + (This is used for MIME parameters without a value.) Example usage:: + + h.add_header('content-disposition', 'attachment', filename='bud.gif') + + The above will add a header that looks like this:: + + Content-Disposition: attachment; filename="bud.gif" + + +:mod:`wsgiref.simple_server` -- a simple WSGI HTTP server +--------------------------------------------------------- + +.. module:: wsgiref.simple_server + :synopsis: A simple WSGI HTTP server. 
+ + +This module implements a simple HTTP server (based on :mod:`BaseHTTPServer`) +that serves WSGI applications. Each server instance serves a single WSGI +application on a given host and port. If you want to serve multiple +applications on a single host and port, you should create a WSGI application +that parses ``PATH_INFO`` to select which application to invoke for each +request. (E.g., using the :func:`shift_path_info` function from +:mod:`wsgiref.util`.) + + +.. function:: make_server(host, port, app [, server_class=WSGIServer [, handler_class=WSGIRequestHandler]]) + + Create a new WSGI server listening on *host* and *port*, accepting connections + for *app*. The return value is an instance of the supplied *server_class*, and + will process requests using the specified *handler_class*. *app* must be a WSGI + application object, as defined by :pep:`333`. + + Example usage:: + + from wsgiref.simple_server import make_server, demo_app + + httpd = make_server('', 8000, demo_app) + print "Serving HTTP on port 8000..." + + # Respond to requests until process is killed + httpd.serve_forever() + + # Alternative: serve one request, then exit + ##httpd.handle_request() + + +.. function:: demo_app(environ, start_response) + + This function is a small but complete WSGI application that returns a text page + containing the message "Hello world!" and a list of the key/value pairs provided + in the *environ* parameter. It's useful for verifying that a WSGI server (such + as :mod:`wsgiref.simple_server`) is able to run a simple WSGI application + correctly. + + +.. class:: WSGIServer(server_address, RequestHandlerClass) + + Create a :class:`WSGIServer` instance. *server_address* should be a + ``(host,port)`` tuple, and *RequestHandlerClass* should be the subclass of + :class:`BaseHTTPServer.BaseHTTPRequestHandler` that will be used to process + requests. 
+ + You do not normally need to call this constructor, as the :func:`make_server` + function can handle all the details for you. + + :class:`WSGIServer` is a subclass of :class:`BaseHTTPServer.HTTPServer`, so all + of its methods (such as :meth:`serve_forever` and :meth:`handle_request`) are + available. :class:`WSGIServer` also provides these WSGI-specific methods: + + + .. method:: WSGIServer.set_app(application) + + Sets the callable *application* as the WSGI application that will receive + requests. + + + .. method:: WSGIServer.get_app() + + Returns the currently-set application callable. + + Normally, however, you do not need to use these additional methods, as + :meth:`set_app` is normally called by :func:`make_server`, and the + :meth:`get_app` exists mainly for the benefit of request handler instances. + + +.. class:: WSGIRequestHandler(request, client_address, server) + + Create an HTTP handler for the given *request* (i.e. a socket), *client_address* + (a ``(host,port)`` tuple), and *server* (:class:`WSGIServer` instance). + + You do not need to create instances of this class directly; they are + automatically created as needed by :class:`WSGIServer` objects. You can, + however, subclass this class and supply it as a *handler_class* to the + :func:`make_server` function. Some possibly relevant methods for overriding in + subclasses: + + + .. method:: WSGIRequestHandler.get_environ() + + Returns a dictionary containing the WSGI environment for a request. The default + implementation copies the contents of the :class:`WSGIServer` object's + :attr:`base_environ` dictionary attribute and then adds various headers derived + from the HTTP request. Each call to this method should return a new dictionary + containing all of the relevant CGI environment variables as specified in + :pep:`333`. + + + .. method:: WSGIRequestHandler.get_stderr() + + Return the object that should be used as the ``wsgi.errors`` stream. 
The default + implementation just returns ``sys.stderr``. + + + .. method:: WSGIRequestHandler.handle() + + Process the HTTP request. The default implementation creates a handler instance + using a :mod:`wsgiref.handlers` class to implement the actual WSGI application + interface. + + +:mod:`wsgiref.validate` --- WSGI conformance checker +---------------------------------------------------- + +.. module:: wsgiref.validate + :synopsis: WSGI conformance checker. + + +When creating new WSGI application objects, frameworks, servers, or middleware, +it can be useful to validate the new code's conformance using +:mod:`wsgiref.validate`. This module provides a function that creates WSGI +application objects that validate communications between a WSGI server or +gateway and a WSGI application object, to check both sides for protocol +conformance. + +Note that this utility does not guarantee complete :pep:`333` compliance; an +absence of errors from this module does not necessarily mean that errors do not +exist. However, if this module does produce an error, then it is virtually +certain that either the server or application is not 100% compliant. + +This module is based on the :mod:`paste.lint` module from Ian Bicking's "Python +Paste" library. + + +.. function:: validator(application) + + Wrap *application* and return a new WSGI application object. The returned + application will forward all requests to the original *application*, and will + check that both the *application* and the server invoking it are conforming to + the WSGI specification and to RFC 2616. + + Any detected nonconformance results in an :exc:`AssertionError` being raised; + note, however, that how these errors are handled is server-dependent. 
For + example, :mod:`wsgiref.simple_server` and other servers based on + :mod:`wsgiref.handlers` (that don't override the error handling methods to do + something else) will simply output a message that an error has occurred, and + dump the traceback to ``sys.stderr`` or some other error stream. + + This wrapper may also generate output using the :mod:`warnings` module to + indicate behaviors that are questionable but which may not actually be + prohibited by :pep:`333`. Unless they are suppressed using Python command-line + options or the :mod:`warnings` API, any such warnings will be written to + ``sys.stderr`` (*not* ``wsgi.errors``, unless they happen to be the same + object). + + +:mod:`wsgiref.handlers` -- server/gateway base classes +------------------------------------------------------ + +.. module:: wsgiref.handlers + :synopsis: WSGI server/gateway base classes. + + +This module provides base handler classes for implementing WSGI servers and +gateways. These base classes handle most of the work of communicating with a +WSGI application, as long as they are given a CGI-like environment, along with +input, output, and error streams. + + +.. class:: CGIHandler() + + CGI-based invocation via ``sys.stdin``, ``sys.stdout``, ``sys.stderr`` and + ``os.environ``. This is useful when you have a WSGI application and want to run + it as a CGI script. Simply invoke ``CGIHandler().run(app)``, where ``app`` is + the WSGI application object you wish to invoke. + + This class is a subclass of :class:`BaseCGIHandler` that sets ``wsgi.run_once`` + to true, ``wsgi.multithread`` to false, and ``wsgi.multiprocess`` to true, and + always uses :mod:`sys` and :mod:`os` to obtain the necessary CGI streams and + environment. + + +.. 
class:: BaseCGIHandler(stdin, stdout, stderr, environ [, multithread=True [, multiprocess=False]]) + + Similar to :class:`CGIHandler`, but instead of using the :mod:`sys` and + :mod:`os` modules, the CGI environment and I/O streams are specified explicitly. + The *multithread* and *multiprocess* values are used to set the + ``wsgi.multithread`` and ``wsgi.multiprocess`` flags for any applications run by + the handler instance. + + This class is a subclass of :class:`SimpleHandler` intended for use with + software other than HTTP "origin servers". If you are writing a gateway + protocol implementation (such as CGI, FastCGI, SCGI, etc.) that uses a + ``Status:`` header to send an HTTP status, you probably want to subclass this + instead of :class:`SimpleHandler`. + + +.. class:: SimpleHandler(stdin, stdout, stderr, environ [, multithread=True [, multiprocess=False]]) + + Similar to :class:`BaseCGIHandler`, but designed for use with HTTP origin + servers. If you are writing an HTTP server implementation, you will probably + want to subclass this instead of :class:`BaseCGIHandler`. + + This class is a subclass of :class:`BaseHandler`. It overrides the + :meth:`__init__`, :meth:`get_stdin`, :meth:`get_stderr`, :meth:`add_cgi_vars`, + :meth:`_write`, and :meth:`_flush` methods to support explicitly setting the + environment and streams via the constructor. The supplied environment and + streams are stored in the :attr:`stdin`, :attr:`stdout`, :attr:`stderr`, and + :attr:`environ` attributes. + + +.. class:: BaseHandler() + + This is an abstract base class for running WSGI applications. Each instance + will handle a single HTTP request, although in principle you could create a + subclass that was reusable for multiple requests. + + :class:`BaseHandler` instances have only one method intended for external use: + + + .. method:: BaseHandler.run(app) + + Run the specified WSGI application, *app*. 
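As a rough illustration of :meth:`run`, the sketch below drives a trivial application through :class:`SimpleHandler` using in-memory streams instead of a socket. It assumes a current Python 3, where the output stream must accept bytes; ``app`` and the variable names are hypothetical, and the environment comes from :func:`wsgiref.util.setup_testing_defaults`, so the request data is fake.

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Hypothetical application used only for this sketch.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


environ = {}
setup_testing_defaults(environ)  # fake request data, for testing only

stdin = io.BytesIO()    # request body (empty here)
stdout = io.BytesIO()   # the handler writes the raw response here
stderr = io.StringIO()  # becomes the wsgi.errors stream

SimpleHandler(stdin, stdout, stderr, environ).run(app)

raw_response = stdout.getvalue()
status_line = raw_response.split(b"\r\n", 1)[0]
```

Everything between the call to ``run()`` and the bytes in ``stdout`` is done by the overridable methods described in the rest of this section.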
+ + All of the other :class:`BaseHandler` methods are invoked by this method in the + process of running the application, and thus exist primarily to allow + customizing the process. + + The following methods MUST be overridden in a subclass: + + + .. method:: BaseHandler._write(data) + + Buffer the string *data* for transmission to the client. It's okay if this + method actually transmits the data; :class:`BaseHandler` just separates write + and flush operations for greater efficiency when the underlying system actually + has such a distinction. + + + .. method:: BaseHandler._flush() + + Force buffered data to be transmitted to the client. It's okay if this method + is a no-op (i.e., if :meth:`_write` actually sends the data). + + + .. method:: BaseHandler.get_stdin() + + Return an input stream object suitable for use as the ``wsgi.input`` of the + request currently being processed. + + + .. method:: BaseHandler.get_stderr() + + Return an output stream object suitable for use as the ``wsgi.errors`` of the + request currently being processed. + + + .. method:: BaseHandler.add_cgi_vars() + + Insert CGI variables for the current request into the :attr:`environ` attribute. + + Here are some other methods and attributes you may wish to override. This list + is only a summary, however, and does not include every method that can be + overridden. You should consult the docstrings and source code for additional + information before attempting to create a customized :class:`BaseHandler` + subclass. + + Attributes and methods for customizing the WSGI environment: + + + .. attribute:: BaseHandler.wsgi_multithread + + The value to be used for the ``wsgi.multithread`` environment variable. It + defaults to true in :class:`BaseHandler`, but may have a different default (or + be set by the constructor) in the other subclasses. + + + .. attribute:: BaseHandler.wsgi_multiprocess + + The value to be used for the ``wsgi.multiprocess`` environment variable. 
It + defaults to true in :class:`BaseHandler`, but may have a different default (or + be set by the constructor) in the other subclasses. + + + .. attribute:: BaseHandler.wsgi_run_once + + The value to be used for the ``wsgi.run_once`` environment variable. It + defaults to false in :class:`BaseHandler`, but :class:`CGIHandler` sets it to + true by default. + + + .. attribute:: BaseHandler.os_environ + + The default environment variables to be included in every request's WSGI + environment. By default, this is a copy of ``os.environ`` at the time that + :mod:`wsgiref.handlers` was imported, but subclasses can either create their own + at the class or instance level. Note that the dictionary should be considered + read-only, since the default value is shared between multiple classes and + instances. + + + .. attribute:: BaseHandler.server_software + + If the :attr:`origin_server` attribute is set, this attribute's value is used to + set the default ``SERVER_SOFTWARE`` WSGI environment variable, and also to set a + default ``Server:`` header in HTTP responses. It is ignored for handlers (such + as :class:`BaseCGIHandler` and :class:`CGIHandler`) that are not HTTP origin + servers. + + + .. method:: BaseHandler.get_scheme() + + Return the URL scheme being used for the current request. The default + implementation uses the :func:`guess_scheme` function from :mod:`wsgiref.util` + to guess whether the scheme should be "http" or "https", based on the current + request's :attr:`environ` variables. + + + .. method:: BaseHandler.setup_environ() + + Set the :attr:`environ` attribute to a fully-populated WSGI environment. The + default implementation uses all of the above methods and attributes, plus the + :meth:`get_stdin`, :meth:`get_stderr`, and :meth:`add_cgi_vars` methods and the + :attr:`wsgi_file_wrapper` attribute. 
It also inserts a ``SERVER_SOFTWARE`` key + if not present, as long as the :attr:`origin_server` attribute is a true value + and the :attr:`server_software` attribute is set. + + Methods and attributes for customizing exception handling: + + + .. method:: BaseHandler.log_exception(exc_info) + + Log the *exc_info* tuple in the server log. *exc_info* is a ``(type, value, + traceback)`` tuple. The default implementation simply writes the traceback to + the request's ``wsgi.errors`` stream and flushes it. Subclasses can override + this method to change the format or retarget the output, mail the traceback to + an administrator, or whatever other action may be deemed suitable. + + + .. attribute:: BaseHandler.traceback_limit + + The maximum number of frames to include in tracebacks output by the default + :meth:`log_exception` method. If ``None``, all frames are included. + + + .. method:: BaseHandler.error_output(environ, start_response) + + This method is a WSGI application to generate an error page for the user. It is + only invoked if an error occurs before headers are sent to the client. + + This method can access the current error information using ``sys.exc_info()``, + and should pass that information to *start_response* when calling it (as + described in the "Error Handling" section of :pep:`333`). + + The default implementation just uses the :attr:`error_status`, + :attr:`error_headers`, and :attr:`error_body` attributes to generate an output + page. Subclasses can override this to produce more dynamic error output. + + Note, however, that it's not recommended from a security perspective to spit out + diagnostics to any old user; ideally, you should have to do something special to + enable diagnostic output, which is why the default implementation doesn't + include any. + + + .. attribute:: BaseHandler.error_status + + The HTTP status used for error responses. This should be a status string as + defined in :pep:`333`; it defaults to a 500 code and message. 
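The error-handling hooks can be tried out with an application that fails before sending headers. The following is only a sketch under stated assumptions: it targets Python 3, ``QuietHandler`` and ``failing_app`` are hypothetical names, and the overridden values are arbitrary choices made for illustration.

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


class QuietHandler(SimpleHandler):
    # Hypothetical subclass: log at most one traceback frame and
    # report a different status than the default 500.
    traceback_limit = 1
    error_status = "503 Service Unavailable"


def failing_app(environ, start_response):
    # Fails before start_response() is called, so no headers are sent yet.
    raise RuntimeError("boom")


environ = {}
setup_testing_defaults(environ)  # fake request data, for testing only
stdout, stderr = io.BytesIO(), io.StringIO()

QuietHandler(io.BytesIO(), stdout, stderr, environ).run(failing_app)

status_line = stdout.getvalue().split(b"\r\n", 1)[0]
logged = stderr.getvalue()  # what log_exception() wrote to wsgi.errors
```

The traceback goes to the ``wsgi.errors`` stream only; the response body sent to the client stays generic, in line with the security note above.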
+ + + .. attribute:: BaseHandler.error_headers + + The HTTP headers used for error responses. This should be a list of WSGI + response headers (``(name, value)`` tuples), as described in :pep:`333`. The + default list just sets the content type to ``text/plain``. + + + .. attribute:: BaseHandler.error_body + + The error response body. This should be an HTTP response body string. It + defaults to the plain text, "A server error occurred. Please contact the + administrator." + + Methods and attributes for :pep:`333`'s "Optional Platform-Specific File + Handling" feature: + + + .. attribute:: BaseHandler.wsgi_file_wrapper + + A ``wsgi.file_wrapper`` factory, or ``None``. The default value of this + attribute is the :class:`FileWrapper` class from :mod:`wsgiref.util`. + + + .. method:: BaseHandler.sendfile() + + Override to implement platform-specific file transmission. This method is + called only if the application's return value is an instance of the class + specified by the :attr:`wsgi_file_wrapper` attribute. It should return a true + value if it was able to successfully transmit the file, so that the default + transmission code will not be executed. The default implementation of this + method just returns a false value. + + Miscellaneous methods and attributes: + + + .. attribute:: BaseHandler.origin_server + + This attribute should be set to a true value if the handler's :meth:`_write` and + :meth:`_flush` are being used to communicate directly to the client, rather than + via a CGI-like gateway protocol that wants the HTTP status in a special + ``Status:`` header. + + This attribute's default value is true in :class:`BaseHandler`, but false in + :class:`BaseCGIHandler` and :class:`CGIHandler`. + + + .. attribute:: BaseHandler.http_version + + If :attr:`origin_server` is true, this string attribute is used to set the HTTP + version of the response sent to the client. It defaults to ``"1.0"``. 
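The effect of :attr:`origin_server` and :attr:`http_version` is easiest to see in the first line of the response. The sketch below compares three handler classes on the same application. It assumes Python 3; ``Http11Handler``, ``first_line`` and ``app`` are hypothetical names introduced for the comparison, and overriding :attr:`http_version` here changes only the status line, nothing else about the protocol handling.

```python
import io
from wsgiref.handlers import BaseCGIHandler, SimpleHandler
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Hypothetical application used only for this sketch.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


class Http11Handler(SimpleHandler):
    http_version = "1.1"  # hypothetical override of the "1.0" default


def first_line(handler_class):
    """Run app through handler_class and return the first response line."""
    environ = {}
    setup_testing_defaults(environ)  # fake request data, for testing only
    stdout = io.BytesIO()
    handler_class(io.BytesIO(), stdout, io.StringIO(), environ).run(app)
    return stdout.getvalue().split(b"\r\n", 1)[0]


origin_line = first_line(SimpleHandler)   # origin server: HTTP status line
cgi_line = first_line(BaseCGIHandler)     # gateway: Status: header instead
http11_line = first_line(Http11Handler)
```

A gateway handler leaves the real status line to the web server in front of it, which is why it emits a ``Status:`` header rather than an ``HTTP/...`` line.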
+ diff --git a/Doc/library/xdrlib.rst b/Doc/library/xdrlib.rst new file mode 100644 index 0000000..6339a7f --- /dev/null +++ b/Doc/library/xdrlib.rst @@ -0,0 +1,276 @@ + +:mod:`xdrlib` --- Encode and decode XDR data +============================================ + +.. module:: xdrlib + :synopsis: Encoders and decoders for the External Data Representation (XDR). + + +.. index:: + single: XDR + single: External Data Representation + +The :mod:`xdrlib` module supports the External Data Representation Standard as +described in :rfc:`1014`, written by Sun Microsystems, Inc. June 1987. It +supports most of the data types described in the RFC. + +The :mod:`xdrlib` module defines two classes, one for packing variables into XDR +representation, and another for unpacking from XDR representation. There are +also two exception classes. + + +.. class:: Packer() + + :class:`Packer` is the class for packing data into XDR representation. The + :class:`Packer` class is instantiated with no arguments. + + +.. class:: Unpacker(data) + + ``Unpacker`` is the complementary class which unpacks XDR data values from a + string buffer. The input buffer is given as *data*. + + +.. seealso:: + + :rfc:`1014` - XDR: External Data Representation Standard + This RFC defined the encoding of data which was XDR at the time this module was + originally written. It has apparently been obsoleted by :rfc:`1832`. + + :rfc:`1832` - XDR: External Data Representation Standard + Newer RFC that provides a revised definition of XDR. + + +.. _xdr-packer-objects: + +Packer Objects +-------------- + +:class:`Packer` instances have the following methods: + + +.. method:: Packer.get_buffer() + + Returns the current pack buffer as a string. + + +.. method:: Packer.reset() + + Resets the pack buffer to the empty string. + +In general, you can pack any of the most common XDR data types by calling the +appropriate ``pack_type()`` method. Each method takes a single argument, the +value to pack. 
The following simple data type packing methods are supported: +:meth:`pack_uint`, :meth:`pack_int`, :meth:`pack_enum`, :meth:`pack_bool`, +:meth:`pack_uhyper`, and :meth:`pack_hyper`. + + +.. method:: Packer.pack_float(value) + + Packs the single-precision floating point number *value*. + + +.. method:: Packer.pack_double(value) + + Packs the double-precision floating point number *value*. + +The following methods support packing strings, bytes, and opaque data: + + +.. method:: Packer.pack_fstring(n, s) + + Packs a fixed length string, *s*. *n* is the length of the string but it is + *not* packed into the data buffer. The string is padded with null bytes if + necessary to guarantee 4-byte alignment. + + +.. method:: Packer.pack_fopaque(n, data) + + Packs a fixed length opaque data stream, similarly to :meth:`pack_fstring`. + + +.. method:: Packer.pack_string(s) + + Packs a variable length string, *s*. The length of the string is first packed + as an unsigned integer, then the string data is packed with + :meth:`pack_fstring`. + + +.. method:: Packer.pack_opaque(data) + + Packs a variable length opaque data string, similarly to :meth:`pack_string`. + + +.. method:: Packer.pack_bytes(bytes) + + Packs a variable length byte stream, similarly to :meth:`pack_string`. + +The following methods support packing arrays and lists: + + +.. method:: Packer.pack_list(list, pack_item) + + Packs a *list* of homogeneous items. This method is useful for lists with an + indeterminate size; i.e. the size is not available until the entire list has + been walked. For each item in the list, an unsigned integer ``1`` is packed + first, followed by the data value from the list. *pack_item* is the function + that is called to pack the individual item. At the end of the list, an unsigned + integer ``0`` is packed. + + For example, to pack a list of integers, the code might appear like this:: + + import xdrlib + p = xdrlib.Packer() + p.pack_list([1, 2, 3], p.pack_int) + + +.. 
method:: Packer.pack_farray(n, array, pack_item) + + Packs a fixed length list (*array*) of homogeneous items. *n* is the length of + the list; it is *not* packed into the buffer, but a :exc:`ValueError` exception + is raised if ``len(array)`` is not equal to *n*. As above, *pack_item* is the + function used to pack each element. + + +.. method:: Packer.pack_array(list, pack_item) + + Packs a variable length *list* of homogeneous items. First, the length of the + list is packed as an unsigned integer, then each element is packed as in + :meth:`pack_farray` above. + + +.. _xdr-unpacker-objects: + +Unpacker Objects +---------------- + +The :class:`Unpacker` class offers the following methods: + + +.. method:: Unpacker.reset(data) + + Resets the string buffer with the given *data*. + + +.. method:: Unpacker.get_position() + + Returns the current unpack position in the data buffer. + + +.. method:: Unpacker.set_position(position) + + Sets the data buffer unpack position to *position*. You should be careful about + using :meth:`get_position` and :meth:`set_position`. + + +.. method:: Unpacker.get_buffer() + + Returns the current unpack data buffer as a string. + + +.. method:: Unpacker.done() + + Indicates unpack completion. Raises an :exc:`Error` exception if all of the + data has not been unpacked. + +In addition, every data type that can be packed with a :class:`Packer`, can be +unpacked with an :class:`Unpacker`. Unpacking methods are of the form +``unpack_type()``, and take no arguments. They return the unpacked object. + + +.. method:: Unpacker.unpack_float() + + Unpacks a single-precision floating point number. + + +.. method:: Unpacker.unpack_double() + + Unpacks a double-precision floating point number, similarly to + :meth:`unpack_float`. + +In addition, the following methods unpack strings, bytes, and opaque data: + + +.. method:: Unpacker.unpack_fstring(n) + + Unpacks and returns a fixed length string. *n* is the number of characters + expected. 
Padding with null bytes to guarantee 4-byte alignment is assumed. + + +.. method:: Unpacker.unpack_fopaque(n) + + Unpacks and returns a fixed length opaque data stream, similarly to + :meth:`unpack_fstring`. + + +.. method:: Unpacker.unpack_string() + + Unpacks and returns a variable length string. The length of the string is first + unpacked as an unsigned integer, then the string data is unpacked with + :meth:`unpack_fstring`. + + +.. method:: Unpacker.unpack_opaque() + + Unpacks and returns a variable length opaque data string, similarly to + :meth:`unpack_string`. + + +.. method:: Unpacker.unpack_bytes() + + Unpacks and returns a variable length byte stream, similarly to + :meth:`unpack_string`. + +The following methods support unpacking arrays and lists: + + +.. method:: Unpacker.unpack_list(unpack_item) + + Unpacks and returns a list of homogeneous items. The list is unpacked one + element at a time by first unpacking an unsigned integer flag. If the flag is + ``1``, then the item is unpacked and appended to the list. A flag of ``0`` + indicates the end of the list. *unpack_item* is the function that is called to + unpack the items. + + +.. method:: Unpacker.unpack_farray(n, unpack_item) + + Unpacks and returns (as a list) a fixed length array of homogeneous items. *n* + is the number of list elements to expect in the buffer. As above, *unpack_item* is + the function used to unpack each element. + + +.. method:: Unpacker.unpack_array(unpack_item) + + Unpacks and returns a variable length *list* of homogeneous items. First, the + length of the list is unpacked as an unsigned integer, then each element is + unpacked as in :meth:`unpack_farray` above. + + +.. _xdr-exceptions: + +Exceptions +---------- + +Exceptions in this module are coded as class instances: + + +.. exception:: Error + + The base exception class. :exc:`Error` has a single public data member + :attr:`msg` containing the description of the error. + + +.. 
exception:: ConversionError + + Class derived from :exc:`Error`. Contains no additional instance variables. + +Here is an example of how you would catch one of these exceptions:: + + import xdrlib + p = xdrlib.Packer() + try: + p.pack_double(8.01) + except xdrlib.ConversionError as instance: + print('packing the double failed:', instance.msg) + diff --git a/Doc/library/xml.dom.minidom.rst b/Doc/library/xml.dom.minidom.rst new file mode 100644 index 0000000..54c5f3d --- /dev/null +++ b/Doc/library/xml.dom.minidom.rst @@ -0,0 +1,267 @@ + +:mod:`xml.dom.minidom` --- Lightweight DOM implementation +========================================================= + +.. module:: xml.dom.minidom + :synopsis: Lightweight Document Object Model (DOM) implementation. +.. moduleauthor:: Paul Prescod <paul@prescod.net> +.. sectionauthor:: Paul Prescod <paul@prescod.net> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. versionadded:: 2.0 + +:mod:`xml.dom.minidom` is a light-weight implementation of the Document Object +Model interface. It is intended to be simpler than the full DOM and also +significantly smaller. + +DOM applications typically start by parsing some XML into a DOM. With +:mod:`xml.dom.minidom`, this is done through the parse functions:: + + from xml.dom.minidom import parse, parseString + + dom1 = parse('c:\\temp\\mydata.xml') # parse an XML file by name + + datasource = open('c:\\temp\\mydata.xml') + dom2 = parse(datasource) # parse an open file + + dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>') + +The :func:`parse` function can take either a filename or an open file object. + + +.. function:: parse(filename_or_file[, parser]) + + Return a :class:`Document` from the given input. *filename_or_file* may be + either a file name, or a file-like object. *parser*, if given, must be a SAX2 + parser object. 
This function will change the document handler of the parser and + activate namespace support; other parser configuration (like setting an entity + resolver) must have been done in advance. + +If you have XML in a string, you can use the :func:`parseString` function +instead: + + +.. function:: parseString(string[, parser]) + + Return a :class:`Document` that represents the *string*. This function creates a + :class:`StringIO` object for the string and passes that on to :func:`parse`. + +Both functions return a :class:`Document` object representing the content of the +document. + +What the :func:`parse` and :func:`parseString` functions do is connect an XML +parser with a "DOM builder" that can accept parse events from any SAX parser and +convert them into a DOM tree. The names of the functions are perhaps misleading, +but are easy to grasp when learning the interfaces. The parsing of the document +will be completed before these functions return; it's simply that these +functions do not provide a parser implementation themselves. + +You can also create a :class:`Document` by calling a method on a "DOM +Implementation" object. You can get this object either by calling the +:func:`getDOMImplementation` function in the :mod:`xml.dom` package or the +:mod:`xml.dom.minidom` module. Using the implementation from the +:mod:`xml.dom.minidom` module will always return a :class:`Document` instance +from the minidom implementation, while the version from :mod:`xml.dom` may +provide an alternate implementation (this is likely if you have the `PyXML +package <http://pyxml.sourceforge.net/>`_ installed). 
Once you have a +:class:`Document`, you can add child nodes to it to populate the DOM:: + + from xml.dom.minidom import getDOMImplementation + + impl = getDOMImplementation() + + newdoc = impl.createDocument(None, "some_tag", None) + top_element = newdoc.documentElement + text = newdoc.createTextNode('Some textual content.') + top_element.appendChild(text) + +Once you have a DOM document object, you can access the parts of your XML +document through its properties and methods. These properties are defined in +the DOM specification. The main property of the document object is the +:attr:`documentElement` property. It gives you the main element in the XML +document: the one that holds all others. Here is an example program:: + + dom3 = parseString("<myxml>Some data</myxml>") + assert dom3.documentElement.tagName == "myxml" + +When you are finished with a DOM, you should clean it up. This is necessary +because some versions of Python do not support garbage collection of objects +that refer to each other in a cycle. Until this restriction is removed from all +versions of Python, it is safest to write your code as if cycles would not be +cleaned up. + +The way to clean up a DOM is to call its :meth:`unlink` method:: + + dom1.unlink() + dom2.unlink() + dom3.unlink() + +:meth:`unlink` is a :mod:`xml.dom.minidom`\ -specific extension to the DOM API. +After calling :meth:`unlink` on a node, the node and its descendants are +essentially useless. + + +.. seealso:: + + `Document Object Model (DOM) Level 1 Specification <http://www.w3.org/TR/REC-DOM-Level-1/>`_ + The W3C recommendation for the DOM supported by :mod:`xml.dom.minidom`. + + +.. _minidom-objects: + +DOM Objects +----------- + +The definition of the DOM API for Python is given as part of the :mod:`xml.dom` +module documentation. This section lists the differences between the API and +:mod:`xml.dom.minidom`. + + +.. 
method:: Node.unlink() + + Break internal references within the DOM so that it will be garbage collected on + versions of Python without cyclic GC. Even when cyclic GC is available, using + this can make large amounts of memory available sooner, so calling this on DOM + objects as soon as they are no longer needed is good practice. This only needs + to be called on the :class:`Document` object, but may be called on child nodes + to discard children of that node. + + +.. method:: Node.writexml(writer[,indent=""[,addindent=""[,newl=""]]]) + + Write XML to the writer object. The writer should have a :meth:`write` method + which matches that of the file object interface. The *indent* parameter is the + indentation of the current node. The *addindent* parameter is the incremental + indentation to use for subnodes of the current one. The *newl* parameter + specifies the string to use to terminate lines. + + .. versionchanged:: 2.1 + The optional keyword parameters *indent*, *addindent*, and *newl* were added to + support pretty output. + + .. versionchanged:: 2.3 + For the :class:`Document` node, an additional keyword argument *encoding* can be + used to specify the encoding field of the XML header. + + +.. method:: Node.toxml([encoding]) + + Return the XML that the DOM represents as a string. + + With no argument, the XML header does not specify an encoding, and the result is + a Unicode string if the default encoding cannot represent all characters in the + document. Encoding this string in an encoding other than UTF-8 is likely + incorrect, since UTF-8 is the default encoding of XML. + + With an explicit *encoding* argument, the result is a byte string in the + specified encoding. It is recommended that this argument always be specified. To + avoid :exc:`UnicodeError` exceptions in case of unrepresentable text data, the + encoding argument should be specified as "utf-8". + + .. versionchanged:: 2.3 + the *encoding* argument was introduced. + + +.. 
method:: Node.toprettyxml([indent[, newl]]) + + Return a pretty-printed version of the document. *indent* specifies the + indentation string and defaults to a tabulator; *newl* specifies the string + emitted at the end of each line and defaults to ``\n``. + + .. versionadded:: 2.1 + + .. versionchanged:: 2.3 + the encoding argument; see :meth:`toxml`. + +The following standard DOM methods have special considerations with +:mod:`xml.dom.minidom`: + + +.. method:: Node.cloneNode(deep) + + Although this method was present in the version of :mod:`xml.dom.minidom` + packaged with Python 2.0, it was seriously broken. This has been corrected for + subsequent releases. + + +.. _dom-example: + +DOM Example +----------- + +This example program is a fairly realistic example of a simple program. In this +particular case, we do not take much advantage of the flexibility of the DOM. + +.. literalinclude:: ../includes/minidom-example.py + + +.. _minidom-and-dom: + +minidom and the DOM standard +---------------------------- + +The :mod:`xml.dom.minidom` module is essentially a DOM 1.0-compatible DOM with +some DOM 2 features (primarily namespace features). + +Usage of the DOM interface in Python is straight-forward. The following mapping +rules apply: + +* Interfaces are accessed through instance objects. Applications should not + instantiate the classes themselves; they should use the creator functions + available on the :class:`Document` object. Derived interfaces support all + operations (and attributes) from the base interfaces, plus any new operations. + +* Operations are used as methods. Since the DOM uses only :keyword:`in` + parameters, the arguments are passed in normal order (from left to right). + There are no optional arguments. :keyword:`void` operations return ``None``. + +* IDL attributes map to instance attributes. 
For compatibility with the OMG IDL + language mapping for Python, an attribute ``foo`` can also be accessed through + accessor methods :meth:`_get_foo` and :meth:`_set_foo`. :keyword:`readonly` + attributes must not be changed; this is not enforced at runtime. + +* The types ``short int``, ``unsigned int``, ``unsigned long long``, and + ``boolean`` all map to Python integer objects. + +* The type ``DOMString`` maps to Python strings. :mod:`xml.dom.minidom` supports + either byte or Unicode strings, but will normally produce Unicode strings. + Values of type ``DOMString`` may also be ``None`` where allowed to have the IDL + ``null`` value by the DOM specification from the W3C. + +* :keyword:`const` declarations map to variables in their respective scope (e.g. + ``xml.dom.minidom.Node.PROCESSING_INSTRUCTION_NODE``); they must not be changed. + +* ``DOMException`` is currently not supported in :mod:`xml.dom.minidom`. + Instead, :mod:`xml.dom.minidom` uses standard Python exceptions such as + :exc:`TypeError` and :exc:`AttributeError`. + +* :class:`NodeList` objects are implemented using Python's built-in list type. + Starting with Python 2.2, these objects provide the interface defined in the DOM + specification, but with earlier versions of Python they do not support the + official API. They are, however, much more "Pythonic" than the interface + defined in the W3C recommendations. + +The following interfaces have no implementation in :mod:`xml.dom.minidom`: + +* :class:`DOMTimeStamp` + +* :class:`DocumentType` (added in Python 2.1) + +* :class:`DOMImplementation` (added in Python 2.1) + +* :class:`CharacterData` + +* :class:`CDATASection` + +* :class:`Notation` + +* :class:`Entity` + +* :class:`EntityReference` + +* :class:`DocumentFragment` + +Most of these reflect information in the XML document that is not of general +utility to most DOM users. 
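The mapping rules listed in this section can be illustrated with a short, hedged sketch. The sample document, the element name ``item`` and all variable names are assumptions made up for this example; only calls that exist in :mod:`xml.dom.minidom` are used. It shows a :class:`NodeList` behaving like a Python sequence, ``DOMString`` values arriving as Python strings, and a node-type constant read from the :class:`Node` class:

```python
from xml.dom.minidom import Node, parseString

# Hypothetical sample document, used only for illustration.
doc = parseString(
    "<inventory><item id='1'>bolt</item><item id='2'>nut</item></inventory>"
)

# getElementsByTagName() returns a NodeList, usable as a Python sequence.
items = doc.getElementsByTagName("item")
count = len(items)

# DOMString values (attribute values, text data) map to Python strings.
ids = [item.getAttribute("id") for item in items]
texts = [item.firstChild.data for item in items]

# Node-type constants live on the Node class, not at module level.
all_text = all(item.firstChild.nodeType == Node.TEXT_NODE for item in items)

# Break internal references once the tree is no longer needed.
doc.unlink()
```

The values are collected before :meth:`unlink` is called, because the node and its descendants should not be relied on afterwards.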
+ diff --git a/Doc/library/xml.dom.pulldom.rst b/Doc/library/xml.dom.pulldom.rst new file mode 100644 index 0000000..80a91b8 --- /dev/null +++ b/Doc/library/xml.dom.pulldom.rst @@ -0,0 +1,69 @@ + +:mod:`xml.dom.pulldom` --- Support for building partial DOM trees +================================================================= + +.. module:: xml.dom.pulldom + :synopsis: Support for building partial DOM trees from SAX events. +.. moduleauthor:: Paul Prescod <paul@prescod.net> + + +.. versionadded:: 2.0 + +:mod:`xml.dom.pulldom` allows building only selected portions of a Document +Object Model representation of a document from SAX events. + + +.. class:: PullDOM([documentFactory]) + + :class:`xml.sax.handler.ContentHandler` implementation that ... + + +.. class:: DOMEventStream(stream, parser, bufsize) + + ... + + +.. class:: SAX2DOM([documentFactory]) + + :class:`xml.sax.handler.ContentHandler` implementation that ... + + +.. function:: parse(stream_or_string[, parser[, bufsize]]) + + ... + + +.. function:: parseString(string[, parser]) + + ... + + +.. data:: default_bufsize + + Default value for the *bufsize* parameter to :func:`parse`. + + .. versionchanged:: 2.1 + The value of this variable can be changed before calling :func:`parse` and the + new value will take effect. + + +.. _domeventstream-objects: + +DOMEventStream Objects +---------------------- + + +.. method:: DOMEventStream.getEvent() + + ... + + +.. method:: DOMEventStream.expandNode(node) + + ... + + +.. method:: DOMEventStream.reset() + + ... + diff --git a/Doc/library/xml.dom.rst b/Doc/library/xml.dom.rst new file mode 100644 index 0000000..76f5cc1 --- /dev/null +++ b/Doc/library/xml.dom.rst @@ -0,0 +1,1045 @@ + +:mod:`xml.dom` --- The Document Object Model API +================================================ + +.. module:: xml.dom + :synopsis: Document Object Model API for Python. +.. sectionauthor:: Paul Prescod <paul@prescod.net> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. 
versionadded:: 2.0 + +The Document Object Model, or "DOM," is a cross-language API from the World Wide +Web Consortium (W3C) for accessing and modifying XML documents. A DOM +implementation presents an XML document as a tree structure, or allows client +code to build such a structure from scratch. It then gives access to the +structure through a set of objects which provide well-known interfaces. + +The DOM is extremely useful for random-access applications. SAX only allows you +a view of one bit of the document at a time. If you are looking at one SAX +element, you have no access to another. If you are looking at a text node, you +have no access to a containing element. When you write a SAX application, you +need to keep track of your program's position in the document somewhere in your +own code. SAX does not do it for you. Also, if you need to look ahead in the +XML document, you are just out of luck. + +Some applications are simply impossible in an event driven model with no access +to a tree. Of course you could build some sort of tree yourself in SAX events, +but the DOM allows you to avoid writing that code. The DOM is a standard tree +representation for XML data. + +The Document Object Model is being defined by the W3C in stages, or "levels" in +their terminology. The Python mapping of the API is substantially based on the +DOM Level 2 recommendation. The mapping of the Level 3 specification, currently +only available in draft form, is being developed by the `Python XML Special +Interest Group <http://www.python.org/sigs/xml-sig/>`_ as part of the `PyXML +package <http://pyxml.sourceforge.net/>`_. Refer to the documentation bundled +with that package for information on the current state of DOM Level 3 support. + +.. % What if your needs are somewhere between SAX and the DOM? Perhaps +.. % you cannot afford to load the entire tree in memory but you find the +.. % SAX model somewhat cumbersome and low-level. There is also a module +.. 
% called xml.dom.pulldom that allows you to build trees of only the +.. % parts of a document that you need structured access to. It also has +.. % features that allow you to find your way around the DOM. +.. % See http://www.prescod.net/python/pulldom + +DOM applications typically start by parsing some XML into a DOM. How this is +accomplished is not covered at all by DOM Level 1, and Level 2 provides only +limited improvements: There is a :class:`DOMImplementation` object class which +provides access to :class:`Document` creation methods, but no way to access an +XML reader/parser/Document builder in an implementation-independent way. There +is also no well-defined way to access these methods without an existing +:class:`Document` object. In Python, each DOM implementation will provide a +function :func:`getDOMImplementation`. DOM Level 3 adds a Load/Store +specification, which defines an interface to the reader, but this is not yet +available in the Python standard library. + +Once you have a DOM document object, you can access the parts of your XML +document through its properties and methods. These properties are defined in +the DOM specification; this portion of the reference manual describes the +interpretation of the specification in Python. + +The specification provided by the W3C defines the DOM API for Java, ECMAScript, +and OMG IDL. The Python mapping defined here is based in large part on the IDL +version of the specification, but strict compliance is not required (though +implementations are free to support the strict mapping from IDL). See section +:ref:`dom-conformance` for a detailed discussion of mapping requirements. + + +.. seealso:: + + `Document Object Model (DOM) Level 2 Specification <http://www.w3.org/TR/DOM-Level-2-Core/>`_ + The W3C recommendation upon which the Python DOM API is based. 
+ + `Document Object Model (DOM) Level 1 Specification <http://www.w3.org/TR/REC-DOM-Level-1/>`_ + The W3C recommendation for the DOM supported by :mod:`xml.dom.minidom`. + + `PyXML <http://pyxml.sourceforge.net>`_ + Users that require a full-featured implementation of DOM should use the PyXML + package. + + `Python Language Mapping Specification <http://www.omg.org/docs/formal/02-11-05.pdf>`_ + This specifies the mapping from OMG IDL to Python. + + +Module Contents +--------------- + +The :mod:`xml.dom` contains the following functions: + + +.. function:: registerDOMImplementation(name, factory) + + Register the *factory* function with the name *name*. The factory function + should return an object which implements the :class:`DOMImplementation` + interface. The factory function can return the same object every time, or a new + one for each call, as appropriate for the specific implementation (e.g. if that + implementation supports some customization). + + +.. function:: getDOMImplementation([name[, features]]) + + Return a suitable DOM implementation. The *name* is either well-known, the + module name of a DOM implementation, or ``None``. If it is not ``None``, imports + the corresponding module and returns a :class:`DOMImplementation` object if the + import succeeds. If no name is given, and if the environment variable + :envvar:`PYTHON_DOM` is set, this variable is used to find the implementation. + + If name is not given, this examines the available implementations to find one + with the required feature set. If no implementation can be found, raise an + :exc:`ImportError`. The features list must be a sequence of ``(feature, + version)`` pairs which are passed to the :meth:`hasFeature` method on available + :class:`DOMImplementation` objects. + +Some convenience constants are also provided: + + +.. data:: EMPTY_NAMESPACE + + The value used to indicate that no namespace is associated with a node in the + DOM. 
This is typically found as the :attr:`namespaceURI` of a node, or used as + the *namespaceURI* parameter to a namespaces-specific method. + + .. versionadded:: 2.2 + + +.. data:: XML_NAMESPACE + + The namespace URI associated with the reserved prefix ``xml``, as defined by + `Namespaces in XML <http://www.w3.org/TR/REC-xml-names/>`_ (section 4). + + .. versionadded:: 2.2 + + +.. data:: XMLNS_NAMESPACE + + The namespace URI for namespace declarations, as defined by `Document Object + Model (DOM) Level 2 Core Specification + <http://www.w3.org/TR/DOM-Level-2-Core/core.html>`_ (section 1.1.8). + + .. versionadded:: 2.2 + + +.. data:: XHTML_NAMESPACE + + The URI of the XHTML namespace as defined by `XHTML 1.0: The Extensible + HyperText Markup Language <http://www.w3.org/TR/xhtml1/>`_ (section 3.1.1). + + .. versionadded:: 2.2 + +In addition, :mod:`xml.dom` contains a base :class:`Node` class and the DOM +exception classes. The :class:`Node` class provided by this module does not +implement any of the methods or attributes defined by the DOM specification; +concrete DOM implementations must provide those. The :class:`Node` class +provided as part of this module does provide the constants used for the +:attr:`nodeType` attribute on concrete :class:`Node` objects; they are located +within the class rather than at the module level to conform with the DOM +specifications. + +.. % Should the Node documentation go here? + + +.. _dom-objects: + +Objects in the DOM +------------------ + +The definitive documentation for the DOM is the DOM specification from the W3C. + +Note that DOM attributes may also be manipulated as nodes instead of as simple +strings. It is fairly rare that you must do this, however, so this usage is not +yet documented. 
+ ++--------------------------------+-----------------------------------+---------------------------------+ +| Interface | Section | Purpose | ++================================+===================================+=================================+ +| :class:`DOMImplementation` | :ref:`dom-implementation-objects` | Interface to the underlying | +| | | implementation. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`Node` | :ref:`dom-node-objects` | Base interface for most objects | +| | | in a document. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`NodeList` | :ref:`dom-nodelist-objects` | Interface for a sequence of | +| | | nodes. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`DocumentType` | :ref:`dom-documenttype-objects` | Information about the | +| | | declarations needed to process | +| | | a document. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`Document` | :ref:`dom-document-objects` | Object which represents an | +| | | entire document. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`Element` | :ref:`dom-element-objects` | Element nodes in the document | +| | | hierarchy. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`Attr` | :ref:`dom-attr-objects` | Attribute value nodes on | +| | | element nodes. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`Comment` | :ref:`dom-comment-objects` | Representation of comments in | +| | | the source document. 
| ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`Text` | :ref:`dom-text-objects` | Nodes containing textual | +| | | content from the document. | ++--------------------------------+-----------------------------------+---------------------------------+ +| :class:`ProcessingInstruction` | :ref:`dom-pi-objects` | Processing instruction | +| | | representation. | ++--------------------------------+-----------------------------------+---------------------------------+ + +An additional section describes the exceptions defined for working with the DOM +in Python. + + +.. _dom-implementation-objects: + +DOMImplementation Objects +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The :class:`DOMImplementation` interface provides a way for applications to +determine the availability of particular features in the DOM they are using. +DOM Level 2 added the ability to create new :class:`Document` and +:class:`DocumentType` objects using the :class:`DOMImplementation` as well. + + +.. method:: DOMImplementation.hasFeature(feature, version) + + Return true if the feature identified by the pair of strings *feature* and + *version* is implemented. + + +.. method:: DOMImplementation.createDocument(namespaceUri, qualifiedName, doctype) + + Return a new :class:`Document` object (the root of the DOM), with a child + :class:`Element` object having the given *namespaceUri* and *qualifiedName*. The + *doctype* must be a :class:`DocumentType` object created by + :meth:`createDocumentType`, or ``None``. In the Python DOM API, the first two + arguments can also be ``None`` in order to indicate that no :class:`Element` + child is to be created. + + +.. method:: DOMImplementation.createDocumentType(qualifiedName, publicId, systemId) + + Return a new :class:`DocumentType` object that encapsulates the given + *qualifiedName*, *publicId*, and *systemId* strings, representing the + information contained in an XML document type declaration. + + +.. 
_dom-node-objects: + +Node Objects +^^^^^^^^^^^^ + +All of the components of an XML document are subclasses of :class:`Node`. + + +.. attribute:: Node.nodeType + + An integer representing the node type. Symbolic constants for the types are on + the :class:`Node` object: :const:`ELEMENT_NODE`, :const:`ATTRIBUTE_NODE`, + :const:`TEXT_NODE`, :const:`CDATA_SECTION_NODE`, :const:`ENTITY_NODE`, + :const:`PROCESSING_INSTRUCTION_NODE`, :const:`COMMENT_NODE`, + :const:`DOCUMENT_NODE`, :const:`DOCUMENT_TYPE_NODE`, :const:`NOTATION_NODE`. + This is a read-only attribute. + + +.. attribute:: Node.parentNode + + The parent of the current node, or ``None`` for the document node. The value is + always a :class:`Node` object or ``None``. For :class:`Element` nodes, this + will be the parent element, except for the root element, in which case it will + be the :class:`Document` object. For :class:`Attr` nodes, this is always + ``None``. This is a read-only attribute. + + +.. attribute:: Node.attributes + + A :class:`NamedNodeMap` of attribute objects. Only elements have actual values + for this; others provide ``None`` for this attribute. This is a read-only + attribute. + + +.. attribute:: Node.previousSibling + + The node that immediately precedes this one with the same parent. For + instance the element with an end-tag that comes just before the *self* + element's start-tag. Of course, XML documents are made up of more than just + elements so the previous sibling could be text, a comment, or something else. + If this node is the first child of the parent, this attribute will be + ``None``. This is a read-only attribute. + + +.. attribute:: Node.nextSibling + + The node that immediately follows this one with the same parent. See also + :attr:`previousSibling`. If this is the last child of the parent, this + attribute will be ``None``. This is a read-only attribute. + + +.. attribute:: Node.childNodes + + A list of nodes contained within this node. This is a read-only attribute. 
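As a hedged illustration of the navigation attributes described above (``parentNode``, ``previousSibling``, ``nextSibling``, ``childNodes`` and ``attributes``), the following sketch uses :mod:`xml.dom.minidom` as the concrete implementation. The sample document and the variable names are assumptions made for this example only; other DOM implementations may differ in detail:

```python
from xml.dom.minidom import parseString

# Hypothetical three-child document: an element, a text node, an element.
doc = parseString("<root><a/>text<b/></root>")
root = doc.documentElement

children = root.childNodes
a, text, b = children[0], children[1], children[2]

# Siblings share a parent; the ends of the child list have no neighbour.
a_has_no_previous = a.previousSibling is None
a_next_is_text = a.nextSibling is text
b_previous_is_text = b.previousSibling is text
b_has_no_next = b.nextSibling is None

# The root element's parent is the Document object itself.
root_parent_is_doc = root.parentNode is doc

# Only elements carry attributes; a text node reports None.
text_attributes = text.attributes
```

Note that the previous sibling of ``b`` is the text node, not the element ``a``: siblings are not restricted to elements.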
+ + +.. attribute:: Node.firstChild + + The first child of the node, if there are any, or ``None``. This is a read-only + attribute. + + +.. attribute:: Node.lastChild + + The last child of the node, if there are any, or ``None``. This is a read-only + attribute. + + +.. attribute:: Node.localName + + The part of the :attr:`tagName` following the colon if there is one, else the + entire :attr:`tagName`. The value is a string. + + +.. attribute:: Node.prefix + + The part of the :attr:`tagName` preceding the colon if there is one, else the + empty string. The value is a string, or ``None``. + + +.. attribute:: Node.namespaceURI + + The namespace associated with the element name. This will be a string or + ``None``. This is a read-only attribute. + + +.. attribute:: Node.nodeName + + This has a different meaning for each node type; see the DOM specification for + details. You can always get the information you would get here from another + property such as the :attr:`tagName` property for elements or the :attr:`name` + property for attributes. For all node types, the value of this attribute will be + either a string or ``None``. This is a read-only attribute. + + +.. attribute:: Node.nodeValue + + This has a different meaning for each node type; see the DOM specification for + details. The situation is similar to that with :attr:`nodeName`. The value is + a string or ``None``. + + +.. method:: Node.hasAttributes() + + Returns true if the node has any attributes. + + +.. method:: Node.hasChildNodes() + + Returns true if the node has any child nodes. + + +.. method:: Node.isSameNode(other) + + Returns true if *other* refers to the same node as this node. This is especially + useful for DOM implementations which use any sort of proxy architecture (because + more than one object can refer to the same node). + + .. note:: + + This is based on a proposed DOM Level 3 API which is still in the "working + draft" stage, but this particular interface appears uncontroversial. 
Changes + from the W3C will not necessarily affect this method in the Python DOM interface + (though any new W3C API for this would also be supported). + + +.. method:: Node.appendChild(newChild) + + Add a new child node to this node at the end of the list of children, returning + *newChild*. + + +.. method:: Node.insertBefore(newChild, refChild) + + Insert a new child node before an existing child. It must be the case that + *refChild* is a child of this node; if not, :exc:`ValueError` is raised. + *newChild* is returned. If *refChild* is ``None``, it inserts *newChild* at the + end of the children's list. + + +.. method:: Node.removeChild(oldChild) + + Remove a child node. *oldChild* must be a child of this node; if not, + :exc:`ValueError` is raised. *oldChild* is returned on success. If *oldChild* + will not be used further, its :meth:`unlink` method should be called. + + +.. method:: Node.replaceChild(newChild, oldChild) + + Replace an existing node with a new node. It must be the case that *oldChild* + is a child of this node; if not, :exc:`ValueError` is raised. + + +.. method:: Node.normalize() + + Join adjacent text nodes so that all stretches of text are stored as single + :class:`Text` instances. This simplifies processing text from a DOM tree for + many applications. + + .. versionadded:: 2.1 + + +.. method:: Node.cloneNode(deep) + + Clone this node. Setting *deep* means to clone all child nodes as well. This + returns the clone. + + +.. _dom-nodelist-objects: + +NodeList Objects +^^^^^^^^^^^^^^^^ + +A :class:`NodeList` represents a sequence of nodes. These objects are used in +two ways in the DOM Core recommendation: the :class:`Element` object provides +one as its list of child nodes, and the :meth:`getElementsByTagName` and +:meth:`getElementsByTagNameNS` methods of :class:`Node` return objects with this +interface to represent query results. + +The DOM Level 2 recommendation defines one method and one attribute for these +objects: + + +.. 
method:: NodeList.item(i) + + Return the *i*'th item from the sequence, if there is one, or ``None``. The + index *i* is not allowed to be less than zero or greater than or equal to the + length of the sequence. + + +.. attribute:: NodeList.length + + The number of nodes in the sequence. + +In addition, the Python DOM interface requires that some additional support is +provided to allow :class:`NodeList` objects to be used as Python sequences. All +:class:`NodeList` implementations must include support for :meth:`__len__` and +:meth:`__getitem__`; this allows iteration over the :class:`NodeList` in +:keyword:`for` statements and proper support for the :func:`len` built-in +function. + +If a DOM implementation supports modification of the document, the +:class:`NodeList` implementation must also support the :meth:`__setitem__` and +:meth:`__delitem__` methods. + + +.. _dom-documenttype-objects: + +DocumentType Objects +^^^^^^^^^^^^^^^^^^^^ + +Information about the notations and entities declared by a document (including +the external subset if the parser uses it and can provide the information) is +available from a :class:`DocumentType` object. The :class:`DocumentType` for a +document is available from the :class:`Document` object's :attr:`doctype` +attribute; if there is no ``DOCTYPE`` declaration for the document, the +document's :attr:`doctype` attribute will be set to ``None`` instead of an +instance of this interface. + +:class:`DocumentType` is a specialization of :class:`Node`, and adds the +following attributes: + + +.. attribute:: DocumentType.publicId + + The public identifier for the external subset of the document type definition. + This will be a string or ``None``. + + +.. attribute:: DocumentType.systemId + + The system identifier for the external subset of the document type definition. + This will be a URI as a string, or ``None``. + + +.. attribute:: DocumentType.internalSubset + + A string giving the complete internal subset from the document.
This does not + include the brackets which enclose the subset. If the document has no internal + subset, this should be ``None``. + + +.. attribute:: DocumentType.name + + The name of the root element as given in the ``DOCTYPE`` declaration, if + present. + + +.. attribute:: DocumentType.entities + + This is a :class:`NamedNodeMap` giving the definitions of external entities. + For entity names defined more than once, only the first definition is provided + (others are ignored as required by the XML recommendation). This may be + ``None`` if the information is not provided by the parser, or if no entities are + defined. + + +.. attribute:: DocumentType.notations + + This is a :class:`NamedNodeMap` giving the definitions of notations. For + notation names defined more than once, only the first definition is provided + (others are ignored as required by the XML recommendation). This may be + ``None`` if the information is not provided by the parser, or if no notations + are defined. + + +.. _dom-document-objects: + +Document Objects +^^^^^^^^^^^^^^^^ + +A :class:`Document` represents an entire XML document, including its constituent +elements, attributes, processing instructions, comments etc. Remember that it +inherits properties from :class:`Node`. + + +.. attribute:: Document.documentElement + + The one and only root element of the document. + + +.. method:: Document.createElement(tagName) + + Create and return a new element node. The element is not inserted into the + document when it is created. You need to explicitly insert it with one of the + other methods such as :meth:`insertBefore` or :meth:`appendChild`. + + +.. method:: Document.createElementNS(namespaceURI, tagName) + + Create and return a new element with a namespace. The *tagName* may have a + prefix. The element is not inserted into the document when it is created. You + need to explicitly insert it with one of the other methods such as + :meth:`insertBefore` or :meth:`appendChild`. + + +..
method:: Document.createTextNode(data) + + Create and return a text node containing the data passed as a parameter. As + with the other creation methods, this one does not insert the node into the + tree. + + +.. method:: Document.createComment(data) + + Create and return a comment node containing the data passed as a parameter. As + with the other creation methods, this one does not insert the node into the + tree. + + +.. method:: Document.createProcessingInstruction(target, data) + + Create and return a processing instruction node containing the *target* and + *data* passed as parameters. As with the other creation methods, this one does + not insert the node into the tree. + + +.. method:: Document.createAttribute(name) + + Create and return an attribute node. This method does not associate the + attribute node with any particular element. You must use + :meth:`setAttributeNode` on the appropriate :class:`Element` object to use the + newly created attribute instance. + + +.. method:: Document.createAttributeNS(namespaceURI, qualifiedName) + + Create and return an attribute node with a namespace. The *qualifiedName* may have a + prefix. This method does not associate the attribute node with any particular + element. You must use :meth:`setAttributeNode` on the appropriate + :class:`Element` object to use the newly created attribute instance. + + +.. method:: Document.getElementsByTagName(tagName) + + Search for all descendants (direct children, children's children, etc.) with a + particular element type name. + + +.. method:: Document.getElementsByTagNameNS(namespaceURI, localName) + + Search for all descendants (direct children, children's children, etc.) with a + particular namespace URI and localname. The localname is the part of the + qualified name after the prefix. + + +.. _dom-element-objects: + +Element Objects +^^^^^^^^^^^^^^^ + +:class:`Element` is a subclass of :class:`Node`, so inherits all the attributes +of that class. + + +..
attribute:: Element.tagName + + The element type name. In a namespace-using document it may have colons in it. + The value is a string. + + +.. method:: Element.getElementsByTagName(tagName) + + Same as equivalent method in the :class:`Document` class. + + +.. method:: Element.getElementsByTagNameNS(namespaceURI, localName) + + Same as equivalent method in the :class:`Document` class. + + +.. method:: Element.hasAttribute(name) + + Returns true if the element has an attribute named by *name*. + + +.. method:: Element.hasAttributeNS(namespaceURI, localName) + + Returns true if the element has an attribute named by *namespaceURI* and + *localName*. + + +.. method:: Element.getAttribute(name) + + Return the value of the attribute named by *name* as a string. If no such + attribute exists, an empty string is returned, as if the attribute had no value. + + +.. method:: Element.getAttributeNode(attrname) + + Return the :class:`Attr` node for the attribute named by *attrname*. + + +.. method:: Element.getAttributeNS(namespaceURI, localName) + + Return the value of the attribute named by *namespaceURI* and *localName* as a + string. If no such attribute exists, an empty string is returned, as if the + attribute had no value. + + +.. method:: Element.getAttributeNodeNS(namespaceURI, localName) + + Return an attribute value as a node, given a *namespaceURI* and *localName*. + + +.. method:: Element.removeAttribute(name) + + Remove an attribute by name. No exception is raised if there is no matching + attribute. + + +.. method:: Element.removeAttributeNode(oldAttr) + + Remove and return *oldAttr* from the attribute list, if present. If *oldAttr* is + not present, :exc:`NotFoundErr` is raised. + + +.. method:: Element.removeAttributeNS(namespaceURI, localName) + + Remove an attribute by name. Note that it uses a localName, not a qname. No + exception is raised if there is no matching attribute. + + +.. method:: Element.setAttribute(name, value) + + Set an attribute value from a string.
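The string-based attribute methods described so far can be tried out with
:mod:`xml.dom.minidom`, which is one implementation of this interface. The
following is only a minimal sketch: the choice of :mod:`xml.dom.minidom`, the
document content and the attribute names (``id``, ``colour``, ``size``) are
assumptions made for illustration, and other DOM implementations may differ in
details.

```python
from xml.dom import minidom

# Parse a tiny, made-up document and take its root element.
doc = minidom.parseString('<item id="42"/>')
item = doc.documentElement

# setAttribute() creates the attribute, or overwrites an existing value.
item.setAttribute("colour", "red")

present = item.hasAttribute("colour")
value = item.getAttribute("id")

# An attribute that does not exist reads back as an empty string, not None.
missing = item.getAttribute("size")

# Remove an attribute that is known to exist, then check again.
item.removeAttribute("colour")
still_present = item.hasAttribute("colour")
```

Because :meth:`getAttribute` cannot distinguish a missing attribute from an
empty one, :meth:`hasAttribute` is the safer test for presence.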
+ + +.. method:: Element.setAttributeNode(newAttr) + + Add a new attribute node to the element, replacing an existing attribute if + necessary if the :attr:`name` attribute matches. If a replacement occurs, the + old attribute node will be returned. If *newAttr* is already in use, + :exc:`InuseAttributeErr` will be raised. + + +.. method:: Element.setAttributeNodeNS(newAttr) + + Add a new attribute node to the element, replacing an existing attribute if + necessary if the :attr:`namespaceURI` and :attr:`localName` attributes match. + If a replacement occurs, the old attribute node will be returned. If *newAttr* + is already in use, :exc:`InuseAttributeErr` will be raised. + + +.. method:: Element.setAttributeNS(namespaceURI, qname, value) + + Set an attribute value from a string, given a *namespaceURI* and a *qname*. + Note that a qname is the whole attribute name. This is different than above. + + +.. _dom-attr-objects: + +Attr Objects +^^^^^^^^^^^^ + +:class:`Attr` inherits from :class:`Node`, so inherits all its attributes. + + +.. attribute:: Attr.name + + The attribute name. In a namespace-using document it may have colons in it. + + +.. attribute:: Attr.localName + + The part of the name following the colon if there is one, else the entire name. + This is a read-only attribute. + + +.. attribute:: Attr.prefix + + The part of the name preceding the colon if there is one, else the empty string. + + +.. _dom-attributelist-objects: + +NamedNodeMap Objects +^^^^^^^^^^^^^^^^^^^^ + +:class:`NamedNodeMap` does *not* inherit from :class:`Node`. + + +.. attribute:: NamedNodeMap.length + + The length of the attribute list. + + +.. method:: NamedNodeMap.item(index) + + Return an attribute with a particular index. The order you get the attributes + in is arbitrary but will be consistent for the life of a DOM. Each item is an + attribute node. Get its value with the :attr:`value` attribute. + +There are also experimental methods that give this class more mapping behavior. 
+You can use them or you can use the standardized :meth:`getAttribute\*` family +of methods on the :class:`Element` objects. + + +.. _dom-comment-objects: + +Comment Objects +^^^^^^^^^^^^^^^ + +:class:`Comment` represents a comment in the XML document. It is a subclass of +:class:`Node`, but cannot have child nodes. + + +.. attribute:: Comment.data + + The content of the comment as a string. The attribute contains all characters + between the leading ``<!-``\ ``-`` and trailing ``-``\ ``->``, but does not + include them. + + +.. _dom-text-objects: + +Text and CDATASection Objects +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The :class:`Text` interface represents text in the XML document. If the parser +and DOM implementation support the DOM's XML extension, portions of the text +enclosed in CDATA marked sections are stored in :class:`CDATASection` objects. +These two interfaces are identical, but provide different values for the +:attr:`nodeType` attribute. + +These interfaces extend the :class:`Node` interface. They cannot have child +nodes. + + +.. attribute:: Text.data + + The content of the text node as a string. + +.. note:: + + The use of a :class:`CDATASection` node does not indicate that the node + represents a complete CDATA marked section, only that the content of the node + was part of a CDATA section. A single CDATA section may be represented by more + than one node in the document tree. There is no way to determine whether two + adjacent :class:`CDATASection` nodes represent different CDATA marked sections. + + +.. _dom-pi-objects: + +ProcessingInstruction Objects +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Represents a processing instruction in the XML document; this inherits from the +:class:`Node` interface and cannot have child nodes. + + +.. attribute:: ProcessingInstruction.target + + The content of the processing instruction up to the first whitespace character. + This is a read-only attribute. + + +.. 
attribute:: ProcessingInstruction.data + + The content of the processing instruction following the first whitespace + character. + + +.. _dom-exceptions: + +Exceptions +^^^^^^^^^^ + +.. versionadded:: 2.1 + +The DOM Level 2 recommendation defines a single exception, :exc:`DOMException`, +and a number of constants that allow applications to determine what sort of +error occurred. :exc:`DOMException` instances carry a :attr:`code` attribute +that provides the appropriate value for the specific exception. + +The Python DOM interface provides the constants, but also expands the set of +exceptions so that a specific exception exists for each of the exception codes +defined by the DOM. The implementations must raise the appropriate specific +exception, each of which carries the appropriate value for the :attr:`code` +attribute. + + +.. exception:: DOMException + + Base exception class used for all specific DOM exceptions. This exception class + cannot be directly instantiated. + + +.. exception:: DomstringSizeErr + + Raised when a specified range of text does not fit into a string. This is not + known to be used in the Python DOM implementations, but may be received from DOM + implementations not written in Python. + + +.. exception:: HierarchyRequestErr + + Raised when an attempt is made to insert a node where the node type is not + allowed. + + +.. exception:: IndexSizeErr + + Raised when an index or size parameter to a method is negative or exceeds the + allowed values. + + +.. exception:: InuseAttributeErr + + Raised when an attempt is made to insert an :class:`Attr` node that is already + present elsewhere in the document. + + +.. exception:: InvalidAccessErr + + Raised if a parameter or an operation is not supported on the underlying object. + + +.. exception:: InvalidCharacterErr + + This exception is raised when a string parameter contains a character that is + not permitted in the context it's being used in by the XML 1.0 recommendation. 
+ For example, attempting to create an :class:`Element` node with a space in the + element type name will cause this error to be raised. + + +.. exception:: InvalidModificationErr + + Raised when an attempt is made to modify the type of a node. + + +.. exception:: InvalidStateErr + + Raised when an attempt is made to use an object that is not defined or is no + longer usable. + + +.. exception:: NamespaceErr + + If an attempt is made to change any object in a way that is not permitted with + regard to the `Namespaces in XML <http://www.w3.org/TR/REC-xml-names/>`_ + recommendation, this exception is raised. + + +.. exception:: NotFoundErr + + Exception when a node does not exist in the referenced context. For example, + :meth:`NamedNodeMap.removeNamedItem` will raise this if the node passed in does + not exist in the map. + + +.. exception:: NotSupportedErr + + Raised when the implementation does not support the requested type of object or + operation. + + +.. exception:: NoDataAllowedErr + + This is raised if data is specified for a node which does not support data. + + .. % XXX a better explanation is needed! + + +.. exception:: NoModificationAllowedErr + + Raised on attempts to modify an object where modifications are not allowed (such + as for read-only nodes). + + +.. exception:: SyntaxErr + + Raised when an invalid or illegal string is specified. + + .. % XXX how is this different from InvalidCharacterErr ??? + + +.. exception:: WrongDocumentErr + + Raised when a node is inserted in a different document than it currently belongs + to, and the implementation does not support migrating the node from one document + to the other. 
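As a rough illustration of how the specific exceptions relate to
:exc:`DOMException` and its :attr:`code` attribute, the sketch below provokes a
:exc:`HierarchyRequestErr`. It assumes :mod:`xml.dom.minidom` as the DOM
implementation and uses a made-up one-element document; which exception a given
operation raises can vary between implementations.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root/>")
root = doc.documentElement

caught = None
try:
    # A Document node is not an allowed child of an element, so this
    # insertion is rejected by minidom.
    root.appendChild(doc)
except xml.dom.DOMException as exc:
    # Catching the base class also catches every specific DOM exception.
    caught = exc

# The specific exception class carries the matching numeric code.
is_hierarchy_error = isinstance(caught, xml.dom.HierarchyRequestErr)
code_matches = caught.code == xml.dom.HIERARCHY_REQUEST_ERR
```

Catching :exc:`DOMException` and inspecting :attr:`code` mirrors the W3C model,
while catching the specific subclass is usually the more natural style in
Python.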
+ +The exception codes defined in the DOM recommendation map to the exceptions +described above according to this table: + ++--------------------------------------+---------------------------------+ +| Constant | Exception | ++======================================+=================================+ +| :const:`DOMSTRING_SIZE_ERR` | :exc:`DomstringSizeErr` | ++--------------------------------------+---------------------------------+ +| :const:`HIERARCHY_REQUEST_ERR` | :exc:`HierarchyRequestErr` | ++--------------------------------------+---------------------------------+ +| :const:`INDEX_SIZE_ERR` | :exc:`IndexSizeErr` | ++--------------------------------------+---------------------------------+ +| :const:`INUSE_ATTRIBUTE_ERR` | :exc:`InuseAttributeErr` | ++--------------------------------------+---------------------------------+ +| :const:`INVALID_ACCESS_ERR` | :exc:`InvalidAccessErr` | ++--------------------------------------+---------------------------------+ +| :const:`INVALID_CHARACTER_ERR` | :exc:`InvalidCharacterErr` | ++--------------------------------------+---------------------------------+ +| :const:`INVALID_MODIFICATION_ERR` | :exc:`InvalidModificationErr` | ++--------------------------------------+---------------------------------+ +| :const:`INVALID_STATE_ERR` | :exc:`InvalidStateErr` | ++--------------------------------------+---------------------------------+ +| :const:`NAMESPACE_ERR` | :exc:`NamespaceErr` | ++--------------------------------------+---------------------------------+ +| :const:`NOT_FOUND_ERR` | :exc:`NotFoundErr` | ++--------------------------------------+---------------------------------+ +| :const:`NOT_SUPPORTED_ERR` | :exc:`NotSupportedErr` | ++--------------------------------------+---------------------------------+ +| :const:`NO_DATA_ALLOWED_ERR` | :exc:`NoDataAllowedErr` | ++--------------------------------------+---------------------------------+ +| :const:`NO_MODIFICATION_ALLOWED_ERR` | :exc:`NoModificationAllowedErr` | 
++--------------------------------------+---------------------------------+ +| :const:`SYNTAX_ERR` | :exc:`SyntaxErr` | ++--------------------------------------+---------------------------------+ +| :const:`WRONG_DOCUMENT_ERR` | :exc:`WrongDocumentErr` | ++--------------------------------------+---------------------------------+ + + +.. _dom-conformance: + +Conformance +----------- + +This section describes the conformance requirements and relationships between +the Python DOM API, the W3C DOM recommendations, and the OMG IDL mapping for +Python. + + +.. _dom-type-mapping: + +Type Mapping +^^^^^^^^^^^^ + +The primitive IDL types used in the DOM specification are mapped to Python types +according to the following table. + ++------------------+-------------------------------------------+ +| IDL Type | Python Type | ++==================+===========================================+ +| ``boolean`` | ``IntegerType`` (with a value of ``0`` or | +| | ``1``) | ++------------------+-------------------------------------------+ +| ``int`` | ``IntegerType`` | ++------------------+-------------------------------------------+ +| ``long int`` | ``IntegerType`` | ++------------------+-------------------------------------------+ +| ``unsigned int`` | ``IntegerType`` | ++------------------+-------------------------------------------+ + +Additionally, the :class:`DOMString` defined in the recommendation is mapped to +a Python string or Unicode string. Applications should be able to handle +Unicode whenever a string is returned from the DOM. + +The IDL :keyword:`null` value is mapped to ``None``, which may be accepted or +provided by the implementation whenever :keyword:`null` is allowed by the API. + + +.. _dom-accessor-methods: + +Accessor Methods +^^^^^^^^^^^^^^^^ + +The mapping from OMG IDL to Python defines accessor functions for IDL +:keyword:`attribute` declarations in much the way the Java mapping does. 
+Mapping the IDL declarations :: + + readonly attribute string someValue; + attribute string anotherValue; + +yields three accessor functions: a "get" method for :attr:`someValue` +(:meth:`_get_someValue`), and "get" and "set" methods for :attr:`anotherValue` +(:meth:`_get_anotherValue` and :meth:`_set_anotherValue`). The mapping, in +particular, does not require that the IDL attributes are accessible as normal +Python attributes: ``object.someValue`` is *not* required to work, and may +raise an :exc:`AttributeError`. + +The Python DOM API, however, *does* require that normal attribute access work. +This means that the typical surrogates generated by Python IDL compilers are not +likely to work, and wrapper objects may be needed on the client if the DOM +objects are accessed via CORBA. While this does require some additional +consideration for CORBA DOM clients, the implementers with experience using DOM +over CORBA from Python do not consider this a problem. Attributes that are +declared :keyword:`readonly` may not restrict write access in all DOM +implementations. + +In the Python DOM API, accessor functions are not required. If provided, they +should take the form defined by the Python IDL mapping, but these methods are +considered unnecessary since the attributes are accessible directly from Python. +"Set" accessors should never be provided for :keyword:`readonly` attributes. + +The IDL definitions do not fully embody the requirements of the W3C DOM API, +such as the notion of certain objects, such as the return value of +:meth:`getElementsByTagName`, being "live". The Python DOM API does not require +implementations to enforce such requirements. 
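The requirement that IDL attributes work as normal Python attributes, together
with the sequence support required of :class:`NodeList`, can be sketched as
follows. This assumes :mod:`xml.dom.minidom` as the implementation; the
document and the element names ``list`` and ``entry`` are invented for the
example.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><entry>a</entry><entry>b</entry></list>")

# IDL attributes are read as ordinary Python attributes; no _get_*() call
# is needed.
root_name = doc.documentElement.tagName

entries = doc.getElementsByTagName("entry")

# The DOM-style length attribute and Python's len() agree.
count_attr = entries.length
count_len = len(entries)

# A NodeList can be iterated directly in a for statement or comprehension.
texts = [entry.firstChild.data for entry in entries]

# item() returns None for an index outside the list instead of raising.
out_of_range = entries.item(5)
```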
+ diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst new file mode 100644 index 0000000..ead8d29 --- /dev/null +++ b/Doc/library/xml.etree.elementtree.rst @@ -0,0 +1,444 @@ + +:mod:`xml.etree.ElementTree` --- The ElementTree XML API +======================================================== + +.. module:: xml.etree.ElementTree + :synopsis: Implementation of the ElementTree API. +.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com> + + +.. versionadded:: 2.5 + +The Element type is a flexible container object, designed to store hierarchical +data structures in memory. The type can be described as a cross between a list +and a dictionary. + +Each element has a number of properties associated with it: + +* a tag which is a string identifying what kind of data this element represents + (the element type, in other words). + +* a number of attributes, stored in a Python dictionary. + +* a text string. + +* an optional tail string. + +* a number of child elements, stored in a Python sequence + +To create an element instance, use the Element or SubElement factory functions. + +The :class:`ElementTree` class can be used to wrap an element structure, and +convert it from and to XML. + +A C implementation of this API is available as :mod:`xml.etree.cElementTree`. + + +.. _elementtree-functions: + +Functions +--------- + + +.. function:: Comment([text]) + + Comment element factory. This factory function creates a special element that + will be serialized as an XML comment. The comment string can be either an 8-bit + ASCII string or a Unicode string. *text* is a string containing the comment + string. Returns an element instance representing a comment. + + +.. function:: dump(elem) + + Writes an element tree or element structure to sys.stdout. This function should + be used for debugging only. + + The exact output format is implementation dependent. In this version, it's + written as an ordinary XML file. 
+ + *elem* is an element tree or an individual element. + + +.. function:: Element(tag[, attrib][, **extra]) + + Element factory. This function returns an object implementing the standard + Element interface. The exact class or type of that object is implementation + dependent, but it will always be compatible with the _ElementInterface class in + this module. + + The element name, attribute names, and attribute values can be either 8-bit + ASCII strings or Unicode strings. *tag* is the element name. *attrib* is an + optional dictionary, containing element attributes. *extra* contains additional + attributes, given as keyword arguments. Returns an element instance. + + +.. function:: fromstring(text) + + Parses an XML section from a string constant. Same as XML. *text* is a string + containing XML data. Returns an Element instance. + + +.. function:: iselement(element) + + Checks if an object appears to be a valid element object. *element* is an + element instance. Returns a true value if this is an element object. + + +.. function:: iterparse(source[, events]) + + Parses an XML section into an element tree incrementally, and reports what's + going on to the user. *source* is a filename or file object containing XML data. + *events* is a list of events to report back. If omitted, only "end" events are + reported. Returns an iterator providing ``(event, elem)`` pairs. + + +.. function:: parse(source[, parser]) + + Parses an XML section into an element tree. *source* is a filename or file + object containing XML data. *parser* is an optional parser instance. If not + given, the standard XMLTreeBuilder parser is used. Returns an ElementTree + instance. + + +.. function:: ProcessingInstruction(target[, text]) + + PI element factory. This factory function creates a special element that will + be serialized as an XML processing instruction. *target* is a string containing + the PI target. *text* is a string containing the PI contents, if given. 
Returns + an element instance, representing a processing instruction. + + +.. function:: SubElement(parent, tag[, attrib[, **extra]]) + + Subelement factory. This function creates an element instance, and appends it + to an existing element. + + The element name, attribute names, and attribute values can be either 8-bit + ASCII strings or Unicode strings. *parent* is the parent element. *tag* is the + subelement name. *attrib* is an optional dictionary, containing element + attributes. *extra* contains additional attributes, given as keyword arguments. + Returns an element instance. + + +.. function:: tostring(element[, encoding]) + + Generates a string representation of an XML element, including all subelements. + *element* is an Element instance. *encoding* is the output encoding (default is + US-ASCII). Returns an encoded string containing the XML data. + + +.. function:: XML(text) + + Parses an XML section from a string constant. This function can be used to + embed "XML literals" in Python code. *text* is a string containing XML data. + Returns an Element instance. + + +.. function:: XMLID(text) + + Parses an XML section from a string constant, and also returns a dictionary + which maps from element id:s to elements. *text* is a string containing XML + data. Returns a tuple containing an Element instance and a dictionary. + + +.. _elementtree-element-interface: + +The Element Interface +--------------------- + +Element objects returned by Element or SubElement have the following methods +and attributes. + + +.. attribute:: Element.tag + + A string identifying what kind of data this element represents (the element + type, in other words). + + +.. attribute:: Element.text + + The *text* attribute can be used to hold additional data associated with the + element. As the name implies this attribute is usually a string but may be any + application-specific object. 
If the element is created from an XML file the + attribute will contain any text found between the element tags. + + +.. attribute:: Element.tail + + The *tail* attribute can be used to hold additional data associated with the + element. This attribute is usually a string but may be any application-specific + object. If the element is created from an XML file the attribute will contain + any text found after the element's end tag and before the next tag. + + +.. attribute:: Element.attrib + + A dictionary containing the element's attributes. Note that while the *attrib* + value is always a real mutable Python dictionary, an ElementTree implementation + may choose to use another internal representation, and create the dictionary + only if someone asks for it. To take advantage of such implementations, use the + dictionary methods below whenever possible. + +The following dictionary-like methods work on the element attributes. + + +.. method:: Element.clear() + + Resets an element. This function removes all subelements, clears all + attributes, and sets the text and tail attributes to None. + + +.. method:: Element.get(key[, default=None]) + + Gets the element attribute named *key*. + + Returns the attribute value, or *default* if the attribute was not found. + + +.. method:: Element.items() + + Returns the element attributes as a sequence of (name, value) pairs. The + attributes are returned in an arbitrary order. + + +.. method:: Element.keys() + + Returns the elements attribute names as a list. The names are returned in an + arbitrary order. + + +.. method:: Element.set(key, value) + + Set the attribute *key* on the element to *value*. + +The following methods work on the element's children (subelements). + + +.. method:: Element.append(subelement) + + Adds the element *subelement* to the end of this elements internal list of + subelements. + + +.. method:: Element.find(match) + + Finds the first subelement matching *match*. *match* may be a tag name or path. 
+ Returns an element instance or ``None``. + + +.. method:: Element.findall(match) + + Finds all subelements matching *match*. *match* may be a tag name or path. + Returns an iterable yielding all matching elements in document order. + + +.. method:: Element.findtext(condition[, default=None]) + + Finds text for the first subelement matching *condition*. *condition* may be a + tag name or path. Returns the text content of the first matching element, or + *default* if no element was found. Note that if the matching element has no + text content an empty string is returned. + + +.. method:: Element.getchildren() + + Returns all subelements. The elements are returned in document order. + + +.. method:: Element.getiterator([tag=None]) + + Creates a tree iterator with the current element as the root. The iterator + iterates over this element and all elements below it that match the given tag. + If tag is ``None`` or ``'*'`` then all elements are iterated over. Returns an + iterable that provides element objects in document (depth first) order. + + +.. method:: Element.insert(index, element) + + Inserts a subelement at the given position in this element. + + +.. method:: Element.makeelement(tag, attrib) + + Creates a new element object of the same type as this element. Do not call this + method, use the SubElement factory function instead. + + +.. method:: Element.remove(subelement) + + Removes *subelement* from the element. Unlike the findXYZ methods this method + compares elements based on the instance identity, not on tag value or contents. + +Element objects also support the following sequence type methods for working +with subelements: :meth:`__delitem__`, :meth:`__getitem__`, :meth:`__setitem__`, +:meth:`__len__`. + +Caution: Because Element objects do not define a :meth:`__nonzero__` method, +elements with no subelements will test as ``False``. :: + + element = root.find('foo') + + if not element: # careful! 
+ print("element not found, or element has no subelements") + + if element is None: + print("element not found") + + +.. _elementtree-elementtree-objects: + +ElementTree Objects +------------------- + + +.. class:: ElementTree([element,] [file]) + + ElementTree wrapper class. This class represents an entire element hierarchy, + and adds some extra support for serialization to and from standard XML. + + *element* is the root element. The tree is initialized with the contents of the + XML *file* if given. + + +.. method:: ElementTree._setroot(element) + + Replaces the root element for this tree. This discards the current contents of + the tree, and replaces it with the given element. Use with care. *element* is + an element instance. + + +.. method:: ElementTree.find(path) + + Finds the first toplevel element with given tag. Same as getroot().find(path). + *path* is the element to look for. Returns the first matching element, or + ``None`` if no element was found. + + +.. method:: ElementTree.findall(path) + + Finds all toplevel elements with the given tag. Same as getroot().findall(path). + *path* is the element to look for. Returns a list or iterator containing all + matching elements, in document order. + + +.. method:: ElementTree.findtext(path[, default]) + + Finds the element text for the first toplevel element with given tag. Same as + getroot().findtext(path). *path* is the toplevel element to look for. *default* + is the value to return if the element was not found. Returns the text content of + the first matching element, or the default value if no element was found. Note + that if the element is found, but has no text content, this method returns + an empty string. + + +.. method:: ElementTree.getiterator([tag]) + + Creates and returns a tree iterator for the root element. The iterator loops + over all elements in this tree, in section order. *tag* is the tag to look for + (default is to return all elements). + + +..
method:: ElementTree.getroot() + + Returns the root element for this tree. + + +.. method:: ElementTree.parse(source[, parser]) + + Loads an external XML section into this element tree. *source* is a file name or + file object. *parser* is an optional parser instance. If not given, the + standard XMLTreeBuilder parser is used. Returns the section root element. + + +.. method:: ElementTree.write(file[, encoding]) + + Writes the element tree to a file, as XML. *file* is a file name, or a file + object opened for writing. *encoding* is the output encoding (default is + US-ASCII). + + +.. _elementtree-qname-objects: + +QName Objects +------------- + + +.. class:: QName(text_or_uri[, tag]) + + QName wrapper. This can be used to wrap a QName attribute value, in order to + get proper namespace handling on output. *text_or_uri* is a string containing + the QName value, in the form {uri}local, or, if the tag argument is given, the + URI part of a QName. If *tag* is given, the first argument is interpreted as a + URI, and this argument is interpreted as a local name. :class:`QName` instances + are opaque. + + +.. _elementtree-treebuilder-objects: + +TreeBuilder Objects +------------------- + + +.. class:: TreeBuilder([element_factory]) + + Generic element structure builder. This builder converts a sequence of start, + data, and end method calls to a well-formed element structure. You can use this + class to build an element structure using a custom XML parser, or a parser for + some other XML-like format. The *element_factory* is called to create new + Element instances when given. + + +.. method:: TreeBuilder.close() + + Flushes the parser buffers, and returns the toplevel document element. Returns an + Element instance. + + +.. method:: TreeBuilder.data(data) + + Adds text to the current element. *data* is a string. This should be either an + 8-bit string containing ASCII text, or a Unicode string. + + +.. method:: TreeBuilder.end(tag) + + Closes the current element. 
*tag* is the element name. Returns the closed + element. + + +.. method:: TreeBuilder.start(tag, attrs) + + Opens a new element. *tag* is the element name. *attrs* is a dictionary + containing element attributes. Returns the opened element. + + +.. _elementtree-xmltreebuilder-objects: + +XMLTreeBuilder Objects +---------------------- + + +.. class:: XMLTreeBuilder([html,] [target]) + + Element structure builder for XML source data, based on the expat parser. *html* + are predefined HTML entities. This flag is not supported by the current + implementation. *target* is the target object. If omitted, the builder uses an + instance of the standard TreeBuilder class. + + +.. method:: XMLTreeBuilder.close() + + Finishes feeding data to the parser. Returns an element structure. + + +.. method:: XMLTreeBuilder.doctype(name, pubid, system) + + Handles a doctype declaration. *name* is the doctype name. *pubid* is the public + identifier. *system* is the system identifier. + + +.. method:: XMLTreeBuilder.feed(data) + + Feeds data to the parser. *data* is encoded data. + diff --git a/Doc/library/xml.etree.rst b/Doc/library/xml.etree.rst new file mode 100644 index 0000000..e14c5f9 --- /dev/null +++ b/Doc/library/xml.etree.rst @@ -0,0 +1,25 @@ +:mod:`xml.etree` --- The ElementTree API for XML +================================================ + +.. module:: xml.etree + :synopsis: Package containing common ElementTree modules. +.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com> + + +.. versionadded:: 2.5 + +The ElementTree package is a simple, efficient, and quite popular library for +XML manipulation in Python. The :mod:`xml.etree` package contains the most +common components from the ElementTree API library. In the current release, +this package contains the :mod:`ElementTree`, :mod:`ElementPath`, and +:mod:`ElementInclude` modules from the full ElementTree distribution. + +.. % XXX To be continued! + + +.. 
seealso:: + + `ElementTree Overview <http://effbot.org/tag/elementtree>`_ + The home page for :mod:`ElementTree`. This includes links to additional + documentation, alternative implementations, and other add-ons. + diff --git a/Doc/library/xml.sax.handler.rst b/Doc/library/xml.sax.handler.rst new file mode 100644 index 0000000..bc287d1 --- /dev/null +++ b/Doc/library/xml.sax.handler.rst @@ -0,0 +1,402 @@ + +:mod:`xml.sax.handler` --- Base classes for SAX handlers +======================================================== + +.. module:: xml.sax.handler + :synopsis: Base classes for SAX event handlers. +.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. versionadded:: 2.0 + +The SAX API defines four kinds of handlers: content handlers, DTD handlers, +error handlers, and entity resolvers. Applications normally only need to +implement those interfaces whose events they are interested in; they can +implement the interfaces in a single object or in multiple objects. Handler +implementations should inherit from the base classes provided in the module +:mod:`xml.sax.handler`, so that all methods get default implementations. + + +.. class:: ContentHandler + + This is the main callback interface in SAX, and the one most important to + applications. The order of events in this interface mirrors the order of the + information in the document. + + +.. class:: DTDHandler + + Handle DTD events. + + This interface specifies only those DTD events required for basic parsing + (unparsed entities and attributes). + + +.. class:: EntityResolver + + Basic interface for resolving entities. If you create an object implementing + this interface, then register the object with your Parser, the parser will call + the method in your object to resolve all external entities. + + +.. class:: ErrorHandler + + Interface used by the parser to present error and warning messages to the + application. 
The methods of this object control whether errors are immediately + converted to exceptions or are handled in some other way. + +In addition to these classes, :mod:`xml.sax.handler` provides symbolic constants +for the feature and property names. + + +.. data:: feature_namespaces + + Value: ``"http://xml.org/sax/features/namespaces"`` --- true: Perform Namespace + processing. --- false: Optionally do not perform Namespace processing (implies + namespace-prefixes; default). --- access: (parsing) read-only; (not parsing) + read/write + + +.. data:: feature_namespace_prefixes + + Value: ``"http://xml.org/sax/features/namespace-prefixes"`` --- true: Report + the original prefixed names and attributes used for Namespace + declarations. --- false: Do not report attributes used for Namespace + declarations, and optionally do not report original prefixed names + (default). --- access: (parsing) read-only; (not parsing) read/write + + +.. data:: feature_string_interning + + Value: ``"http://xml.org/sax/features/string-interning"`` --- true: All element + names, prefixes, attribute names, Namespace URIs, and local names are interned + using the built-in intern function. --- false: Names are not necessarily + interned, although they may be (default). --- access: (parsing) read-only; (not + parsing) read/write + + +.. data:: feature_validation + + Value: ``"http://xml.org/sax/features/validation"`` --- true: Report all + validation errors (implies external-general-entities and + external-parameter-entities). --- false: Do not report validation errors. --- + access: (parsing) read-only; (not parsing) read/write + + +.. data:: feature_external_ges + + Value: ``"http://xml.org/sax/features/external-general-entities"`` --- true: + Include all external general (text) entities. --- false: Do not include + external general entities. --- access: (parsing) read-only; (not parsing) + read/write + + +.. 
data:: feature_external_pes + + Value: ``"http://xml.org/sax/features/external-parameter-entities"`` --- true: + Include all external parameter entities, including the external DTD subset. --- + false: Do not include any external parameter entities, even the external DTD + subset. --- access: (parsing) read-only; (not parsing) read/write + + +.. data:: all_features + + List of all features. + + +.. data:: property_lexical_handler + + Value: ``"http://xml.org/sax/properties/lexical-handler"`` --- data type: + xml.sax.sax2lib.LexicalHandler (not supported in Python 2) --- description: An + optional extension handler for lexical events like comments. --- access: + read/write + + +.. data:: property_declaration_handler + + Value: ``"http://xml.org/sax/properties/declaration-handler"`` --- data type: + xml.sax.sax2lib.DeclHandler (not supported in Python 2) --- description: An + optional extension handler for DTD-related events other than notations and + unparsed entities. --- access: read/write + + +.. data:: property_dom_node + + Value: ``"http://xml.org/sax/properties/dom-node"`` --- data type: + org.w3c.dom.Node (not supported in Python 2) --- description: When parsing, + the current DOM node being visited if this is a DOM iterator; when not parsing, + the root DOM node for iteration. --- access: (parsing) read-only; (not parsing) + read/write + + +.. data:: property_xml_string + + Value: ``"http://xml.org/sax/properties/xml-string"`` --- data type: String --- + description: The literal string of characters that was the source for the + current event. --- access: read-only + + +.. data:: all_properties + + List of all known property names. + + +.. _content-handler-objects: + +ContentHandler Objects +---------------------- + +Users are expected to subclass :class:`ContentHandler` to support their +application. The following methods are called by the parser on the appropriate +events in the input document: + + +.. 
method:: ContentHandler.setDocumentLocator(locator) + + Called by the parser to give the application a locator for locating the origin + of document events. + + SAX parsers are strongly encouraged (though not absolutely required) to supply a + locator: if it does so, it must supply the locator to the application by + invoking this method before invoking any of the other methods in the + DocumentHandler interface. + + The locator allows the application to determine the end position of any + document-related event, even if the parser is not reporting an error. Typically, + the application will use this information for reporting its own errors (such as + character content that does not match an application's business rules). The + information returned by the locator is probably not sufficient for use with a + search engine. + + Note that the locator will return correct information only during the invocation + of the events in this interface. The application should not attempt to use it at + any other time. + + +.. method:: ContentHandler.startDocument() + + Receive notification of the beginning of a document. + + The SAX parser will invoke this method only once, before any other methods in + this interface or in DTDHandler (except for :meth:`setDocumentLocator`). + + +.. method:: ContentHandler.endDocument() + + Receive notification of the end of a document. + + The SAX parser will invoke this method only once, and it will be the last method + invoked during the parse. The parser shall not invoke this method until it has + either abandoned parsing (because of an unrecoverable error) or reached the end + of input. + + +.. method:: ContentHandler.startPrefixMapping(prefix, uri) + + Begin the scope of a prefix-URI Namespace mapping. 
+ + The information from this event is not necessary for normal Namespace + processing: the SAX XML reader will automatically replace prefixes for element + and attribute names when the ``feature_namespaces`` feature is enabled (the + default). + + There are cases, however, when applications need to use prefixes in character + data or in attribute values, where they cannot safely be expanded automatically; + the :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events supply the + information to the application to expand prefixes in those contexts itself, if + necessary. + + .. % XXX This is not really the default, is it? MvL + + Note that :meth:`startPrefixMapping` and :meth:`endPrefixMapping` events are not + guaranteed to be properly nested relative to each other: all + :meth:`startPrefixMapping` events will occur before the corresponding + :meth:`startElement` event, and all :meth:`endPrefixMapping` events will occur + after the corresponding :meth:`endElement` event, but their order is not + guaranteed. + + +.. method:: ContentHandler.endPrefixMapping(prefix) + + End the scope of a prefix-URI mapping. + + See :meth:`startPrefixMapping` for details. This event will always occur after + the corresponding :meth:`endElement` event, but the order of + :meth:`endPrefixMapping` events is not otherwise guaranteed. + + +.. method:: ContentHandler.startElement(name, attrs) + + Signals the start of an element in non-namespace mode. + + The *name* parameter contains the raw XML 1.0 name of the element type as a + string and the *attrs* parameter holds an object of the :class:`Attributes` + interface (see :ref:`attributes-objects`) containing the attributes of + the element. The object passed as *attrs* may be re-used by the parser; holding + on to a reference to it is not a reliable way to keep a copy of the attributes. + To keep a copy of the attributes, use the :meth:`copy` method of the *attrs* + object. + + +.. 
method:: ContentHandler.endElement(name) + + Signals the end of an element in non-namespace mode. + + The *name* parameter contains the name of the element type, just as with the + :meth:`startElement` event. + + +.. method:: ContentHandler.startElementNS(name, qname, attrs) + + Signals the start of an element in namespace mode. + + The *name* parameter contains the name of the element type as a ``(uri, + localname)`` tuple, the *qname* parameter contains the raw XML 1.0 name used in + the source document, and the *attrs* parameter holds an instance of the + :class:`AttributesNS` interface (see :ref:`attributes-ns-objects`) + containing the attributes of the element. If no namespace is associated with + the element, the *uri* component of *name* will be ``None``. The object passed + as *attrs* may be re-used by the parser; holding on to a reference to it is not + a reliable way to keep a copy of the attributes. To keep a copy of the + attributes, use the :meth:`copy` method of the *attrs* object. + + Parsers may set the *qname* parameter to ``None``, unless the + ``feature_namespace_prefixes`` feature is activated. + + +.. method:: ContentHandler.endElementNS(name, qname) + + Signals the end of an element in namespace mode. + + The *name* parameter contains the name of the element type, just as with the + :meth:`startElementNS` method, likewise the *qname* parameter. + + +.. method:: ContentHandler.characters(content) + + Receive notification of character data. + + The Parser will call this method to report each chunk of character data. SAX + parsers may return all contiguous character data in a single chunk, or they may + split it into several chunks; however, all of the characters in any single event + must come from the same external entity so that the Locator provides useful + information. + + *content* may be a Unicode string or a byte string; the ``expat`` reader module + always produces Unicode strings. + + .. 
note:: + + The earlier SAX 1 interface provided by the Python XML Special Interest Group + used a more Java-like interface for this method. Since most parsers used from + Python did not take advantage of the older interface, the simpler signature was + chosen to replace it. To convert old code to the new interface, use *content* + instead of slicing content with the old *offset* and *length* parameters. + + +.. method:: ContentHandler.ignorableWhitespace(whitespace) + + Receive notification of ignorable whitespace in element content. + + Validating Parsers must use this method to report each chunk of ignorable + whitespace (see the W3C XML 1.0 recommendation, section 2.10): non-validating + parsers may also use this method if they are capable of parsing and using + content models. + + SAX parsers may return all contiguous whitespace in a single chunk, or they may + split it into several chunks; however, all of the characters in any single event + must come from the same external entity, so that the Locator provides useful + information. + + +.. method:: ContentHandler.processingInstruction(target, data) + + Receive notification of a processing instruction. + + The Parser will invoke this method once for each processing instruction found: + note that processing instructions may occur before or after the main document + element. + + A SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a + text declaration (XML 1.0, section 4.3.1) using this method. + + +.. method:: ContentHandler.skippedEntity(name) + + Receive notification of a skipped entity. + + The Parser will invoke this method once for each entity skipped. Non-validating + processors may skip entities if they have not seen the declarations (because, + for example, the entity was declared in an external DTD subset). All processors + may skip external entities, depending on the values of the + ``feature_external_ges`` and the ``feature_external_pes`` properties. + + +.. 
_dtd-handler-objects: + +DTDHandler Objects +------------------ + +:class:`DTDHandler` instances provide the following methods: + + +.. method:: DTDHandler.notationDecl(name, publicId, systemId) + + Handle a notation declaration event. + + +.. method:: DTDHandler.unparsedEntityDecl(name, publicId, systemId, ndata) + + Handle an unparsed entity declaration event. + + +.. _entity-resolver-objects: + +EntityResolver Objects +---------------------- + + +.. method:: EntityResolver.resolveEntity(publicId, systemId) + + Resolve the system identifier of an entity and return either the system + identifier to read from as a string, or an InputSource to read from. The default + implementation returns *systemId*. + + +.. _sax-error-handler: + +ErrorHandler Objects +-------------------- + +Objects with this interface are used to receive error and warning information +from the :class:`XMLReader`. If you create an object that implements this +interface, then register the object with your :class:`XMLReader`, the parser +will call the methods in your object to report all warnings and errors. There +are three levels of errors available: warnings, (possibly) recoverable errors, +and unrecoverable errors. All methods take a :exc:`SAXParseException` as the +only parameter. Errors and warnings may be converted to an exception by raising +the passed-in exception object. + + +.. method:: ErrorHandler.error(exception) + + Called when the parser encounters a recoverable error. If this method does not + raise an exception, parsing may continue, but further document information + should not be expected by the application. Allowing the parser to continue may + allow additional errors to be discovered in the input document. + + +.. method:: ErrorHandler.fatalError(exception) + + Called when the parser encounters an error it cannot recover from; parsing is + expected to terminate when this method returns. + + +.. 
method:: ErrorHandler.warning(exception) + + Called when the parser presents minor warning information to the application. + Parsing is expected to continue when this method returns, and document + information will continue to be passed to the application. Raising an exception + in this method will cause parsing to end. + diff --git a/Doc/library/xml.sax.reader.rst b/Doc/library/xml.sax.reader.rst new file mode 100644 index 0000000..d64a4fc --- /dev/null +++ b/Doc/library/xml.sax.reader.rst @@ -0,0 +1,386 @@ + +:mod:`xml.sax.xmlreader` --- Interface for XML parsers +====================================================== + +.. module:: xml.sax.xmlreader + :synopsis: Interface which SAX-compliant XML parsers must implement. +.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. versionadded:: 2.0 + +SAX parsers implement the :class:`XMLReader` interface. They are implemented in +a Python module, which must provide a function :func:`create_parser`. This +function is invoked by :func:`xml.sax.make_parser` with no arguments to create +a new parser object. + + +.. class:: XMLReader() + + Base class which can be inherited by SAX parsers. + + +.. class:: IncrementalParser() + + In some cases, it is desirable not to parse an input source at once, but to feed + chunks of the document as they get available. Note that the reader will normally + not read the entire file, but read it in chunks as well; still :meth:`parse` + won't return until the entire document is processed. So these interfaces should + be used if the blocking behaviour of :meth:`parse` is not desirable. + + When the parser is instantiated it is ready to begin accepting data from the + feed method immediately. After parsing has been finished with a call to close + the reset method must be called to make the parser ready to accept new data, + either from feed or using the parse method. 
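The feed/close/reset cycle described above can be sketched as follows. This is only an illustration, assuming a current Python 3 interpreter and the default expat-based reader returned by ``xml.sax.make_parser()``; the ``TagCollector`` handler and the ``parse_chunks`` helper are hypothetical names that are not part of the library.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagCollector(ContentHandler):
    """Hypothetical handler that records element names as they start."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def parse_chunks(chunks):
    """Feed an iterable of byte chunks to an incremental SAX parser."""
    handler = TagCollector()
    parser = xml.sax.make_parser()   # assumed to be the expat-based reader
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)           # chunks may split tags at any point
    parser.close()                   # end of document: final checks run here
    parser.reset()                   # make the parser ready for new data
    return handler.tags


tags = parse_chunks([b"<root><it", b"em/><item/>", b"</root>"])
```

The chunk boundaries are deliberately placed in the middle of a tag to show that the caller does not need to align them with the document structure.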
+ + Note that these methods must *not* be called during parsing, that is, after + parse has been called and before it returns. + + By default, the class also implements the parse method of the XMLReader + interface using the feed, close and reset methods of the IncrementalParser + interface as a convenience to SAX 2.0 driver writers. + + +.. class:: Locator() + + Interface for associating a SAX event with a document location. A locator object + will return valid results only during calls to DocumentHandler methods; at any + other time, the results are unpredictable. If information is not available, + methods may return ``None``. + + +.. class:: InputSource([systemId]) + + Encapsulation of the information needed by the :class:`XMLReader` to read + entities. + + This class may include information about the public identifier, system + identifier, byte stream (possibly with character encoding information) and/or + the character stream of an entity. + + Applications will create objects of this class for use in the + :meth:`XMLReader.parse` method and for returning from + EntityResolver.resolveEntity. + + An :class:`InputSource` belongs to the application, the :class:`XMLReader` is + not allowed to modify :class:`InputSource` objects passed to it from the + application, although it may make copies and modify those. + + +.. class:: AttributesImpl(attrs) + + This is an implementation of the :class:`Attributes` interface (see section + :ref:`attributes-objects`). This is a dictionary-like object which + represents the element attributes in a :meth:`startElement` call. In addition + to the most useful dictionary operations, it supports a number of other + methods as described by the interface. Objects of this class should be + instantiated by readers; *attrs* must be a dictionary-like object containing + a mapping from attribute names to attribute values. + + +.. 
class:: AttributesNSImpl(attrs, qnames) + + Namespace-aware variant of :class:`AttributesImpl`, which will be passed to + :meth:`startElementNS`. It is derived from :class:`AttributesImpl`, but + understands attribute names as two-tuples of *namespaceURI* and + *localname*. In addition, it provides a number of methods expecting qualified + names as they appear in the original document. This class implements the + :class:`AttributesNS` interface (see section :ref:`attributes-ns-objects`). + + +.. _xmlreader-objects: + +XMLReader Objects +----------------- + +The :class:`XMLReader` interface supports the following methods: + + +.. method:: XMLReader.parse(source) + + Process an input source, producing SAX events. The *source* object can be a + system identifier (a string identifying the input source -- typically a file + name or an URL), a file-like object, or an :class:`InputSource` object. When + :meth:`parse` returns, the input is completely processed, and the parser object + can be discarded or reset. As a limitation, the current implementation only + accepts byte streams; processing of character streams is for further study. + + +.. method:: XMLReader.getContentHandler() + + Return the current :class:`ContentHandler`. + + +.. method:: XMLReader.setContentHandler(handler) + + Set the current :class:`ContentHandler`. If no :class:`ContentHandler` is set, + content events will be discarded. + + +.. method:: XMLReader.getDTDHandler() + + Return the current :class:`DTDHandler`. + + +.. method:: XMLReader.setDTDHandler(handler) + + Set the current :class:`DTDHandler`. If no :class:`DTDHandler` is set, DTD + events will be discarded. + + +.. method:: XMLReader.getEntityResolver() + + Return the current :class:`EntityResolver`. + + +.. method:: XMLReader.setEntityResolver(handler) + + Set the current :class:`EntityResolver`. 
If no :class:`EntityResolver` is set, + attempts to resolve an external entity will result in opening the system + identifier for the entity, and fail if it is not available. + + +.. method:: XMLReader.getErrorHandler() + + Return the current :class:`ErrorHandler`. + + +.. method:: XMLReader.setErrorHandler(handler) + + Set the current error handler. If no :class:`ErrorHandler` is set, errors will + be raised as exceptions, and warnings will be printed. + + +.. method:: XMLReader.setLocale(locale) + + Allow an application to set the locale for errors and warnings. + + SAX parsers are not required to provide localization for errors and warnings; if + they cannot support the requested locale, however, they must throw a SAX + exception. Applications may request a locale change in the middle of a parse. + + +.. method:: XMLReader.getFeature(featurename) + + Return the current setting for feature *featurename*. If the feature is not + recognized, :exc:`SAXNotRecognizedException` is raised. The well-known + featurenames are listed in the module :mod:`xml.sax.handler`. + + +.. method:: XMLReader.setFeature(featurename, value) + + Set the *featurename* to *value*. If the feature is not recognized, + :exc:`SAXNotRecognizedException` is raised. If the feature or its setting is not + supported by the parser, :exc:`SAXNotSupportedException` is raised. + + +.. method:: XMLReader.getProperty(propertyname) + + Return the current setting for property *propertyname*. If the property is not + recognized, a :exc:`SAXNotRecognizedException` is raised. The well-known + propertynames are listed in the module :mod:`xml.sax.handler`. + + +.. method:: XMLReader.setProperty(propertyname, value) + + Set the *propertyname* to *value*. If the property is not recognized, + :exc:`SAXNotRecognizedException` is raised. If the property or its setting is + not supported by the parser, :exc:`SAXNotSupportedException` is raised. + + +.. 
_incremental-parser-objects: + +IncrementalParser Objects +------------------------- + +Instances of :class:`IncrementalParser` offer the following additional methods: + + +.. method:: IncrementalParser.feed(data) + + Process a chunk of *data*. + + +.. method:: IncrementalParser.close() + + Assume the end of the document. That will check well-formedness conditions that + can be checked only at the end, invoke handlers, and may clean up resources + allocated during parsing. + + +.. method:: IncrementalParser.reset() + + This method is called after close has been called to reset the parser so that it + is ready to parse new documents. The results of calling parse or feed after + close without calling reset are undefined. + + +.. _locator-objects: + +Locator Objects +--------------- + +Instances of :class:`Locator` provide these methods: + + +.. method:: Locator.getColumnNumber() + + Return the column number where the current event ends. + + +.. method:: Locator.getLineNumber() + + Return the line number where the current event ends. + + +.. method:: Locator.getPublicId() + + Return the public identifier for the current event. + + +.. method:: Locator.getSystemId() + + Return the system identifier for the current event. + + +.. _input-source-objects: + +InputSource Objects +------------------- + + +.. method:: InputSource.setPublicId(id) + + Sets the public identifier of this :class:`InputSource`. + + +.. method:: InputSource.getPublicId() + + Returns the public identifier of this :class:`InputSource`. + + +.. method:: InputSource.setSystemId(id) + + Sets the system identifier of this :class:`InputSource`. + + +.. method:: InputSource.getSystemId() + + Returns the system identifier of this :class:`InputSource`. + + +.. method:: InputSource.setEncoding(encoding) + + Sets the character encoding of this :class:`InputSource`. + + The encoding must be a string acceptable for an XML encoding declaration (see + section 4.3.3 of the XML recommendation). 
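As a rough sketch of how the encoding setting might be used, the example below hands the parser a byte stream that carries no XML encoding declaration and names the encoding on the :class:`InputSource` instead. It assumes a current Python 3 interpreter and the default expat-based reader; ``TextCollector`` is a hypothetical handler written for this illustration only.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def characters(self, content):
        self.parts.append(content)


# Latin-1 bytes with no encoding declaration in the document itself.
raw = "<p>caf\u00e9</p>".encode("iso-8859-1")

source = InputSource()
source.setByteStream(io.BytesIO(raw))   # bytes, no character conversion
source.setEncoding("iso-8859-1")        # tell the parser how to decode them

handler = TextCollector()
parser = xml.sax.make_parser()          # assumed to be the expat-based reader
parser.setContentHandler(handler)
parser.parse(source)

text = "".join(handler.parts)
```

Because only a byte stream is supplied, the encoding set on the input source is the one the reader is asked to use; with a character stream it would be ignored.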
+ + The encoding attribute of the :class:`InputSource` is ignored if the + :class:`InputSource` also contains a character stream. + + +.. method:: InputSource.getEncoding() + + Get the character encoding of this InputSource. + + +.. method:: InputSource.setByteStream(bytefile) + + Set the byte stream (a Python file-like object which does not perform + byte-to-character conversion) for this input source. + + The SAX parser will ignore this if there is also a character stream specified, + but it will use a byte stream in preference to opening a URI connection itself. + + If the application knows the character encoding of the byte stream, it should + set it with the setEncoding method. + + +.. method:: InputSource.getByteStream() + + Get the byte stream for this input source. + + The getEncoding method will return the character encoding for this byte stream, + or None if unknown. + + +.. method:: InputSource.setCharacterStream(charfile) + + Set the character stream for this input source. (The stream must be a Python 1.6 + Unicode-wrapped file-like that performs conversion to Unicode strings.) + + If there is a character stream specified, the SAX parser will ignore any byte + stream and will not attempt to open a URI connection to the system identifier. + + +.. method:: InputSource.getCharacterStream() + + Get the character stream for this input source. + + +.. _attributes-objects: + +The :class:`Attributes` Interface +--------------------------------- + +:class:`Attributes` objects implement a portion of the mapping protocol, +including the methods :meth:`copy`, :meth:`get`, :meth:`has_key`, :meth:`items`, +:meth:`keys`, and :meth:`values`. The following methods are also provided: + + +.. method:: Attributes.getLength() + + Return the number of attributes. + + +.. method:: Attributes.getNames() + + Return the names of the attributes. + + +.. method:: Attributes.getType(name) + + Returns the type of the attribute *name*, which is normally ``'CDATA'``. + + +.. 
method:: Attributes.getValue(name) + + Return the value of attribute *name*. + +.. % getValueByQName, getNameByQName, getQNameByName, getQNames available +.. % here already, but documented only for derived class. + + +.. _attributes-ns-objects: + +The :class:`AttributesNS` Interface +----------------------------------- + +This interface is a subtype of the :class:`Attributes` interface (see section +:ref:`attributes-objects`). All methods supported by that interface are also +available on :class:`AttributesNS` objects. + +The following methods are also available: + + +.. method:: AttributesNS.getValueByQName(name) + + Return the value for a qualified name. + + +.. method:: AttributesNS.getNameByQName(name) + + Return the ``(namespace, localname)`` pair for a qualified *name*. + + +.. method:: AttributesNS.getQNameByName(name) + + Return the qualified name for a ``(namespace, localname)`` pair. + + +.. method:: AttributesNS.getQNames() + + Return the qualified names of all attributes. + diff --git a/Doc/library/xml.sax.rst b/Doc/library/xml.sax.rst new file mode 100644 index 0000000..43d17c2 --- /dev/null +++ b/Doc/library/xml.sax.rst @@ -0,0 +1,143 @@ + +:mod:`xml.sax` --- Support for SAX2 parsers +=========================================== + +.. module:: xml.sax + :synopsis: Package containing SAX2 base classes and convenience functions. +.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no> +.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. versionadded:: 2.0 + +The :mod:`xml.sax` package provides a number of modules which implement the +Simple API for XML (SAX) interface for Python. The package itself provides the +SAX exceptions and the convenience functions which will be most used by users of +the SAX API. + +The convenience functions are: + + +.. function:: make_parser([parser_list]) + + Create and return a SAX :class:`XMLReader` object. The first parser found will + be used. 
If *parser_list* is provided, it must be a sequence of strings which + name modules that have a function named :func:`create_parser`. Modules listed + in *parser_list* will be used before modules in the default list of parsers. + + +.. function:: parse(filename_or_stream, handler[, error_handler]) + + Create a SAX parser and use it to parse a document. The document, passed in as + *filename_or_stream*, can be a filename or a file object. The *handler* + parameter needs to be a SAX :class:`ContentHandler` instance. If + *error_handler* is given, it must be a SAX :class:`ErrorHandler` instance; if + omitted, :exc:`SAXParseException` will be raised on all errors. There is no + return value; all work must be done by the *handler* passed in. + + +.. function:: parseString(string, handler[, error_handler]) + + Similar to :func:`parse`, but parses from a buffer *string* received as a + parameter. + +A typical SAX application uses three kinds of objects: readers, handlers and +input sources. "Reader" in this context is another term for parser, i.e. some +piece of code that reads the bytes or characters from the input source, and +produces a sequence of events. The events then get distributed to the handler +objects, i.e. the reader invokes a method on the handler. A SAX application +must therefore obtain a reader object, create or open the input sources, create +the handlers, and connect these objects all together. As the final step of +preparation, the reader is called to parse the input. During parsing, methods on +the handler objects are called based on structural and syntactic events from the +input data. + +For these objects, only the interfaces are relevant; they are normally not +instantiated by the application itself. Since Python does not have an explicit +notion of interface, they are formally introduced as classes, but applications +may use implementations which do not inherit from the provided classes. 
The +:class:`InputSource`, :class:`Locator`, :class:`Attributes`, +:class:`AttributesNS`, and :class:`XMLReader` interfaces are defined in the +module :mod:`xml.sax.xmlreader`. The handler interfaces are defined in +:mod:`xml.sax.handler`. For convenience, :class:`InputSource` (which is often +instantiated directly) and the handler classes are also available from +:mod:`xml.sax`. These interfaces are described below. + +In addition to these classes, :mod:`xml.sax` provides the following exception +classes. + + +.. exception:: SAXException(msg[, exception]) + + Encapsulate an XML error or warning. This class can contain basic error or + warning information from either the XML parser or the application: it can be + subclassed to provide additional functionality or to add localization. Note + that although the handlers defined in the :class:`ErrorHandler` interface + receive instances of this exception, it is not required to actually raise the + exception --- it is also useful as a container for information. + + When instantiated, *msg* should be a human-readable description of the error. + The optional *exception* parameter, if given, should be ``None`` or an exception + that was caught by the parsing code and is being passed along as information. + + This is the base class for the other SAX exception classes. + + +.. exception:: SAXParseException(msg, exception, locator) + + Subclass of :exc:`SAXException` raised on parse errors. Instances of this class + are passed to the methods of the SAX :class:`ErrorHandler` interface to provide + information about the parse error. This class supports the SAX :class:`Locator` + interface as well as the :class:`SAXException` interface. + + +.. exception:: SAXNotRecognizedException(msg[, exception]) + + Subclass of :exc:`SAXException` raised when a SAX :class:`XMLReader` is + confronted with an unrecognized feature or property. SAX applications and + extensions may use this class for similar purposes. + + +.. 
exception:: SAXNotSupportedException(msg[, exception]) + + Subclass of :exc:`SAXException` raised when a SAX :class:`XMLReader` is asked to + enable a feature that is not supported, or to set a property to a value that the + implementation does not support. SAX applications and extensions may use this + class for similar purposes. + + +.. seealso:: + + `SAX: The Simple API for XML <http://www.saxproject.org/>`_ + This site is the focal point for the definition of the SAX API. It provides a + Java implementation and online documentation. Links to implementations and + historical information are also available. + + Module :mod:`xml.sax.handler` + Definitions of the interfaces for application-provided objects. + + Module :mod:`xml.sax.saxutils` + Convenience functions for use in SAX applications. + + Module :mod:`xml.sax.xmlreader` + Definitions of the interfaces for parser-provided objects. + + +.. _sax-exception-objects: + +SAXException Objects +-------------------- + +The :class:`SAXException` exception class supports the following methods: + + +.. method:: SAXException.getMessage() + + Return a human-readable message describing the error condition. + + +.. method:: SAXException.getException() + + Return an encapsulated exception object, or ``None``. + diff --git a/Doc/library/xml.sax.utils.rst b/Doc/library/xml.sax.utils.rst new file mode 100644 index 0000000..0585a9b --- /dev/null +++ b/Doc/library/xml.sax.utils.rst @@ -0,0 +1,83 @@ + +:mod:`xml.sax.saxutils` --- SAX Utilities +========================================= + +.. module:: xml.sax.saxutils + :synopsis: Convenience functions and classes for use with SAX. +.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no> +.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> + + +.. versionadded:: 2.0 + +The module :mod:`xml.sax.saxutils` contains a number of classes and functions +that are commonly useful when creating SAX applications, either in direct use, +or as base classes. + + +.. 
function:: escape(data[, entities]) + + Escape ``'&'``, ``'<'``, and ``'>'`` in a string of data. + + You can escape other strings of data by passing a dictionary as the optional + *entities* parameter. The keys and values must all be strings; each key will be + replaced with its corresponding value. + + +.. function:: unescape(data[, entities]) + + Unescape ``'&amp;'``, ``'&lt;'``, and ``'&gt;'`` in a string of data. + + You can unescape other strings of data by passing a dictionary as the optional + *entities* parameter. The keys and values must all be strings; each key will be + replaced with its corresponding value. + + .. versionadded:: 2.3 + + +.. function:: quoteattr(data[, entities]) + + Similar to :func:`escape`, but also prepares *data* to be used as an + attribute value. The return value is a quoted version of *data* with any + additional required replacements. :func:`quoteattr` will select a quote + character based on the content of *data*, attempting to avoid encoding any + quote characters in the string. If both single- and double-quote characters + are already in *data*, the double-quote characters will be encoded and *data* + will be wrapped in double-quotes. The resulting string can be used directly + as an attribute value:: + + >>> print "<element attr=%s>" % quoteattr("ab ' cd \" ef") + <element attr="ab ' cd &quot; ef"> + + This function is useful when generating attribute values for HTML or any SGML + using the reference concrete syntax. + + .. versionadded:: 2.2 + + +.. class:: XMLGenerator([out[, encoding]]) + + This class implements the :class:`ContentHandler` interface by writing SAX + events back into an XML document. In other words, using an :class:`XMLGenerator` + as the content handler will reproduce the original document being parsed. *out* + should be a file-like object which will default to *sys.stdout*. *encoding* is + the encoding of the output stream which defaults to ``'iso-8859-1'``. + + +.. 
class:: XMLFilterBase(base) + + This class is designed to sit between an :class:`XMLReader` and the client + application's event handlers. By default, it does nothing but pass requests up + to the reader and events on to the handlers unmodified, but subclasses can + override specific methods to modify the event stream or the configuration + requests as they pass through. + + +.. function:: prepare_input_source(source[, base]) + + This function takes an input source and an optional base URL and returns a fully + resolved :class:`InputSource` object ready for reading. The input source can be + given as a string, a file-like object, or an :class:`InputSource` object; + parsers will use this function to implement the polymorphic *source* argument to + their :meth:`parse` method. + diff --git a/Doc/library/xmlrpclib.rst b/Doc/library/xmlrpclib.rst new file mode 100644 index 0000000..cd507c4 --- /dev/null +++ b/Doc/library/xmlrpclib.rst @@ -0,0 +1,422 @@ + +:mod:`xmlrpclib` --- XML-RPC client access +========================================== + +.. module:: xmlrpclib + :synopsis: XML-RPC client access. +.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com> +.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com> + + +.. % Not everything is documented yet. It might be good to describe +.. % Marshaller, Unmarshaller, getparser, dumps, loads, and Transport. + +.. versionadded:: 2.2 + +XML-RPC is a Remote Procedure Call method that uses XML passed via HTTP as a +transport. With it, a client can call methods with parameters on a remote +server (the server is named by a URI) and get back structured data. This module +supports writing XML-RPC client code; it handles all the details of translating +between conformable Python objects and XML on the wire. + + +.. class:: ServerProxy(uri[, transport[, encoding[, verbose[, allow_none[, use_datetime]]]]]) + + A :class:`ServerProxy` instance is an object that manages communication with a + remote XML-RPC server. 
The required first argument is a URI (Uniform Resource + Identifier), and will normally be the URL of the server. The optional second + argument is a transport factory instance; by default it is an internal + :class:`SafeTransport` instance for https: URLs and an internal HTTP + :class:`Transport` instance otherwise. The optional third argument is an + encoding, by default UTF-8. The optional fourth argument is a debugging flag. + If *allow_none* is true, the Python constant ``None`` will be translated into + XML; the default behaviour is for ``None`` to raise a :exc:`TypeError`. This is + a commonly-used extension to the XML-RPC specification, but isn't supported by + all clients and servers; see http://ontosys.com/xml-rpc/extensions.php for a + description. The *use_datetime* flag can be used to cause date/time values to + be presented as :class:`datetime.datetime` objects; this is false by default. + :class:`datetime.datetime`, :class:`datetime.date` and :class:`datetime.time` + objects may be passed to calls. :class:`datetime.date` objects are converted + with a time of "00:00:00". :class:`datetime.time` objects are converted using + today's date. + + Both the HTTP and HTTPS transports support the URL syntax extension for HTTP + Basic Authentication: ``http://user:pass@host:port/path``. The ``user:pass`` + portion will be base64-encoded as an HTTP 'Authorization' header, and sent to + the remote server as part of the connection process when invoking an XML-RPC + method. You only need to use this if the remote server requires a Basic + Authentication user and password. + + The returned instance is a proxy object with methods that can be used to invoke + corresponding RPC calls on the remote server. If the remote server supports the + introspection API, the proxy can also be used to query the remote server for the + methods it supports (service discovery) and fetch other server-associated + metadata.
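The proxy behaviour described above can be sketched against a throwaway local server. This is only an illustration under stated assumptions: the server is built with ``SimpleXMLRPCServer`` (not part of this module), the ``add`` function is hypothetical, and the import block falls back between the Python 3 names (``xmlrpc.client``, ``xmlrpc.server``) and the names used in this document.

```python
import threading

try:
    # Python 3 module names (assumption: running on Python 3)
    from xmlrpc.client import ServerProxy
    from xmlrpc.server import SimpleXMLRPCServer
except ImportError:
    # Names used by this document (Python 2)
    from xmlrpclib import ServerProxy
    from SimpleXMLRPCServer import SimpleXMLRPCServer


def add(a, b):
    """Hypothetical remote procedure used only for this sketch."""
    return a + b


# Bind to an ephemeral port on localhost so the sketch needs no external host.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
server.register_introspection_functions()  # enables system.listMethods etc.

worker = threading.Thread(target=server.serve_forever)
worker.daemon = True
worker.start()

port = server.server_address[1]
proxy = ServerProxy("http://127.0.0.1:%d" % port)

total = proxy.add(2, 3)                 # ordinary remote call through the proxy
methods = proxy.system.listMethods()    # service discovery via introspection

server.shutdown()
server.server_close()
```

Attribute access on the proxy is turned into a remote method name, so ``proxy.system.listMethods`` only works because the server opted in to the introspection API.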
+ + :class:`ServerProxy` instance methods take Python basic types and objects as + arguments and return Python basic types and classes. Types that are conformable + (e.g. that can be marshalled through XML), include the following (and except + where noted, they are unmarshalled as the same Python type): + + +---------------------------------+---------------------------------------------+ + | Name | Meaning | + +=================================+=============================================+ + | :const:`boolean` | The :const:`True` and :const:`False` | + | | constants | + +---------------------------------+---------------------------------------------+ + | :const:`integers` | Pass in directly | + +---------------------------------+---------------------------------------------+ + | :const:`floating-point numbers` | Pass in directly | + +---------------------------------+---------------------------------------------+ + | :const:`strings` | Pass in directly | + +---------------------------------+---------------------------------------------+ + | :const:`arrays` | Any Python sequence type containing | + | | conformable elements. Arrays are returned | + | | as lists | + +---------------------------------+---------------------------------------------+ + | :const:`structures` | A Python dictionary. Keys must be strings, | + | | values may be any conformable type. Objects | + | | of user-defined classes can be passed in; | + | | only their *__dict__* attribute is | + | | transmitted. 
| + +---------------------------------+---------------------------------------------+ + | :const:`dates` | in seconds since the epoch (pass in an | + | | instance of the :class:`DateTime` class) or | + | | a :class:`datetime.datetime`, | + | | :class:`datetime.date` or | + | | :class:`datetime.time` instance | + +---------------------------------+---------------------------------------------+ + | :const:`binary data` | pass in an instance of the :class:`Binary` | + | | wrapper class | + +---------------------------------+---------------------------------------------+ + + This is the full set of data types supported by XML-RPC. Method calls may also + raise a special :exc:`Fault` instance, used to signal XML-RPC server errors, or + :exc:`ProtocolError` used to signal an error in the HTTP/HTTPS transport layer. + Both :exc:`Fault` and :exc:`ProtocolError` derive from a base class called + :exc:`Error`. Note that even though starting with Python 2.2 you can subclass + builtin types, the xmlrpclib module currently does not marshal instances of such + subclasses. + + When passing strings, characters special to XML such as ``<``, ``>``, and ``&`` + will be automatically escaped. However, it's the caller's responsibility to + ensure that the string is free of characters that aren't allowed in XML, such as + the control characters with ASCII values between 0 and 31 (except, of course, + tab, newline and carriage return); failing to do this will result in an XML-RPC + request that isn't well-formed XML. If you have to pass arbitrary strings via + XML-RPC, use the :class:`Binary` wrapper class described below. + + :class:`Server` is retained as an alias for :class:`ServerProxy` for backwards + compatibility. New code should use :class:`ServerProxy`. + + .. versionchanged:: 2.5 + The *use_datetime* flag was added. + + .. 
versionchanged:: 2.6 + Instances of new-style classes can be passed in if they have a *__dict__* + attribute and don't have a base class that is marshalled in a special way. + + +.. seealso:: + + `XML-RPC HOWTO <http://www.tldp.org/HOWTO/XML-RPC-HOWTO/index.html>`_ + A good description of XML-RPC operation and client software in several languages. + Contains pretty much everything an XML-RPC client developer needs to know. + + `XML-RPC Hacks page <http://xmlrpc-c.sourceforge.net/hacks.php>`_ + Extensions for various open-source libraries to support introspection and + multicall. + + +.. _serverproxy-objects: + +ServerProxy Objects +------------------- + +A :class:`ServerProxy` instance has a method corresponding to each remote +procedure call accepted by the XML-RPC server. Calling the method performs an +RPC, dispatched by both name and argument signature (e.g. the same method name +can be overloaded with multiple argument signatures). The RPC finishes by +returning a value, which may be either returned data in a conformant type or a +:class:`Fault` or :class:`ProtocolError` object indicating an error. + +Servers that support the XML introspection API support some common methods +grouped under the reserved :attr:`system` member: + + +.. method:: ServerProxy.system.listMethods() + + This method returns a list of strings, one for each (non-system) method + supported by the XML-RPC server. + + +.. method:: ServerProxy.system.methodSignature(name) + + This method takes one parameter, the name of a method implemented by the XML-RPC + server. It returns an array of possible signatures for this method. A signature + is an array of types. The first of these types is the return type of the method, + the rest are parameters. + + Because multiple signatures (i.e. overloading) are permitted, this method returns + a list of signatures rather than a singleton. + + Signatures themselves are restricted to the top level parameters expected by a + method.
For instance, if a method expects one array of structs as a parameter, + and it returns a string, its signature is simply "string, array". If it expects + three integers and returns a string, its signature is "string, int, int, int". + + If no signature is defined for the method, a non-array value is returned. In + Python this means that the type of the returned value will be something other + than a list. + + +.. method:: ServerProxy.system.methodHelp(name) + + This method takes one parameter, the name of a method implemented by the XML-RPC + server. It returns a documentation string describing the use of that method. If + no such string is available, an empty string is returned. The documentation + string may contain HTML markup. + +Introspection methods are currently supported by servers written in PHP, C and +Microsoft .NET. Partial introspection support is included in recent updates to +UserLand Frontier. Introspection support for Perl, Python and Java is available +at the `XML-RPC Hacks <http://xmlrpc-c.sourceforge.net/hacks.php>`_ page. + + +.. _boolean-objects: + +Boolean Objects +--------------- + +This class may be initialized from any Python value; the instance returned +depends only on its truth value. It supports various Python operators through +:meth:`__cmp__`, :meth:`__repr__`, :meth:`__int__`, and :meth:`__bool__` +methods, all implemented in the obvious ways. + +It also has the following method, supported mainly for internal use by the +unmarshalling code: + + +.. method:: Boolean.encode(out) + + Write the XML-RPC encoding of this Boolean item to the out stream object. + + +.. _datetime-objects: + +DateTime Objects +---------------- + +This class may be initialized with seconds since the epoch, a time tuple, an ISO +8601 time/date string, or a :class:`datetime.datetime`, :class:`datetime.date` +or :class:`datetime.time` instance. It has the following methods, supported +mainly for internal use by the marshalling/unmarshalling code: + + +.. 
method:: DateTime.decode(string) + + Accept a string as the instance's new time value. + + +.. method:: DateTime.encode(out) + + Write the XML-RPC encoding of this :class:`DateTime` item to the *out* stream + object. + +It also supports certain of Python's built-in operators through :meth:`__cmp__` +and :meth:`__repr__` methods. + + +.. _binary-objects: + +Binary Objects +-------------- + +This class may be initialized from string data (which may include NULs). The +primary access to the content of a :class:`Binary` object is provided by an +attribute: + + +.. attribute:: Binary.data + + The binary data encapsulated by the :class:`Binary` instance. The data is + provided as an 8-bit string. + +:class:`Binary` objects have the following methods, supported mainly for +internal use by the marshalling/unmarshalling code: + + +.. method:: Binary.decode(string) + + Accept a base64 string and decode it as the instance's new data. + + +.. method:: Binary.encode(out) + + Write the XML-RPC base 64 encoding of this binary item to the out stream object. + +It also supports certain of Python's built-in operators through a +:meth:`__cmp__` method. + + +.. _fault-objects: + +Fault Objects +------------- + +A :class:`Fault` object encapsulates the content of an XML-RPC fault tag. Fault +objects have the following members: + + +.. attribute:: Fault.faultCode + + A string indicating the fault type. + + +.. attribute:: Fault.faultString + + A string containing a diagnostic message associated with the fault. + + +.. _protocol-error-objects: + +ProtocolError Objects +--------------------- + +A :class:`ProtocolError` object describes a protocol error in the underlying +transport layer (such as a 404 'not found' error if the server named by the URI +does not exist). It has the following members: + + +.. attribute:: ProtocolError.url + + The URI or URL that triggered the error. + + +.. attribute:: ProtocolError.errcode + + The error code. + + +.. 
attribute:: ProtocolError.errmsg + + The error message or diagnostic string. + + +.. attribute:: ProtocolError.headers + + A string containing the headers of the HTTP/HTTPS request that triggered the + error. + + +MultiCall Objects +----------------- + +.. versionadded:: 2.4 + +In http://www.xmlrpc.com/discuss/msgReader%241208, an approach is presented to +encapsulate multiple calls to a remote server into a single request. + + +.. class:: MultiCall(server) + + Create an object used to boxcar method calls. *server* is the eventual target of + the call. Calls can be made to the result object, but they will immediately + return ``None``, and only store the call name and parameters in the + :class:`MultiCall` object. Calling the object itself causes all stored calls to + be transmitted as a single ``system.multicall`` request. The result of this call + is a generator; iterating over this generator yields the individual results. + +A usage example of this class is :: + + multicall = MultiCall(server_proxy) + multicall.add(2, 3) + multicall.get_address("Guido") + add_result, address = multicall() + + +Convenience Functions +--------------------- + + +.. function:: boolean(value) + + Convert any Python value to one of the XML-RPC Boolean constants, ``True`` or + ``False``. + + +.. function:: dumps(params[, methodname[, methodresponse[, encoding[, allow_none]]]]) + + Convert *params* into an XML-RPC request, or into a response if *methodresponse* + is true. *params* can be either a tuple of arguments or an instance of the + :exc:`Fault` exception class. If *methodresponse* is true, only a single value + can be returned, meaning that *params* must be of length 1. *encoding*, if + supplied, is the encoding to use in the generated XML; the default is UTF-8. + Python's :const:`None` value cannot be used in standard XML-RPC; to allow using + it via an extension, provide a true value for *allow_none*. + + +.. 
function:: loads(data[, use_datetime]) + + Convert an XML-RPC request or response into Python objects, a ``(params, + methodname)`` tuple. *params* is a tuple of arguments; *methodname* is a string, or + ``None`` if no method name is present in the packet. If the XML-RPC packet + represents a fault condition, this function will raise a :exc:`Fault` exception. + The *use_datetime* flag can be used to cause date/time values to be presented as + :class:`datetime.datetime` objects; this is false by default. Note that even if + you call an XML-RPC method with :class:`datetime.date` or :class:`datetime.time` + objects, they are converted to :class:`DateTime` objects internally, so only + :class:`datetime.datetime` objects will be returned. + + .. versionchanged:: 2.5 + The *use_datetime* flag was added. + + +.. _xmlrpc-client-example: + +Example of Client Usage +----------------------- + +:: + + # simple test program (from the XML-RPC specification) + from xmlrpclib import ServerProxy, Error + + # server = ServerProxy("http://localhost:8000") # local server + server = ServerProxy("http://betty.userland.com") + + print server + + try: + print server.examples.getStateName(41) + except Error as v: + print "ERROR", v + +To access an XML-RPC server through a proxy, you need to define a custom +transport. The following example, written by NoboNobo, shows how: + +.. % fill in original author's name if we ever learn it + +.. 
% Example taken from http://lowlife.jp/nobonobo/wiki/xmlrpcwithproxy.html + +:: + + import xmlrpclib, httplib + + class ProxiedTransport(xmlrpclib.Transport): + def set_proxy(self, proxy): + self.proxy = proxy + def make_connection(self, host): + self.realhost = host + h = httplib.HTTP(self.proxy) + return h + def send_request(self, connection, handler, request_body): + connection.putrequest("POST", 'http://%s%s' % (self.realhost, handler)) + def send_host(self, connection, host): + connection.putheader('Host', self.realhost) + + p = ProxiedTransport() + p.set_proxy('proxy-server:8080') + server = xmlrpclib.Server('http://time.xmlrpc.com/RPC2', transport=p) + print server.currentTime.getCurrentTime() + diff --git a/Doc/library/zipfile.rst b/Doc/library/zipfile.rst new file mode 100644 index 0000000..5e51bfc --- /dev/null +++ b/Doc/library/zipfile.rst @@ -0,0 +1,408 @@ + +:mod:`zipfile` --- Work with ZIP archives +========================================= + +.. module:: zipfile + :synopsis: Read and write ZIP-format archive files. +.. moduleauthor:: James C. Ahlstrom <jim@interet.com> +.. sectionauthor:: James C. Ahlstrom <jim@interet.com> + + +.. % LaTeX markup by Fred L. Drake, Jr. <fdrake@acm.org> + +.. versionadded:: 1.6 + +The ZIP file format is a common archive and compression standard. This module +provides tools to create, read, write, append, and list a ZIP file. Any +advanced use of this module will require an understanding of the format, as +defined in `PKZIP Application Note +<http://www.pkware.com/business_and_developers/developer/appnote/>`_. + +This module does not currently handle ZIP files which have appended comments, or +multi-disk ZIP files. It can handle ZIP files that use the ZIP64 extensions +(that is ZIP files that are more than 4 GByte in size). It supports decryption +of encrypted files in ZIP archives, but it cannot currently create an encrypted +file. + +The available attributes of this module are: + + +.. 
exception:: BadZipfile + + The error raised for bad ZIP files (old name: ``zipfile.error``). + + +.. exception:: LargeZipFile + + The error raised when a ZIP file would require ZIP64 functionality but that has + not been enabled. + + +.. class:: ZipFile + + The class for reading and writing ZIP files. See section + :ref:`zipfile-objects` for constructor details. + + +.. class:: PyZipFile + + Class for creating ZIP archives containing Python libraries. + + +.. class:: ZipInfo([filename[, date_time]]) + + Class used to represent information about a member of an archive. Instances + of this class are returned by the :meth:`getinfo` and :meth:`infolist` + methods of :class:`ZipFile` objects. Most users of the :mod:`zipfile` module + will not need to create these, but only use those created by this + module. *filename* should be the full name of the archive member, and + *date_time* should be a tuple containing six fields which describe the time + of the last modification to the file; the fields are described in section + :ref:`zipinfo-objects`. + + +.. function:: is_zipfile(filename) + + Returns ``True`` if *filename* is a valid ZIP file based on its magic number, + otherwise returns ``False``. This module does not currently handle ZIP files + which have appended comments. + + +.. data:: ZIP_STORED + + The numeric constant for an uncompressed archive member. + + +.. data:: ZIP_DEFLATED + + The numeric constant for the usual ZIP compression method. This requires the + zlib module. No other compression methods are currently supported. + + +.. seealso:: + + `PKZIP Application Note <http://www.pkware.com/business_and_developers/developer/appnote/>`_ + Documentation on the ZIP file format by Phil Katz, the creator of the format and + algorithms used. + + `Info-ZIP Home Page <http://www.info-zip.org/>`_ + Information about the Info-ZIP project's ZIP archive programs and development + libraries. + + +.. _zipfile-objects: + +ZipFile Objects +--------------- + + +.. 
class:: ZipFile(file[, mode[, compression[, allowZip64]]]) + + Open a ZIP file, where *file* can be either a path to a file (a string) or a + file-like object. The *mode* parameter should be ``'r'`` to read an existing + file, ``'w'`` to truncate and write a new file, or ``'a'`` to append to an + existing file. If *mode* is ``'a'`` and *file* refers to an existing ZIP file, + then additional files are added to it. If *file* does not refer to a ZIP file, + then a new ZIP archive is appended to the file. This is meant for adding a ZIP + archive to another file, such as :file:`python.exe`. Using :: + + cat myzip.zip >> python.exe + + also works, and at least :program:`WinZip` can read such files. If *mode* is + ``a`` and the file does not exist at all, it is created. *compression* is the + ZIP compression method to use when writing the archive, and should be + :const:`ZIP_STORED` or :const:`ZIP_DEFLATED`; unrecognized values will cause + :exc:`RuntimeError` to be raised. If :const:`ZIP_DEFLATED` is specified but the + :mod:`zlib` module is not available, :exc:`RuntimeError` is also raised. The + default is :const:`ZIP_STORED`. If *allowZip64* is ``True`` zipfile will create + ZIP files that use the ZIP64 extensions when the zipfile is larger than 2 GB. If + it is false (the default) :mod:`zipfile` will raise an exception when the ZIP + file would require ZIP64 extensions. ZIP64 extensions are disabled by default + because the default :program:`zip` and :program:`unzip` commands on Unix (the + InfoZIP utilities) don't support these extensions. + + .. versionchanged:: 2.6 + If the file does not exist, it is created if the mode is 'a'. + + +.. method:: ZipFile.close() + + Close the archive file. You must call :meth:`close` before exiting your program + or essential records will not be written. + + +.. method:: ZipFile.getinfo(name) + + Return a :class:`ZipInfo` object with information about the archive member + *name*. 
Calling :meth:`getinfo` for a name not currently contained in the + archive will raise a :exc:`KeyError`. + + +.. method:: ZipFile.infolist() + + Return a list containing a :class:`ZipInfo` object for each member of the + archive. The objects are in the same order as their entries in the actual ZIP + file on disk if an existing archive was opened. + + +.. method:: ZipFile.namelist() + + Return a list of archive members by name. + + +.. method:: ZipFile.open(name[, mode[, pwd]]) + + Extract a member from the archive as a file-like object (ZipExtFile). *name* is + the name of the file in the archive. The *mode* parameter, if included, must be + one of the following: ``'r'`` (the default), ``'U'``, or ``'rU'``. Choosing + ``'U'`` or ``'rU'`` will enable universal newline support in the read-only + object. *pwd* is the password used for encrypted files. Calling :meth:`open` + on a closed ZipFile will raise a :exc:`RuntimeError`. + + .. note:: + + The file-like object is read-only and provides the following methods: + :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`__iter__`, + :meth:`next`. + + .. note:: + + If the ZipFile was created by passing in a file-like object as the first + argument to the constructor, then the object returned by :meth:`open` shares the + ZipFile's file pointer. Under these circumstances, the object returned by + :meth:`open` should not be used after any additional operations are performed + on the ZipFile object. If the ZipFile was created by passing in a string (the + filename) as the first argument to the constructor, then :meth:`open` will + create a new file object that will be held by the ZipExtFile, allowing it to + operate independently of the ZipFile. + + .. versionadded:: 2.6 + + +.. method:: ZipFile.printdir() + + Print a table of contents for the archive to ``sys.stdout``. + + +.. method:: ZipFile.setpassword(pwd) + + Set *pwd* as default password to extract encrypted files. + + .. versionadded:: 2.6 + + +.. 
method:: ZipFile.read(name[, pwd]) + + Return the bytes of the file in the archive. The archive must be open for read + or append. *pwd* is the password used for encrypted files and, if specified, it + will override the default password set with :meth:`setpassword`. Calling + :meth:`read` on a closed ZipFile will raise a :exc:`RuntimeError`. + + .. versionchanged:: 2.6 + *pwd* was added. + + +.. method:: ZipFile.testzip() + + Read all the files in the archive and check their CRC's and file headers. + Return the name of the first bad file, or else return ``None``. Calling + :meth:`testzip` on a closed ZipFile will raise a :exc:`RuntimeError`. + + +.. method:: ZipFile.write(filename[, arcname[, compress_type]]) + + Write the file named *filename* to the archive, giving it the archive name + *arcname* (by default, this will be the same as *filename*, but without a drive + letter and with leading path separators removed). If given, *compress_type* + overrides the value given for the *compression* parameter to the constructor for + the new entry. The archive must be open with mode ``'w'`` or ``'a'`` -- calling + :meth:`write` on a ZipFile created with mode ``'r'`` will raise a + :exc:`RuntimeError`. Calling :meth:`write` on a closed ZipFile will raise a + :exc:`RuntimeError`. + + .. note:: + + There is no official file name encoding for ZIP files. If you have unicode file + names, please convert them to byte strings in your desired encoding before + passing them to :meth:`write`. WinZip interprets all file names as encoded in + CP437, also known as DOS Latin. + + .. note:: + + Archive names should be relative to the archive root, that is, they should not + start with a path separator. + + .. note:: + + If ``arcname`` (or ``filename``, if ``arcname`` is not given) contains a null + byte, the name of the file in the archive will be truncated at the null byte. + + +.. 
method:: ZipFile.writestr(zinfo_or_arcname, bytes) + + Write the string *bytes* to the archive; *zinfo_or_arcname* is either the file + name it will be given in the archive, or a :class:`ZipInfo` instance. If it's + an instance, at least the filename, date, and time must be given. If it's a + name, the date and time is set to the current date and time. The archive must be + opened with mode ``'w'`` or ``'a'`` -- calling :meth:`writestr` on a ZipFile + created with mode ``'r'`` will raise a :exc:`RuntimeError`. Calling + :meth:`writestr` on a closed ZipFile will raise a :exc:`RuntimeError`. + +The following data attribute is also available: + + +.. attribute:: ZipFile.debug + + The level of debug output to use. This may be set from ``0`` (the default, no + output) to ``3`` (the most output). Debugging information is written to + ``sys.stdout``. + + +.. _pyzipfile-objects: + +PyZipFile Objects +----------------- + +The :class:`PyZipFile` constructor takes the same parameters as the +:class:`ZipFile` constructor. Instances have one method in addition to those of +:class:`ZipFile` objects. + + +.. method:: PyZipFile.writepy(pathname[, basename]) + + Search for files :file:`\*.py` and add the corresponding file to the archive. + The corresponding file is a :file:`\*.pyo` file if available, else a + :file:`\*.pyc` file, compiling if necessary. If the pathname is a file, the + filename must end with :file:`.py`, and just the (corresponding + :file:`\*.py[co]`) file is added at the top level (no path information). If the + pathname is a file that does not end with :file:`.py`, a :exc:`RuntimeError` + will be raised. If it is a directory, and the directory is not a package + directory, then all the files :file:`\*.py[co]` are added at the top level. If + the directory is a package directory, then all :file:`\*.py[co]` are added under + the package name as a file path, and if any subdirectories are package + directories, all of these are added recursively. 
*basename* is intended for + internal use only. The :meth:`writepy` method makes archives with file names + like this:: + + string.pyc # Top level name + test/__init__.pyc # Package directory + test/testall.pyc # Module test.testall + test/bogus/__init__.pyc # Subpackage directory + test/bogus/myfile.pyc # Submodule test.bogus.myfile + + +.. _zipinfo-objects: + +ZipInfo Objects +--------------- + +Instances of the :class:`ZipInfo` class are returned by the :meth:`getinfo` and +:meth:`infolist` methods of :class:`ZipFile` objects. Each object stores +information about a single member of the ZIP archive. + +Instances have the following attributes: + + +.. attribute:: ZipInfo.filename + + Name of the file in the archive. + + +.. attribute:: ZipInfo.date_time + + The time and date of the last modification to the archive member. This is a + tuple of six values: + + +-------+--------------------------+ + | Index | Value | + +=======+==========================+ + | ``0`` | Year | + +-------+--------------------------+ + | ``1`` | Month (one-based) | + +-------+--------------------------+ + | ``2`` | Day of month (one-based) | + +-------+--------------------------+ + | ``3`` | Hours (zero-based) | + +-------+--------------------------+ + | ``4`` | Minutes (zero-based) | + +-------+--------------------------+ + | ``5`` | Seconds (zero-based) | + +-------+--------------------------+ + + +.. attribute:: ZipInfo.compress_type + + Type of compression for the archive member. + + +.. attribute:: ZipInfo.comment + + Comment for the individual archive member. + + +.. attribute:: ZipInfo.extra + + Expansion field data. The `PKZIP Application Note + <http://www.pkware.com/business_and_developers/developer/appnote/>`_ contains + some comments on the internal structure of the data contained in this string. + + +.. attribute:: ZipInfo.create_system + + System which created ZIP archive. + + +.. attribute:: ZipInfo.create_version + + PKZIP version which created ZIP archive. + + +.. 
attribute:: ZipInfo.extract_version + + PKZIP version needed to extract archive. + + +.. attribute:: ZipInfo.reserved + + Must be zero. + + +.. attribute:: ZipInfo.flag_bits + + ZIP flag bits. + + +.. attribute:: ZipInfo.volume + + Volume number of file header. + + +.. attribute:: ZipInfo.internal_attr + + Internal attributes. + + +.. attribute:: ZipInfo.external_attr + + External file attributes. + + +.. attribute:: ZipInfo.header_offset + + Byte offset to the file header. + + +.. attribute:: ZipInfo.CRC + + CRC-32 of the uncompressed file. + + +.. attribute:: ZipInfo.compress_size + + Size of the compressed data. + + +.. attribute:: ZipInfo.file_size + + Size of the uncompressed file. + diff --git a/Doc/library/zipimport.rst b/Doc/library/zipimport.rst new file mode 100644 index 0000000..f2b2358 --- /dev/null +++ b/Doc/library/zipimport.rst @@ -0,0 +1,137 @@ + +:mod:`zipimport` --- Import modules from Zip archives +===================================================== + +.. module:: zipimport + :synopsis: support for importing Python modules from ZIP archives. +.. moduleauthor:: Just van Rossum <just@letterror.com> + + +.. versionadded:: 2.3 + +This module adds the ability to import Python modules (:file:`\*.py`, +:file:`\*.py[co]`) and packages from ZIP-format archives. It is usually not +needed to use the :mod:`zipimport` module explicitly; it is automatically used +by the builtin :keyword:`import` mechanism for ``sys.path`` items that are paths +to ZIP archives. + +Typically, ``sys.path`` is a list of directory names as strings. This module +also allows an item of ``sys.path`` to be a string naming a ZIP file archive. +The ZIP archive can contain a subdirectory structure to support package imports, +and a path within the archive can be specified to only import from a +subdirectory. For example, the path :file:`/tmp/example.zip/lib/` would only +import from the :file:`lib/` subdirectory within the archive. 
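As a minimal sketch of the subdirectory form described above, the following builds a small archive and imports only from its :file:`lib/` directory. The archive location and the module name ``zipdemo_mod`` are hypothetical. The code assumes a Python 3 interpreter, where :class:`ZipFile` can be used in a ``with`` statement; it is an illustration, not part of the original commit.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding one module under lib/.
# The file and module names are made up for this illustration.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "example.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("lib/zipdemo_mod.py", "GREETING = 'hello from the archive'\n")

# Put "<archive>/lib" on sys.path so only that subdirectory is searched.
sys.path.insert(0, os.path.join(archive, "lib"))
importlib.invalidate_caches()

# zipimport is never named explicitly; the import machinery selects it.
import zipdemo_mod

print(zipdemo_mod.GREETING)
print(zipdemo_mod.__file__)
```

Modules stored elsewhere in the same archive stay invisible unless their own directory, or the archive root, is also added to ``sys.path``.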
+ +Any files may be present in the ZIP archive, but only files :file:`.py` and +:file:`.py[co]` are available for import. ZIP import of dynamic modules +(:file:`.pyd`, :file:`.so`) is disallowed. Note that if an archive only contains +:file:`.py` files, Python will not attempt to modify the archive by adding the +corresponding :file:`.pyc` or :file:`.pyo` file, meaning that if a ZIP archive +doesn't contain :file:`.pyc` files, importing may be rather slow. + +The available attributes of this module are: + + +.. exception:: ZipImportError + + Exception raised by zipimporter objects. It's a subclass of :exc:`ImportError`, + so it can be caught as :exc:`ImportError`, too. + + +.. class:: zipimporter + + The class for importing ZIP files. See section :ref:`zipimporter-objects` + for constructor details. + + +.. seealso:: + + `PKZIP Application Note <http://www.pkware.com/business_and_developers/developer/appnote/>`_ + Documentation on the ZIP file format by Phil Katz, the creator of the format and + algorithms used. + + :pep:`0273` - Import Modules from Zip Archives + Written by James C. Ahlstrom, who also provided an implementation. Python 2.3 + follows the specification in PEP 273, but uses an implementation written by Just + van Rossum that uses the import hooks described in PEP 302. + + :pep:`0302` - New Import Hooks + The PEP to add the import hooks that help this module work. + + +.. _zipimporter-objects: + +zipimporter Objects +------------------- + + +.. class:: zipimporter(archivepath) + + Create a new zipimporter instance. *archivepath* must be a path to a zipfile. + :exc:`ZipImportError` is raised if *archivepath* doesn't point to a valid ZIP + archive. + + +.. method:: zipimporter.find_module(fullname[, path]) + + Search for a module specified by *fullname*. *fullname* must be the fully + qualified (dotted) module name. It returns the zipimporter instance itself if + the module was found, or :const:`None` if it wasn't. 
The optional *path* + argument is ignored---it's there for compatibility with the importer protocol. + + +.. method:: zipimporter.get_code(fullname) + + Return the code object for the specified module. Raise :exc:`ZipImportError` if + the module couldn't be found. + + +.. method:: zipimporter.get_data(pathname) + + Return the data associated with *pathname*. Raise :exc:`IOError` if the file + wasn't found. + + +.. method:: zipimporter.get_source(fullname) + + Return the source code for the specified module. Raise :exc:`ZipImportError` if + the module couldn't be found, return :const:`None` if the archive does contain + the module, but has no source for it. + + +.. method:: zipimporter.is_package(fullname) + + Return True if the module specified by *fullname* is a package. Raise + :exc:`ZipImportError` if the module couldn't be found. + + +.. method:: zipimporter.load_module(fullname) + + Load the module specified by *fullname*. *fullname* must be the fully qualified + (dotted) module name. It returns the imported module, or raises + :exc:`ZipImportError` if it wasn't found. + + +Examples +-------- + +.. _zipimport-examples: + +Here is an example that imports a module from a ZIP archive - note that the +:mod:`zipimport` module is not explicitly used. :: + + $ unzip -l /tmp/example.zip + Archive: /tmp/example.zip + Length Date Time Name + -------- ---- ---- ---- + 8467 11-26-02 22:30 jwzthreading.py + -------- ------- + 8467 1 file + $ ./python + Python 2.3 (#1, Aug 1 2003, 19:54:32) + >>> import sys + >>> sys.path.insert(0, '/tmp/example.zip') # Add .zip file to front of path + >>> import jwzthreading + >>> jwzthreading.__file__ + '/tmp/example.zip/jwzthreading.py' + diff --git a/Doc/library/zlib.rst b/Doc/library/zlib.rst new file mode 100644 index 0000000..e57a156 --- /dev/null +++ b/Doc/library/zlib.rst @@ -0,0 +1,209 @@ + +:mod:`zlib` --- Compression compatible with :program:`gzip` +=========================================================== + +.. 
module:: zlib + :synopsis: Low-level interface to compression and decompression routines compatible with + gzip. + + +For applications that require data compression, the functions in this module +allow compression and decompression, using the zlib library. The zlib library +has its own home page at http://www.zlib.net. There are known +incompatibilities between the Python module and versions of the zlib library +earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using +1.1.4 or later. + +zlib's functions have many options and often need to be used in a particular +order. This documentation doesn't attempt to cover all of the permutations; +consult the zlib manual at http://www.zlib.net/manual.html for authoritative +information. + +The available exception and functions in this module are: + + +.. exception:: error + + Exception raised on compression and decompression errors. + + +.. function:: adler32(string[, value]) + + Computes an Adler-32 checksum of *string*. (An Adler-32 checksum is almost as + reliable as a CRC32 but can be computed much more quickly.) If *value* is + present, it is used as the starting value of the checksum; otherwise, a fixed + default value is used. This allows computing a running checksum over the + concatenation of several input strings. The algorithm is not cryptographically + strong, and should not be used for authentication or digital signatures. Since + the algorithm is designed for use as a checksum algorithm, it is not suitable + for use as a general hash algorithm. + + +.. function:: compress(string[, level]) + + Compresses the data in *string*, returning a string containing compressed data. + *level* is an integer from ``1`` to ``9`` controlling the level of compression; + ``1`` is fastest and produces the least compression, ``9`` is slowest and + produces the most. The default value is ``6``. Raises the :exc:`error` + exception if any error occurs. + + +.. 
function:: compressobj([level]) + + Returns a compression object, to be used for compressing data streams that won't + fit into memory at once. *level* is an integer from ``1`` to ``9`` controlling + the level of compression; ``1`` is fastest and produces the least compression, + ``9`` is slowest and produces the most. The default value is ``6``. + + +.. function:: crc32(string[, value]) + + .. index:: + single: Cyclic Redundancy Check + single: checksum; Cyclic Redundancy Check + + Computes a CRC (Cyclic Redundancy Check) checksum of *string*. If *value* is + present, it is used as the starting value of the checksum; otherwise, a fixed + default value is used. This allows computing a running checksum over the + concatenation of several input strings. The algorithm is not cryptographically + strong, and should not be used for authentication or digital signatures. Since + the algorithm is designed for use as a checksum algorithm, it is not suitable + for use as a general hash algorithm. + + .. % + + +.. function:: decompress(string[, wbits[, bufsize]]) + + Decompresses the data in *string*, returning a string containing the + uncompressed data. The *wbits* parameter controls the size of the window + buffer. If *bufsize* is given, it is used as the initial size of the output + buffer. Raises the :exc:`error` exception if any error occurs. + + The absolute value of *wbits* is the base two logarithm of the size of the + history buffer (the "window size") used when compressing data. Its absolute + value should be between 8 and 15 for the most recent versions of the zlib + library, larger values resulting in better compression at the expense of greater + memory usage. The default value is 15. When *wbits* is negative, the standard + :program:`gzip` header is suppressed; this is an undocumented feature of the + zlib library, used for compatibility with :program:`unzip`'s compression file + format. 
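The effect of the sign of *wbits* can be sketched as follows. This is an illustration under assumptions rather than part of the documented interface: it assumes a Python 3 interpreter, byte-string input, and that :func:`compressobj` accepts *method* and *wbits* as its second and third positional arguments, which the text above does not document.

```python
import zlib

payload = b"zlib example payload " * 20

# Default round trip: 2**15-byte window, stream header present.
packed = zlib.compress(payload, 9)
assert zlib.decompress(packed) == payload

# Header-less stream: a negative wbits must be used on both sides.
# Passing method and wbits to compressobj() is an assumption (see lead-in).
raw_compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = raw_compressor.compress(payload) + raw_compressor.flush()
restored = zlib.decompress(raw, -15)

print(len(payload), len(packed), len(raw))
```

The same *wbits* magnitude is used for compression and decompression here; a decompressor configured with a smaller window than the compressor used may fail to decode the stream.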
+ + *bufsize* is the initial size of the buffer used to hold decompressed data. If + more space is required, the buffer size will be increased as needed, so you + don't have to get this value exactly right; tuning it will only save a few calls + to :cfunc:`malloc`. The default size is 16384. + + +.. function:: decompressobj([wbits]) + + Returns a decompression object, to be used for decompressing data streams that + won't fit into memory at once. The *wbits* parameter controls the size of the + window buffer. + +Compression objects support the following methods: + + +.. method:: Compress.compress(string) + + Compress *string*, returning a string containing compressed data for at least + part of the data in *string*. This data should be concatenated to the output + produced by any preceding calls to the :meth:`compress` method. Some input may + be kept in internal buffers for later processing. + + +.. method:: Compress.flush([mode]) + + All pending input is processed, and a string containing the remaining compressed + output is returned. *mode* can be selected from the constants + :const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`, + defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and + :const:`Z_FULL_FLUSH` allow compressing further strings of data, while + :const:`Z_FINISH` finishes the compressed stream and prevents compressing any + more data. After calling :meth:`flush` with *mode* set to :const:`Z_FINISH`, + the :meth:`compress` method cannot be called again; the only realistic action is + to delete the object. + + +.. method:: Compress.copy() + + Returns a copy of the compression object. This can be used to efficiently + compress a set of data that share a common initial prefix. + + .. versionadded:: 2.5 + +Decompression objects support the following methods, and two attributes: + + +.. attribute:: Decompress.unused_data + + A string which contains any bytes past the end of the compressed data. 
That is, + this remains ``""`` until the last byte that contains compression data is + available. If the whole string turned out to contain compressed data, this is + ``""``, the empty string. + + The only way to determine where a string of compressed data ends is by actually + decompressing it. This means that when compressed data is contained as part of a + larger file, you can only find the end of it by reading data and feeding it + followed by some non-empty string into a decompression object's + :meth:`decompress` method until the :attr:`unused_data` attribute is no longer + the empty string. + + +.. attribute:: Decompress.unconsumed_tail + + A string that contains any data that was not consumed by the last + :meth:`decompress` call because it exceeded the limit for the uncompressed data + buffer. This data has not yet been seen by the zlib machinery, so you must feed + it (possibly with further data concatenated to it) back to a subsequent + :meth:`decompress` method call in order to get correct output. + + +.. method:: Decompress.decompress(string[, max_length]) + + Decompress *string*, returning a string containing the uncompressed data + corresponding to at least part of the data in *string*. This data should be + concatenated to the output produced by any preceding calls to the + :meth:`decompress` method. Some of the input data may be preserved in internal + buffers for later processing. + + If the optional parameter *max_length* is supplied then the return value will be + no longer than *max_length*. This may mean that not all of the compressed input + can be processed, and unconsumed data will be stored in the attribute + :attr:`unconsumed_tail`. This string must be passed to a subsequent call to + :meth:`decompress` if decompression is to continue. If *max_length* is not + supplied then the whole input is decompressed, and :attr:`unconsumed_tail` is an + empty string. + + +.. 
method:: Decompress.flush([length]) + + All pending input is processed, and a string containing the remaining + uncompressed output is returned. After calling :meth:`flush`, the + :meth:`decompress` method cannot be called again; the only realistic action is + to delete the object. + + The optional parameter *length* sets the initial size of the output buffer. + + +.. method:: Decompress.copy() + + Returns a copy of the decompression object. This can be used to save the state + of the decompressor midway through the data stream in order to speed up random + seeks into the stream at a future point. + + .. versionadded:: 2.5 + + +.. seealso:: + + Module :mod:`gzip` + Reading and writing :program:`gzip`\ -format files. + + http://www.zlib.net + The zlib library home page. + + http://www.zlib.net/manual.html + The zlib manual explains the semantics and usage of the library's many + functions. + |
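The compression and decompression objects described above can be combined as in the following sketch. It assumes a Python 3 interpreter and byte-string input; the chunk contents and the ``TRAILER`` marker are made up for illustration. The compressed stream is followed by unrelated bytes, and the decompressor is driven with a small *max_length* so that :attr:`unconsumed_tail` and :attr:`unused_data` both come into play.

```python
import zlib

chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]

# Compress piece by piece; flush() emits whatever is still buffered.
compressor = zlib.compressobj(6)
stream = b"".join(compressor.compress(c) for c in chunks) + compressor.flush()

# Simulate compressed data embedded in a larger file.
blob = stream + b"TRAILER"

decompressor = zlib.decompressobj()
output = []
pending = blob
while pending:
    # Produce at most 16 bytes per call; unread input lands in unconsumed_tail.
    output.append(decompressor.decompress(pending, 16))
    pending = decompressor.unconsumed_tail
output.append(decompressor.flush())

result = b"".join(output)
leftover = decompressor.unused_data  # bytes found after the end of the stream

print(result)
print(leftover)
```

Once the end of the compressed stream is reached, the remaining input moves to :attr:`unused_data` and :attr:`unconsumed_tail` becomes empty, which is what ends the loop.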